


The grouping algorithm

We experimented with two grouping algorithms. The first is a greedy algorithm: it starts by assuming that each sub-trajectory given by the segmentation forms a group by itself. Let ${\cal{S}}$ be this initial set of groups.

We want to merge groups so that each resulting group is coherent enough for a Gaussian model to represent it well.

For every pair $(g_i,g_j)$ of elements of ${\cal{S}}$, we compute the variance of the group $g_{i \cup j}$ built from the sub-trajectories of both $g_i$ and $g_j$. We select the pair of groups $(g_a,g_b)$ that gives the lowest variance and merge them into a single group: we delete $g_a$ and $g_b$ from ${\cal{S}}$ and insert $g_{a \cup b}$ instead.

We iterate this process until a given number of groups remains.
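The following sketch illustrates this greedy merging. It assumes the sub-trajectories have already been resampled to a common length so they can be stacked into arrays, and the names (greedy_grouping, n_groups) are ours, not the paper's.

import numpy as np

def greedy_grouping(subtrajectories, n_groups):
    # Each sub-trajectory starts as a group of its own (the initial set S).
    groups = [[t] for t in subtrajectories]

    def group_variance(group):
        # Total variance of the stacked sub-trajectories in the group.
        return np.var(np.stack(group), axis=0).sum()

    while len(groups) > n_groups:
        # Find the pair (g_a, g_b) whose union has the lowest variance.
        best = None
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                v = group_variance(groups[i] + groups[j])
                if best is None or v < best[0]:
                    best = (v, i, j)
        _, a, b = best
        # Replace g_a and g_b by the merged group.
        merged = groups[a] + groups[b]
        groups = [g for k, g in enumerate(groups) if k not in (a, b)]
        groups.append(merged)
    return groups

Naively recomputing the variance of every candidate pair at each merge step is what is consistent with the ${\cal{O}}\left(p^4\right)$ cost mentioned below.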

Unfortunately, this algorithm becomes quite inefficient when processing a large number of sub-trajectories (several hundred in practice). Its complexity is ${\cal{O}}\left(p^4\right)$ where $p$ is the total number of sub-trajectories.

For larger training sets of sub-trajectories, we use the normalised cuts algorithm of Shi and Malik [12], which computes groups by analysing the eigenvectors of a $p \times p$ similarity matrix. Building this matrix costs ${\cal{O}}\left(p^2 S(p)\right)$, where $S(p)$ is the complexity of the similarity measure. Solving the eigenvalue problem usually costs ${\cal{O}}\left(p^3\right)$, but for a sparse matrix this can be reduced using the Lanczos algorithm.
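As a rough illustration of the normalised-cuts step, the sketch below computes one bipartition from a similarity matrix by solving the Shi and Malik generalised eigenproblem $(D - W)y = \lambda D y$ in its symmetric form. The median threshold and the dense eigensolver are simplifications of our own (a sparse solver such as Lanczos is used in practice), and the split would be applied recursively to reach the desired number of groups.

import numpy as np

def normalized_cut_bipartition(W):
    # W: symmetric p x p similarity matrix with positive row sums.
    d = W.sum(axis=1)
    D = np.diag(d)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    # Symmetric form of (D - W) y = lambda * D y.
    L_sym = d_inv_sqrt @ (D - W) @ d_inv_sqrt
    eigvals, eigvecs = np.linalg.eigh(L_sym)   # eigenvalues in ascending order
    # Eigenvector with the second smallest eigenvalue, mapped back to y.
    y = d_inv_sqrt @ eigvecs[:, 1]
    # Threshold the indicator vector (here at its median) to obtain two groups.
    return y > np.median(y)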

We tried this algorithm with two similarity measures: the Euclidean distance and a measure based on dynamic time warping. Both have a complexity of ${\cal{O}}\left(p\right)$, which brings the complexity of the whole algorithm to ${\cal{O}}\left(p^3\right)$. The results are slightly better with the Euclidean distance, because the dynamic time warping measure is invariant to transformations such as rotations or translations, and the principal component analysis used to compute equation 1 performs poorly when the similarity measure is not strict enough about the transformations it allows.
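For concreteness, here is one plausible form of the two similarity measures, assuming each sub-trajectory is an array of 2-D points. The Gaussian kernel and its bandwidth sigma are illustrative choices rather than the paper's exact definitions.

import numpy as np

def euclidean_similarity(a, b, sigma=1.0):
    # Requires equal-length sub-trajectories; linear in their length.
    return np.exp(-np.linalg.norm(a - b) ** 2 / (2.0 * sigma ** 2))

def dtw_distance(a, b):
    # Classic dynamic time warping between two sequences of points.
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def dtw_similarity(a, b, sigma=1.0):
    return np.exp(-dtw_distance(a, b) ** 2 / (2.0 * sigma ** 2))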

The results, though still acceptable, are visually less convincing than those of the greedy algorithm. However, the normalised cuts method scales better to large numbers of sub-trajectories.

