next up previous index
Next: The results Up: Grouping the pathlets Previous: The normalised cut algorithm   Index




The greedy algorithm

As we show later in this chapter, the algorithm based on the dynamic time warping and the normalised cut sometimes groups pathlets that look different. This results in the creation of a poor pathlet models. This problem is due to the fact that we are comparing pathlets in pairs using a similarity measure that is not necessarily compatible with our grouping model. A global similarity measure for several pathlets at a time would be much more appropriate since it can ensure that a group is coherent, a property that is not guaranteed with a similarity measure for only two pathlets. A global similarity measure leads to a groupwise clustering rather than the pairwise clustering of the dynamic time warping and normalised cut algorithm.

Furthermore, we want the similarity measure to be compatible with our grouping model. One logical choice for the similarity measure is the variance of the group of pathlets. Indeed several pathlets are properly grouped if the variance of the group is small. A large variance denotes a group containing pathlets that look different to each other.

Unfortunately, we cannot use the same clustering algorithm. The normalised cut algorithm is based on the use of a similarity matrix, which cannot easily be created for a groupwise similarity measure. Instead, we use a greedy algorithm to group the pathlets. At each step, the algorithm makes the best decision for the grouping.

The algorithm first assumes that each pathlet given by the trajectory segmentation initially forms a group by itself. Let ${\cal{S}}$ be this initial set of groups.

For every pair $(g_i,g_j)$ of elements of ${\cal{S}}$, we compute the variance of the group $g_{i \cup j}$ that is built using the pathlets from both $g_i$ and $g_j$. We select the pair of groups $(g_a,g_b)$ that gives the lowest variance and merge those two groups to form only one group. We delete $g_a$ and $g_b$ from ${\cal{S}}$ and insert $g_{a \cup b}$ instead.

We iterate the process until we reach a given number of clusters $n_c$.

We can also stop the algorithm if we reach a given variance so that we can control the quality of the model. However, if we set this variance too low, we stop the algorithm too quickly and most of the resulting groups will only contain one pathlet, making the use of pathlet models superfluous in our framework. We prefer to control the number of pathlet models so that we can control the computational complexity of the remaining steps in the framework.

Figure 5.11 summarises the greedy algorithm.

Figure 5.11: The greedy algorithm.
\begin{figure}\begin{center}
\framebox{
\shortstack[l]{
Create a pathlet group s...
...{\tabfig}Remove $g_a$ and $g_b$ from ${\cal{S}}$
}
}
\end{center}
\end{figure}


next up previous index
Next: The results Up: Grouping the pathlets Previous: The normalised cut algorithm   Index

franck 2006-10-01