


Learning temporal relationships between groups

We now want to learn the structure of the sequence of sub-trajectory groups. To do so, we use a variable-length Markov model, introduced by Ron et al. [11] in the context of learning letter sequences in English text. The idea of this model is to use a memory whose length varies with the context. In particular, when the trajectory splits into two trajectories, we need to know where the sub-trajectories are coming from in order to decide which sub-trajectory to infer next. In that case, a memory spanning a long sequence of sub-trajectory groups is needed. On the other hand, if group $A$ is always followed by group $B$, a short memory suffices (stating that if the current group is $A$, then $B$ is next).
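To illustrate the variable-length idea, the following sketch (illustrative Python, not the thesis implementation; all names are hypothetical) counts next-group frequencies for contexts of increasing length and predicts using the longest context seen in training. After a split, a two-group memory such as ('X', 'A') versus ('Y', 'A') disambiguates what the one-group memory ('A',) cannot.

```python
from collections import defaultdict

def train_vlmm(sequence, max_order):
    """Count next-group frequencies for every context (suffix of the
    history) up to max_order groups long."""
    counts = defaultdict(lambda: defaultdict(int))
    for i in range(len(sequence)):
        for k in range(max_order + 1):
            if k > i:
                break
            context = tuple(sequence[i - k:i])
            counts[context][sequence[i]] += 1
    return counts

def predict(counts, history):
    """Predict the next group using the longest stored suffix of the
    history; fall back to shorter memories when the long one is unseen."""
    for k in range(len(history), -1, -1):
        context = tuple(history[len(history) - k:])
        if context in counts:
            nxt = counts[context]
            return max(nxt, key=nxt.get)
    return None
```

With a sequence in which group A is followed by B after X but by C after Y, an order-2 context resolves the ambiguity that an order-1 context leaves open.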

The probabilities of sequences of sub-trajectory groups are stored in a tree, where each branch corresponds to a sequence. The tree is constructed recursively by adding sequences. A sequence $l$ is added to the tree if and only if its probability exceeds a previously chosen threshold and adding $l$ yields a statistically different tree. Two trees are said to be statistically different if the probability distributions they encode, weighted by the probability of the sequence $l$, differ. We therefore need a measure of distance between probability distributions. The Kullback-Leibler divergence is usually used with variable-length Markov models. However, when testing the model on the generation of English text, we found that the Matusita measure gives better results on long texts (about 33% of letters were successfully predicted, against 26% for the Kullback-Leibler measure). The Matusita measure is given by:

\begin{displaymath}D(p\Vert q)=\int \left(\sqrt{p(x)}-\sqrt{q(x)}\right)^2 \; dx\end{displaymath}

for two probability distributions $p$ and $q$. Here $p(x)$ is the probability of observing a group $x$ given the sequence $l$, and $q(x)$ is the probability of observing a group $x$ given the sub-sequence of $l$ that represents a shorter memory already encoded in the tree. The statistical difference thus tells us whether there is any advantage in using a longer memory for the current context. The algorithm stops when every sequence of length less than a predefined bound has been inspected.
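A direct transcription of the Matusita measure for discrete distributions, together with the weighted acceptance test described above (a hedged sketch; the function names and the exact form of the test are assumptions, not the thesis code):

```python
import math

def matusita(p, q):
    """Matusita distance between two discrete distributions, given as
    dicts mapping outcomes to probabilities: the sum over the joint
    support of (sqrt(p(x)) - sqrt(q(x)))^2."""
    support = set(p) | set(q)
    return sum((math.sqrt(p.get(x, 0.0)) - math.sqrt(q.get(x, 0.0))) ** 2
               for x in support)

def keep_longer_context(prob_l, p_long, p_short, threshold):
    """Accept the longer memory l only if its prediction, weighted by the
    probability of l, differs enough from the shorter memory's prediction."""
    return prob_l * matusita(p_long, p_short) > threshold
```

The distance is zero for identical distributions, so a longer context that predicts exactly like its shorter suffix is never added to the tree.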



franck 2006-10-01