We now want to learn the structure of the sequence of sub-trajectory groups. To do so, we use the variable-length Markov model introduced by Ron et al. [11] in the context of learning letter sequences in English texts. The idea of this model is to use a memory whose length varies with the context. In particular, when a trajectory splits into two trajectories, we need to know where the sub-trajectories come from in order to decide which sub-trajectory to infer next; a memory of a long sequence of sub-trajectory groups is needed in that case. On the other hand, if group $g_1$ is always followed by group $g_2$, a short memory suffices (stating that if the current group is $g_1$, then $g_2$ is next).
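The variable-length memory idea can be illustrated with a minimal sketch (the class and symbol names here are invented for illustration, not the authors' implementation): contexts of different lengths map to next-symbol counts, and prediction uses the longest context that matches the recent history.

```python
from collections import defaultdict

class VLMMSketch:
    """Toy variable-length Markov predictor (illustrative only)."""

    def __init__(self, max_order=4):
        self.max_order = max_order
        # context (tuple of past symbols) -> {next symbol: count}
        self.counts = defaultdict(lambda: defaultdict(int))

    def train(self, sequence):
        # Record, for every position, which symbol followed each
        # context of length 0..max_order ending at that position.
        for i in range(len(sequence)):
            for k in range(self.max_order + 1):
                if k > i:
                    break
                context = tuple(sequence[i - k:i])
                self.counts[context][sequence[i]] += 1

    def predict(self, history):
        # The longest suffix of the history seen during training decides
        # the next symbol: an ambiguous split needs a long context, while
        # a deterministic "g1 is always followed by g2" needs only length 1.
        for k in range(min(self.max_order, len(history)), -1, -1):
            context = tuple(history[len(history) - k:])
            if context in self.counts:
                dist = self.counts[context]
                return max(dist, key=dist.get)
        return None

model = VLMMSketch(max_order=3)
model.train(list("ABCABDABCABD"))
print(model.predict(list("ABC")))  # "ABC" was always followed by "A"
```

Here the symbols play the role of sub-trajectory group labels; the sketch omits the probability threshold and the statistical test on the tree discussed next.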
The probabilities of sequences of sub-trajectory groups are stored in a tree, each branch of which corresponds to a sequence. The tree is constructed recursively by adding sequences. A sequence $s$ is added to the tree if and only if its probability exceeds a previously chosen threshold and adding $s$ yields a statistically different tree. Two trees are said to be statistically different if the probability distributions they encode, weighted by the probability of the sequence $s$, differ. We therefore need a measure of distance between probability distributions. The Kullback-Leibler divergence is usually used with the variable-length Markov model. However, when testing the model on the generation of English texts, we found that the Matusita measure gives better results on long texts (about 33% of letters were successfully predicted, against 26% for the Kullback-Leibler divergence). The Matusita measure is given by: