General idea

The idea behind the learning algorithm is that different sequences require different lengths of memory to predict the next element. For instance, if learning has been performed on many instances of sequences such as {a,c,b,d,e,f}, {e,c,b,d,e,f} or {d,c,b,d,e,f}, then we would like to use a memory of four letters, {c,b,d,e}, in order to predict the next letter ``f''. If only sequences like {a,l,k,j,e,f}, {s,d,l,l,e,f} or {s,l,o,a,e,f} occur, then we want to use only the information necessary to predict ``f'', that is, the one-letter sequence {e}. We do not model patterns in the sequence that are not important for the prediction.
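
As a small illustration, the following Python sketch (the corpora and the helper name next_symbol_counts are ours, introduced only for this example) counts which symbols follow each context of a given length. On the first corpus the four-letter context {c,b,d,e} is needed to determine ``f'', while on the second the single letter {e} already suffices.

    from collections import Counter, defaultdict

    def next_symbol_counts(sequences, context_len):
        """Count which symbols follow each context of the given length."""
        counts = defaultdict(Counter)
        for seq in sequences:
            for i in range(context_len, len(seq)):
                context = tuple(seq[i - context_len:i])
                counts[context][seq[i]] += 1
        return counts

    # First corpus: the four letters (c, b, d, e) are needed to predict 'f'.
    corpus_a = [list("acbdef"), list("ecbdef"), list("dcbdef")]
    print(next_symbol_counts(corpus_a, 4)[tuple("cbde")])   # Counter({'f': 3})

    # Second corpus: the single letter 'e' already predicts 'f'.
    corpus_b = [list("alkjef"), list("sdllef"), list("sloaef")]
    print(next_symbol_counts(corpus_b, 1)[tuple("e")])      # Counter({'f': 3})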

More formally, we wish to learn a model of the probability distribution of strings in the sequence, as suggested by Figure 6.2(b), but with a minimum number of nodes. The idea is to build a tree that yields the same probability values as a previously chosen probability measure on the learning sequence. Section 6.3.3 describes the probability measures that we have tried on a simplified example and how they performed. For now, we denote this probability measure by $\tilde{P}$.
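
A minimal sketch of this tree-building idea follows, under two assumptions of ours that stand in for the choices of Section 6.3.3: empirical frequencies play the role of $\tilde{P}$, and a crude total-variation threshold plays the role of the actual comparison criterion. A context node is kept only when lengthening the context changes the predicted next-symbol distribution, which is what keeps the number of nodes small.

    from collections import Counter, defaultdict

    def empirical_dist(counter):
        """Normalise symbol counts into a probability distribution
        (a simple stand-in for the measure P-tilde of Section 6.3.3)."""
        total = sum(counter.values())
        return {s: c / total for s, c in counter.items()}

    def differs(p, q, eps=0.1):
        """Crude total-variation test; the criterion actually used
        (Section 6.3.3) would replace this."""
        symbols = set(p) | set(q)
        return sum(abs(p.get(s, 0) - q.get(s, 0)) for s in symbols) / 2 > eps

    def grow_tree(sequences, max_depth=5, min_count=2):
        """Keep a context node only if lengthening the context changes
        the predicted distribution: the 'minimum number of nodes' idea."""
        counts = defaultdict(Counter)
        for seq in sequences:
            for i in range(len(seq)):
                for k in range(min(max_depth, i) + 1):
                    counts[tuple(seq[i - k:i])][seq[i]] += 1
        tree = {(): empirical_dist(counts[()])}   # root: empty context
        frontier = [()]
        while frontier:
            ctx = frontier.pop()
            # Candidate children extend the context one symbol further
            # into the past, i.e. longer[1:] == ctx.
            for longer in (c for c in counts
                           if len(c) == len(ctx) + 1 and c[1:] == ctx):
                if sum(counts[longer].values()) >= min_count and \
                   differs(empirical_dist(counts[longer]), tree[ctx]):
                    tree[longer] = empirical_dist(counts[longer])
                    frontier.append(longer)
        return tree

Run on the first corpus above, such a tree grows deep contexts along c,b,d,e because each extension changes the prediction; run on the second corpus, it stops at the single-letter context {e}.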

