This chapter has described the theory behind the variable-length Markov model (VLMM) and its learning algorithm. One drawback of the VLMM is the choice of its threshold parameter; it would be preferable to estimate this parameter automatically and thus obtain a parameter-free algorithm.
Tino and Dorffner [32] address this problem with a completely different way of constructing the tree in the learning algorithm. They use a metric that expresses how close two sequences are. This metric requires a parameter that can be interpreted as a learning parameter, used in the same way as the learning parameter of temporal-difference learning [45]. Vector quantization is then applied to group sub-sequences into clusters that share the same suffix structure according to this metric. Unfortunately, this approach does not eliminate the need for a parameter, although this parameter may be easier to tune. Experiments show that the results are sometimes better than those of the VLMM algorithm and sometimes equivalent.
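The construction in [32] is more involved than what fits here, but the core idea of clustering sub-sequences under a suffix-weighted metric can be sketched as follows. The function names, the discount parameter `lam` (playing the role of the tunable learning parameter), and the medoid-style quantization step are illustrative assumptions, not the authors' exact procedure:

```python
def suffix_distance(s, t, lam=0.5, depth=8):
    """Suffix-weighted distance between two sequences: a mismatch on the
    most recent symbol costs 1, and each step further into the past is
    discounted by a factor lam (the metric's tunable parameter)."""
    d = 0.0
    for i in range(depth):
        a = s[-1 - i] if i < len(s) else None
        b = t[-1 - i] if i < len(t) else None
        d += (lam ** i) * (a != b)
    return d

def quantize(subseqs, k, lam=0.5, iters=10):
    """Medoid-style vector quantization of sub-sequences under the
    suffix-weighted metric, so that sequences ending alike share a cluster."""
    # deterministic initialisation: the first k distinct sub-sequences
    centers = []
    for s in subseqs:
        if s not in centers:
            centers.append(s)
        if len(centers) == k:
            break
    for _ in range(iters):
        # assign every sub-sequence to its nearest centre
        groups = [[] for _ in centers]
        for s in subseqs:
            j = min(range(len(centers)),
                    key=lambda c: suffix_distance(s, centers[c], lam))
            groups[j].append(s)
        # replace each centre by the member closest to the rest of its group
        for j, g in enumerate(groups):
            if g:
                centers[j] = min(g, key=lambda s: sum(suffix_distance(s, t, lam)
                                                      for t in g))
    return centers, groups
```

Because a mismatch far from the end of a sequence is cheap, two sub-sequences with a long common suffix land in the same cluster, mirroring the suffix structure that a VLMM tree captures explicitly.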
Different measures of probability and different distances for comparing probabilities have been presented. The next chapter reports the results of these comparisons applied to the modelling of letter sequences, and also shows some examples of generated text.