For these experiments, the VLMM was trained on a text of 2144 letters that tells the story of penguins and other animals.
Figures 6.3, 6.4, 6.5 and 6.6 show the resulting trees when training uses the Matusita distance. The Lidstone estimate of probability has been used, with $\lambda$ varying from 0 to 1. The case $\lambda = 0$ corresponds to the maximum likelihood estimate and the case $\lambda = 1$ corresponds to Laplace's law of succession.
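The Lidstone estimate can be sketched as follows; this is only an illustration (the function name and the miniature example text are ours, and the alphabet is assumed to be the 26 lowercase letters):

```python
from collections import Counter

def lidstone_probs(text, alphabet, lam):
    """Lidstone estimate: P(c) = (n_c + lam) / (N + lam * |alphabet|).

    lam = 0 gives the maximum likelihood estimate; lam = 1 gives
    Laplace's law of succession (uniform prior over the alphabet).
    """
    counts = Counter(text)
    denom = len(text) + lam * len(alphabet)
    return {c: (counts[c] + lam) / denom for c in alphabet}

# Miniature example (not the 2144-letter training text):
alphabet = "abcdefghijklmnopqrstuvwxyz"
probs_ml = lidstone_probs("penguin", alphabet, 0.0)   # maximum likelihood
probs_lap = lidstone_probs("penguin", alphabet, 1.0)  # Laplace
```

With $\lambda = 0$ an unseen letter such as ``q'' gets probability zero, while $\lambda = 1$ gives every letter of the alphabet a non-zero probability, whatever the text.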
[Figures 6.3--6.6: the trees obtained for increasing values of $\lambda$.]
We can see that the topology of the tree changes considerably from the case $\lambda = 0$ to the case $\lambda = 1$.
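The Matusita distance that shapes these trees can be computed from two next-symbol distributions; the sketch below is illustrative (the helper name and the toy distributions are ours, and the pruning threshold used in the experiments is not shown):

```python
import math

def matusita(p, q):
    """Matusita distance between two distributions over the same alphabet:
    sqrt(sum_i (sqrt(p_i) - sqrt(q_i))**2).
    """
    support = set(p) | set(q)
    return math.sqrt(sum((math.sqrt(p.get(c, 0.0)) - math.sqrt(q.get(c, 0.0))) ** 2
                         for c in support))

# Identical distributions are at distance 0; disjoint ones at sqrt(2).
d_same = matusita({"a": 0.5, "b": 0.5}, {"a": 0.5, "b": 0.5})
d_far = matusita({"a": 1.0}, {"b": 1.0})
```

A context is worth keeping in the tree only when its next-symbol distribution is far enough, in this sense, from that of its parent.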
For the case $\lambda = 1$ (figure 6.6), the depth of the tree is one, which means that the model does not take histories into account: it only models the probability of each letter of the alphabet. We can notice that some letters, like ``j'' or ``q'', are not in the tree because of their low probability in English texts, and in particular in our example text. As we have seen before, this is the problem with Laplace's law of succession: it is built on the assumption of a uniform prior.
In the case $\lambda = 0$ (figure 6.3), the tree seems to have learnt the text in a more appropriate manner. We can find parts of some words, like ``peng'', which comes from ``penguin'' (the main subject of the text). We can also find small words such as ``the'', ``of'', ``in'', ``on'' or ``as''.
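The kind of substrings that surface as contexts can be illustrated with a simple frequency count; this is only a sketch (the function, the thresholds and the toy text are ours, and a real VLMM additionally prunes contexts with a statistical criterion such as the Matusita distance):

```python
from collections import Counter

def frequent_contexts(text, max_depth, min_count):
    """Count every substring up to max_depth letters and keep the
    frequent ones. A VLMM keeps a context only if it is both frequent
    and predictive; this sketch shows the frequency part only.
    """
    counts = Counter(text[i:i + d]
                     for d in range(1, max_depth + 1)
                     for i in range(len(text) - d + 1))
    return {s for s, n in counts.items() if n >= min_count}

# On a penguin-themed toy text, fragments of "penguin" and common short
# words come out as frequent contexts:
toy = "the penguin and the penguin saw the penguin in the water"
ctx = frequent_contexts(toy, max_depth=4, min_count=3)
```

On a longer training text, the same effect produces the ``peng'' branch and the small function words observed in the tree.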
The two other cases (figures 6.4 and 6.5) are a transition between those two extremes.
This qualitative evaluation seems to favor the maximum likelihood estimate. A quantitative evaluation of these trees is given later.