Comparison of trees built using the Kullback-Leibler divergence

We repeated the experiment, this time using the Kullback-Leibler (KL) divergence to compare probability distributions. Figures 6.7, 6.8 and 6.9 show the resulting trees.
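As a reminder of what is being compared, the following is a minimal sketch of the KL divergence between two discrete distributions, $D(p\,\Vert\,q)=\sum_x p(x)\log\frac{p(x)}{q(x)}$. The distributions `parent` and `child` are hypothetical values chosen for illustration, and the decision rule (keep a longer context when its divergence from the shorter one exceeds $\epsilon$) is an assumption about how the threshold is applied, not a description of the exact implementation used for the figures.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q), in nats, for discrete distributions.

    p and q are dicts mapping symbols to probabilities. Terms with p(x) == 0
    contribute nothing; a symbol with p(x) > 0 and q(x) == 0 gives infinity.
    """
    total = 0.0
    for symbol, px in p.items():
        if px == 0.0:
            continue
        qx = q.get(symbol, 0.0)
        if qx == 0.0:
            return math.inf
        total += px * math.log(px / qx)
    return total


EPSILON = 0.003  # threshold quoted in the figure captions

# Hypothetical next-symbol distributions (illustrative values only):
# `parent` for a short context, `child` for a longer context extending it.
parent = {"a": 0.5, "b": 0.3, "c": 0.2}
child = {"a": 0.7, "b": 0.2, "c": 0.1}

d = kl_divergence(child, parent)

# Assumed decision rule: the longer context is worth a node of its own
# only if its distribution differs enough from the parent's.
keep_child = d > EPSILON
print(f"D(child || parent) = {d:.4f}, keep child: {keep_child}")
```

Note that the divergence is not symmetric, so the order of the two arguments matters, and that zero probabilities in the second argument can make it infinite.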

**Figure 6.7:** VLMM tree learnt using the maximum likelihood estimation of probability and the Kullback-Leibler divergence. $\epsilon$ is set to 0.003 and only the probabilities that are greater than 0.003 are shown on the graph.
$\includegraphics[width=145mm,keepaspectratio]{ranktrees/t_vel_0_0.003.eps}$

**Figure 6.8:** VLMM tree learnt using the Lidstone estimation of probability with $\lambda =0.5$ and the Kullback-Leibler divergence. $\epsilon$ is set to 0.003 and only the probabilities that are greater than 0.003 are shown on the graph.
$\includegraphics[width=145mm,keepaspectratio]{ranktrees/t_vel_0.5_0.003.eps}$

**Figure 6.9:** VLMM tree learnt using the Laplace estimation of probability and the Kullback-Leibler divergence. $\epsilon$ is set to 0.003 and only the probabilities that are greater than 0.003 are shown on the graph.
$\includegraphics[width=145mm,keepaspectratio]{ranktrees/t_vel_1_0.003.eps}$

With the KL divergence, the method used to estimate the probabilities seems to have less influence on the result than in the previous section. With the maximum likelihood estimate, the tree does not change significantly compared to the corresponding tree in the previous section. The other two trees (Lidstone and Laplace estimates) have grown.
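The three estimators compared in the figures can be sketched as a single function, assuming the standard form of the Lidstone estimate, $P(x)=\frac{c(x)+\lambda}{N+\lambda\,|A|}$, where $c(x)$ is the count of symbol $x$, $N$ the total count and $|A|$ the alphabet size. The function name, the counts and the alphabet below are hypothetical and serve only to show how $\lambda$ redistributes probability mass when few observations are available.

```python
def lidstone(counts, alphabet, lam):
    """Lidstone estimate P(x) = (c(x) + lam) / (N + lam * |alphabet|).

    lam = 0 gives the maximum likelihood estimate and lam = 1 the Laplace
    estimate. With lam = 0 the total count must be non-zero.
    """
    total = sum(counts.get(s, 0) for s in alphabet)
    denom = total + lam * len(alphabet)
    return {s: (counts.get(s, 0) + lam) / denom for s in alphabet}


# Hypothetical counts observed after some context (illustrative values only).
alphabet = ["a", "b", "c", "d"]
counts = {"a": 3, "b": 1}

ml = lidstone(counts, alphabet, 0.0)   # maximum likelihood
lid = lidstone(counts, alphabet, 0.5)  # Lidstone with lambda = 0.5
lap = lidstone(counts, alphabet, 1.0)  # Laplace

for name, dist in (("ML", ml), ("Lidstone 0.5", lid), ("Laplace", lap)):
    print(name, {s: round(p, 3) for s, p in dist.items()})
```

With only four observations, the maximum likelihood estimate assigns zero probability to the unseen symbols, whereas the two smoothed estimates give them a non-zero share. This is the kind of difference that matters most when the training data is small.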

The aim of the variable length Markov model is to reduce the number of links required to model the probability distribution. The size of the tree directly affects learning: the more nodes the tree contains, the more nodes the learning algorithm has to check. The KL divergence may therefore give us a less efficient tree.

Because only a small amount of data was used to construct these trees, the larger trees obtained with the KL divergence in the last two cases may be the result of over-fitting the text. To check whether this is the case, a further experiment was carried out.

franck 2006-10-01