


Assessment of the prediction with a small amount of data

The experiment was repeated a second time, this time on the text $T_3$, which is the smallest one. Results are reported in tables 6.3 and 6.4 for the Matusita distance and the KL divergence respectively.


Table 6.3: Prediction accuracy of a VLMM trained on the short text $T_3$, using the Matusita distance. The case $\lambda=1$ failed completely, since it only predicted spaces.
$\lambda$    $T_1$       $T_2$       $T_3$
0            $20.2\%$    $22.1\%$    $30.8\%$
0.05         $23.7\%$    $25.5\%$    $35\%$
0.1          $24.0\%$    $24.8\%$    $32.7\%$
1            $16.6\%$    $16.5\%$    $18.2\%$



Table 6.4: Prediction accuracy of a VLMM trained on the short text $T_3$, using the KL divergence.
$\lambda$    $T_1$       $T_2$       $T_3$
0            $20.9\%$    $23\%$      $31.6\%$
0.5          $28.3\%$    $30.1\%$    $52.4\%$
1            $28.3\%$    $30.1\%$    $52.4\%$


These tables show that the KL divergence performs better this time. Combined with Laplace's law of succession (the case $\lambda=1$), it is able to predict about $30\%$ of a text dealing with another subject ($T_1$ and $T_2$).
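For reference, Laplace's law of succession corresponds, in its standard form, to the estimate $\hat{P}(\sigma \mid s) = \frac{N(s\sigma)+1}{N(s)+|\Sigma|}$, where $N(s\sigma)$ denotes the number of times the symbol $\sigma$ followed the context $s$ in the training text and $\Sigma$ is the alphabet (this notation is assumed here, not taken from this section). Every symbol thus receives a non-zero probability, even if it never appeared after $s$.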

In the case of the Matusita distance, the best performance is achieved with $\lambda = 0.05$. It seems that this time the maximum-likelihood estimate could not explain the data on its own: a small amount of uniform prior was needed. This is because there are few training examples. The frequencies counted in the training sequence cannot be trusted as much as when training on a large dataset, so a part of uniform estimation must be blended in, which is done by increasing the value of $\lambda$.
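The exact way $\lambda$ enters the estimator is not restated in this section. A reading consistent with the tables ($\lambda=0$ giving the pure maximum-likelihood estimate and $\lambda=1$ giving Laplace's law of succession) is additive smoothing, $\hat{P}(\sigma \mid s) = \frac{N(s\sigma)+\lambda}{N(s)+\lambda|\Sigma|}$. A minimal sketch of this estimator, using hypothetical counts, could look as follows:

from collections import Counter

def smoothed_prob(counts, symbol, alphabet_size, lam):
    # Additive (lambda) smoothing of the maximum-likelihood estimate:
    #   lam = 0 -> raw frequencies (pure maximum likelihood)
    #   lam = 1 -> Laplace's law of succession
    # Larger lam pulls the estimate towards the uniform distribution.
    total = sum(counts.values())
    return (counts[symbol] + lam) / (total + lam * alphabet_size)

# Hypothetical counts of the symbols seen after some context in a short text
counts = Counter({"e": 7, "a": 2, " ": 1})
alphabet_size = 27  # e.g. 26 letters plus the space

for lam in (0.0, 0.05, 1.0):
    p = smoothed_prob(counts, "z", alphabet_size, lam)
    print(f"lambda={lam}: P(z | context) = {p:.4f}")

With $\lambda=0$, a symbol never seen after the context gets probability zero; a small value such as $\lambda=0.05$ keeps every symbol possible without drowning the observed frequencies, which matches the behaviour seen in table 6.3.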

