


Assessment of the prediction with a small amount of data

The experiment was repeated a second time, this time on the text $T_3$, which is the smallest one. Results are reported in tables 6.3 and 6.4 for the Matusita distance and the KL divergence respectively.


Table 6.3: Prediction accuracy of a VLMM trained on the short text $T_3$, using the Matusita distance. The case $\lambda=1$ failed completely, since it only predicted spaces.
$\lambda$    $T_1$       $T_2$       $T_3$
0            $20.2\%$    $22.1\%$    $30.8\%$
0.05         $23.7\%$    $25.5\%$    $35\%$
0.1          $24.0\%$    $24.8\%$    $32.7\%$
1            $16.6\%$    $16.5\%$    $18.2\%$



Table 6.4: Prediction accuracy of a VLMM trained on the short text $T_3$, using the KL divergence.
$\lambda$    $T_1$       $T_2$       $T_3$
0            $20.9\%$    $23\%$      $31.6\%$
0.5          $28.3\%$    $30.1\%$    $52.4\%$
1            $28.3\%$    $30.1\%$    $52.4\%$


These tables show that the KL divergence performs better this time. Combined with Laplace's law of succession (the case $\lambda=1$), it is able to predict about $30\%$ of a text dealing with another subject ($T_1$ and $T_2$).
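For reference, Laplace's law of succession corresponds, in its standard form, to the estimate $\hat{P}(\sigma \mid s) = \frac{N(s\sigma)+1}{N(s)+|\Sigma|}$, where $N(s\sigma)$ denotes the number of times the symbol $\sigma$ followed the context $s$ in the training text and $\Sigma$ is the alphabet (this notation is assumed here, not taken from this section). Every symbol thus receives a non-zero probability, even if it never appeared after $s$.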

In the case of the Matusita distance, the best performance is achieved with $\lambda = 0.05$. It seems that this time the maximum-likelihood estimate could not explain the data on its own: a small amount of uniform prior was needed. This is because there are few training examples. The frequencies counted in the training sequence cannot be trusted as much as when training on a large dataset, so a part of uniform estimation must be blended in, which is done by increasing the value of $\lambda$.
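The exact way $\lambda$ enters the estimator is not restated in this section. A reading consistent with the tables ($\lambda=0$ giving the pure maximum-likelihood estimate and $\lambda=1$ giving Laplace's law of succession) is additive smoothing, $\hat{P}(\sigma \mid s) = \frac{N(s\sigma)+\lambda}{N(s)+\lambda|\Sigma|}$. A minimal sketch of this estimator, using hypothetical counts, could look as follows:

from collections import Counter

def smoothed_prob(counts, symbol, alphabet_size, lam):
    # Additive (lambda) smoothing of the maximum-likelihood estimate:
    #   lam = 0 -> raw frequencies (pure maximum likelihood)
    #   lam = 1 -> Laplace's law of succession
    # Larger lam pulls the estimate towards the uniform distribution.
    total = sum(counts.values())
    return (counts[symbol] + lam) / (total + lam * alphabet_size)

# Hypothetical counts of the symbols seen after some context in a short text
counts = Counter({"e": 7, "a": 2, " ": 1})
alphabet_size = 27  # e.g. 26 letters plus the space

for lam in (0.0, 0.05, 1.0):
    p = smoothed_prob(counts, "z", alphabet_size, lam)
    print(f"lambda={lam}: P(z | context) = {p:.4f}")

With $\lambda=0$, a symbol never seen after the context gets probability zero; a small value such as $\lambda=0.05$ keeps every symbol possible without drowning the observed frequencies, which matches the behaviour seen in table 6.3.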

