The Kullback-Leibler divergence

This measure of distance between probabilities is well known in the area of information theory. Let $p$ and $q$ be two probability measures; the Kullback-Leibler divergence is given by:

\begin{displaymath}D_{KL}(p\Vert q)=\int p(x) \ln \left( \frac{p(x)}{q(x)} \right) \; dx\end{displaymath}

where the variable $x$ ranges over the entire space. Since we have modeled the probabilities $\tilde{P}$ by discrete values over a known alphabet $\Sigma$, we have:

\begin{displaymath}D\left(\tilde{P}(\cdot\vert\sigma s)\Vert\tilde{P}(\cdot\vert s)\right)=\sum_{\sigma'\in \Sigma}\tilde{P}(\sigma'\vert\sigma s)\ln\left(\frac{\tilde{P}(\sigma'\vert\sigma s)}{\tilde{P}(\sigma'\vert s)}\right)\end{displaymath}
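As a minimal sketch of this discrete divergence (the dict-based representation and the function name are illustrative, not from the text), the sum over the alphabet $\Sigma$ could be computed as:

```python
import math

def kl_divergence(p, q):
    """Discrete Kullback-Leibler divergence D(p || q).

    p, q: dicts mapping each symbol of the alphabet Sigma to its probability.
    Terms where p[sigma] == 0 contribute nothing; q must be nonzero
    wherever p is nonzero, otherwise the divergence is infinite.
    """
    total = 0.0
    for symbol, p_val in p.items():
        if p_val > 0.0:
            total += p_val * math.log(p_val / q[symbol])
    return total
```

For example, `kl_divergence({'a': 0.5, 'b': 0.5}, {'a': 0.5, 'b': 0.5})` is `0.0`, and the divergence is strictly positive whenever the two distributions differ.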

Weighting this divergence by $\tilde{P}(\sigma s)$ and rewriting the conditional probabilities as ratios of joint probabilities, $\tilde{P}(\sigma'\vert\sigma s)=\tilde{P}(\sigma s\sigma')/\tilde{P}(\sigma s)$ and $\tilde{P}(\sigma'\vert s)=\tilde{P}(s\sigma')/\tilde{P}(s)$, we obtain:

\begin{displaymath}Err(\sigma s,s)=\sum_{\sigma'\in \Sigma}\tilde{P}(\sigma s\sigma')\ln\left(\frac{\tilde{P}(\sigma s\sigma')\tilde{P}(s)}{\tilde{P}(\sigma s)\tilde{P}(s\sigma')}\right)\end{displaymath}
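The error measure above can be sketched directly from the joint probabilities. In this hypothetical representation (the dict keyed by strings and the function name are assumptions for illustration), `p_tilde[w]` holds the estimated probability $\tilde{P}(w)$ of the string `w`, and the contexts $\sigma s$ and $s$ are passed as strings:

```python
import math

def err(p_tilde, sigma_s, s, alphabet):
    """Err(sigma s, s): P~(sigma s)-weighted KL divergence between the
    next-symbol distributions conditioned on sigma s and on s.

    p_tilde: dict mapping strings w to estimated probabilities P~(w)
    sigma_s, s: context strings (sigma_s is s extended by one symbol sigma)
    alphabet: iterable of the symbols sigma' in Sigma
    """
    total = 0.0
    for sp in alphabet:
        p_long = p_tilde.get(sigma_s + sp, 0.0)  # P~(sigma s sigma')
        if p_long > 0.0:
            ratio = (p_long * p_tilde[s]) / (p_tilde[sigma_s] * p_tilde[s + sp])
            total += p_long * math.log(ratio)
    return total
```

When the next-symbol distributions after $\sigma s$ and after $s$ coincide, every log term vanishes and the error is zero, which matches the intuition that the longer context $\sigma s$ then adds no information over $s$.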

This measure of distance between probabilities has been used in [51] together with Laplace's law of succession in order to correct corrupted texts using a variable-length Markov model.

We can also notice that the sum in the error measure has $\vert\Sigma\vert$ terms, so each distance computation is computationally expensive.



franck 2006-10-16