As variable length Markov models (VLMMs) were designed to learn text, that is, sequences of letters, they are easier to understand if we keep the original terminology and map it to our problem. The next sections therefore use the following definitions.
An alphabet is a set of predefined distinct entities. For sequences of letters, the entities are letters. For sequences of faces, the entities are the prototypes defined in 4.3. We refer to these entities as letters, in keeping with the original definitions given by Ron et al. [51].
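$|\Sigma|$ denotes the size of the alphabet $\Sigma$.
A string is a sequence of letters and is denoted by $s = s_1 s_2 \ldots s_l$, where $l$ is the length of the sequence. We denote by $e$ the empty string. $\Sigma^*$ is the set of all strings over $\Sigma$, $\Sigma^N$ is the set of all strings of length $N$ over $\Sigma$, and $\Sigma^{\leq N}$ is the set of all strings of length at most $N$ over $\Sigma$.
A prefix of a string $s$ of length $l$ is denoted by $\mathrm{prefix}(s) = s_1 s_2 \ldots s_{l-1}$. In the same manner, a suffix of a string $s$ of length $l$ is denoted by $\mathrm{suffix}(s) = s_2 s_3 \ldots s_l$. The set of all suffixes of a string $s$ of length $l$ is $\mathrm{suffix}^*(s) = \{ s_i \ldots s_l \mid 1 \leq i \leq l \} \cup \{ e \}$.
A string $s'$ is a suffix extension of $s$ if and only if $s$ is a suffix of $s'$, that is, $s \in \mathrm{suffix}^*(s')$.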
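The definitions above can be made concrete with a minimal Python sketch. It rests on several assumptions that are not stated in the text: a string is represented as a tuple of letters (for example prototype indices), the prefix drops the last letter and the suffix drops the first, following the conventions of Ron et al. [51], and the set of all suffixes includes the string itself and the empty string. All function names are hypothetical and chosen for illustration only.

```python
from itertools import product

EMPTY = ()  # the empty string, represented here as an empty tuple


def prefix(s):
    """Assumed convention: s_1 ... s_{l-1}, i.e. s without its last letter."""
    return s[:-1]


def suffix(s):
    """Assumed convention: s_2 ... s_l, i.e. s without its first letter."""
    return s[1:]


def all_suffixes(s):
    """Set of all suffixes of s, including s itself and the empty string."""
    return {s[i:] for i in range(len(s) + 1)}


def is_suffix_extension(candidate, s):
    """True if candidate is a suffix extension of s, i.e. s is a suffix of candidate."""
    return s in all_suffixes(candidate)


def strings_up_to(alphabet, n):
    """All strings of length at most n over the alphabet, as tuples."""
    return [w for k in range(n + 1) for w in product(alphabet, repeat=k)]
```

With the three-letter string `(0, 1, 2)`, `all_suffixes` returns four strings: the string itself, `(1, 2)`, `(2,)` and the empty string. Over a two-letter alphabet, `strings_up_to` with `n = 2` enumerates 1 + 2 + 4 = 7 strings.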