The feasibility of the MDL model selection
criterion is illustrated by the following demonstration. We use the
criterion for estimating the appropriate order of a Markov chain.
Recall that a Markov chain is a process that generates symbols where the
probability distribution of each symbol depends on the previous symbols
only through the k immediate predecessors. The value k is
called the order of the chain. Hence, for instance, in a
0-order Markov chain the symbols are independent of each other.
By typing an arbitrary sequence of characters in the
text area below you can see code-lengths for orders 1 to 12. The
code-length for each order is obtained from a mixture code with a
Dirichlet(1/2, ..., 1/2) prior on the transition probabilities. Except
for the strings consisting of one character repeated n times,
e.g., 0000000, the code-lengths of the mixture code are very close to
the code-lengths of the NML code.
The model(s) giving the shortest code-length to the typed sequence is
shown in green.