On-line MDL Demonstrations

Markov Chain Order Estimation

The feasibility of the MDL model selection criterion is illustrated by the following demonstration. We use the criterion for estimating the appropriate order of a Markov chain. Recall that a Markov chain is a process that generates symbols where the probability distribution of each symbol depends on the previous symbols only through the k immediate predecessors. The value k is called the order of the chain. Hence, for instance, in a 0-order Markov chain the symbols are independent of each other.

By typing an arbitrary sequence of characters in the text area below you can see code-lengths for orders 1 to 12. The code-length for each order is obtained from a mixture code with a Dirichlet(1/2, ..., 1/2) prior on the transition probabilities. Except for the strings consisting of one character repeated n times, e.g., 0000000, the code-lengths of the mixture code are very close to the code-lengths of the NML code.

The model(s) giving the shortest code-length to the typed sequence is shown in green.

Links to Demonstrations/Software on the Web

The Statistical Data Viewer (by Volker Nannen)

The LExAu project (by Jeroen van Maanen)