ORGANIZERS
- Tim van Erven, CWI, Amsterdam (Tim dot van dot Erven at cwi dot nl)
- Peter Grünwald, CWI, Amsterdam (pdg at cwi dot nl)
- Petri Myllymäki, University of Helsinki (Petri dot Myllymaki at cs dot helsinki dot fi)
- Teemu Roos, HIIT, Helsinki (Teemu dot Roos at cs dot helsinki dot fi)
- Ioan Tabus, Tampere University of Technology (tabus at tut dot fi)
MOTIVATION
During the last few years (2004-2007), there have been
several breakthroughs in the area of Minimum Description Length (MDL)
modeling, learning and prediction. These breakthroughs concern the
efficient computation and proper formulation of MDL in parametric
problems based on the "normalized maximum likelihood", as well as
altogether new, and better, coding schemes for nonparametric problems.
This essentially solves the so-called AIC-BIC dilemma, which has been
a central problem in statistical model selection for more than 20
years now. The goal of this workshop is to introduce these exciting
new developments to the ML and UAI communities, and to foster new
collaborations between interested researchers.
Most new developments that are the focus of this workshop concern
efficient (in many cases, linear-time) algorithms for theoretically
optimal inference procedures that were previously thought not to be
efficiently solvable. It is therefore hoped that the workshop will
inspire original practical applications of MDL in machine learning
domains. Developing such applications has recently become much
easier thanks to the new (2007) book on MDL by P. Grünwald [2],
which provides the first comprehensive overview of the field, as well
as in-depth discussions of how it relates to other approaches such as
Bayesian inference. Remarkably, the originator of MDL, J. Rissanen,
also published a new monograph in 2007, and a Festschrift in honor of
Rissanen's 75th birthday was presented to him in May 2008.
PROGRAM OVERVIEW (preliminary)
We start the workshop with a 1-hour tutorial by Peter Grünwald, with
particular emphasis on the breakthroughs mentioned above. This will be
followed by invited sessions. The workshop will conclude with a panel
discussion.
- 09:00–10:10
  - Peter Grünwald: MDL tutorial.
- 10:10–10:30
  - questions & discussion
- 10:30–11:00
  - coffee break
- 11:00–12:30
  - Petri Myllymäki: Fast computation of NML for Bayesian networks [5, 6].
  - Steven de Rooij: Nonparametric density estimation by switching [1].
  - Janne Ojanen: Extensions to MDL denoising [10].
- 12:30–14:30
  - lunch break
- 14:30–16:00
  - Tomi Silander: Sequential and factorized NML models [3, 4].
  - Tong Zhang: Generalization theory of two-part code MDL estimator.
  - Ioan Tabus: Normalized maximum likelihood models in genomics [8, 9].
- 16:00–16:30
  - coffee break
- 16:30–17:00
  - Matthias Seeger: Information consistency of nonparametric Gaussian process methods [7].
- 17:00–17:30
  - panel discussion
- 18:00–20:00
  - reception
INVITED SPEAKERS (confirmed)
- Peter Grünwald (CWI, Amsterdam)
- Petri Myllymäki (University of Helsinki)
- Steven de Rooij (EURANDOM, Eindhoven)
- Janne Ojanen (Helsinki University of Technology)
- Tomi Silander (University of Helsinki & HIIT)
- Tong Zhang (Rutgers University)
- Ioan Tabus (Tampere University of Technology)
- Matthias Seeger (Max Planck Tuebingen)
RELATED PUBLICATIONS
[1] T. van Erven, P.D. Grünwald, and S. de Rooij. Catching up faster in Bayesian model selection and model averaging. Advances in Neural Information Processing Systems 20 (NIPS 2007).
[2] P.D. Grünwald. The Minimum Description Length Principle. MIT Press, June 2007. 570 pages.
[3] J. Rissanen and T. Roos. Conditional NML universal models. 2007 Information Theory and Applications Workshop (ITA-07), pp. 337-341.
[4] T. Roos, T. Silander, P. Kontkanen, and P. Myllymäki. Bayesian network structure learning using factorized NML universal models. 2008 Information Theory and Applications Workshop (ITA-08).
[5] P. Kontkanen and P. Myllymäki. A linear-time algorithm for computing the multinomial stochastic complexity. Information Processing Letters 103(6), September 2007, 227-233.
[6] P. Kontkanen and P. Myllymäki. MDL histogram density estimation. In Proc. 11th International Conference on Artificial Intelligence and Statistics (AISTATS 2007), Puerto Rico, March 2007.
[7] M. Seeger, S. Kakade, and D. Foster. Information consistency of nonparametric Gaussian process methods. IEEE Transactions on Information Theory 54(5), 2008, 2376-2382.
[8] I. Tabus and G. Korodi. Genome compression using normalized maximum likelihood models for constrained Markov sources. IEEE Information Theory Workshop, Porto, Portugal, May 5-9, 2008.
[9] Y. Yang and I. Tabus. Haplotype block partitioning using a normalized maximum likelihood model. In Proc. IEEE International Workshop on Genomic Signal Processing and Statistics, Tuusula, Finland, June 10-12, 2007.
[10] V. Kumar, J. Heikkonen, J. Rissanen, and K. Kaski. Minimum description length denoising with histogram models. IEEE Transactions on Signal Processing 54(8), 2006, 2922-2928.