Grammatical Inference: Learning Automata and Grammars
About the topic
Grammatical inference is about learning automata and grammars given information about the language.
During this tutorial we will introduce the problem and its applications, study the various paradigms and
related learnability results and discuss some of the most important algorithms in the field.
About the teacher
Colin de la Higuera got his PhD at Bordeaux University, France in 1989. He has been associate Professor at the University of Montpellier,
Professor at Saint-Etienne University and is now Professor at Saint-Etienne University.
He has been involved in a number of research themes, including algorithmics, formal language theory, pattern recognition.
His chief interest lies in grammatical inference, a field in which he has been active in the last 10 years.
He is the author of a number of research papers in this fiels and of a monograph, “Grammatical Inference: Learning Automata and Grammars”, published in 2010.
He has developed algorithms, studied learning models and has been trying to link classical formal language frameworks (using the Chomsky hierarchy) with alternative ways of
defining languages, inspired by linguistic considerations or techniques developed in pattern recognition.
He has been chairman of the International Community in Grammatical Inference (2002-2007) and is now in charge of the curriculum development programme in the European Network PASCAL 2.
The course is given in Zadar, Croatia, from the 23rd till the 27th of August 2010.
Outline of the course and material
- Introduction. (slides, pdf) Motivations and applications. One extensive example. Convergence and complexity issues.
- Learning from an informant. (slides, pdf) Complexity issues. State merging algorithms. Gold's algorithm. RPNI. EDSM.
- Learning from text. (slides, pdf) Identification from text is difficult. Different negative results. Learning k-testable machines and k-reversible languages.
- Learning probabilistic finite automata. (slides, pdf) What are probabilistic finie automata? How to use them.
Distances between distributions. Evaluation issues. Algorithms Alergia, DSAI and MDI.
- Learning context-free grammars. (slides, pdf) Why the problem is complicated. A presentation of a number of algorithms and heuristics.
- Active Learning. (slides, pdf) Learning with queries. Negative results. Learning DFA from membershp and equivalence queries. The Zulu competition.
- Further questions and topics. About learning tree automata, tree grammars and transducers. Open questions.
Things you might want to read before
Grammatical inference is going to use techniques and results from machine learning
in order to learn formal grammars or finite automata from strings. An important issue is that of building algorithms and of measuring the complexity of these algorithms.
Therefore, it is supposed that students attending this course will have followed some sort of courses in:
- Formal language theory. Finite automata, regular languages, context free grammars.
- Algorithms and complexity. Having some knowledge on algorithms for strings would be helpful
- Probabilities and statistics. An elementary course is needed. Even if we will not be dealing with difficult statistics, the fact that we will be talking about testing, or
about distributions or that the objects we will want to learn (HMMS or probabilistic automata) all have to do with probabilities means something should be known
If you want to go further with the topic
- Some links
- The grammatical inference webpage. Sadly, updates have to be made...
- Zulu competition webpage. The Zulu query learning competition took place during 2010. The results have been published
in July 2010 but the website is open for anyone who wants to try to do better than the field!
- ICGI webpage. ICGI (the International Colloquium on Grammatical Inference) is the
place you want to go to in order to meet the community and see what people are working on. The next edition is taking place in September 2010 in the town of Valencia, Spain.
- Videolectures webpage for Colin de la Higuera and for the filmed edition
of ICGI 2008.
There is a tutorial from 2005 and a recent tutorial (3 hours, July 2010).
You can also find a number of interesting talks from the 2008 edition of ICGI, which is the workshop for grammatical inference specialists.
- Some things to read
- The book Grammatical inference: learning automata and grammars and
the book webpage
- Papers from the different ICGI conferences.
- Some links to bibliography lists
- If you want to do your research on the topic, perhaps a PhD, consider sending me a mail!