"We think in generalities, but we live in details." (Alfred North Whitehead)
Natural language is ambiguous, and systems that process natural language must include components that resolve ambiguity. Probability theory is the theory of uncertainty and ambiguity, so I build probabilistic models for NLP systems. For modeling, I often use the EM algorithm, a statistical algorithm that can reveal, and thus exploit, hidden information or hidden structure in its input data. More specifically, I am interested in two tasks that often complement one another: (i) supervised and unsupervised probabilistic modeling of natural-language grammars, and (ii) induction of lexical semantic information.

Right now, I am working as a visiting professor at the Department of Computational Linguistics at the University of Heidelberg. Previously, I carried out my research within the following projects:

June/July 2004

Guest researcher in a six-week summer project:
Automatic and Semi-Automatic Treebank Refinement.
IMS, University of Stuttgart.
[Other people involved: Helmut Schmid (IMS) and Tylman Ule (SfS, University of Tübingen).]

Oct 2003 - Sep 2006

Postdoctoral researcher in Khalil Sima'an's NWO project:
Learning Stochastic Tree-grammars from Treebanks.
ILLC, University of Amsterdam.
[Other people involved: Remko Scha, Andreas Zollmann.]

Jan 2003 - Sep 2003

Postdoctoral researcher in Maarten de Rijke's NWO project:
Computing with Meaning.
ILLC, University of Amsterdam.
[People I worked with: Gabriel Infante-Lopez, Valentin Jijkoun, Gilad Mishne, Karin Müller, Khalil Sima'an.]
[I also worked on an EM-based clustering model for semantic role labeling at that time.
Other people involved: Ulrike Baldewein, Katrin Erk, Sebastian Pado (all at CoLi, University of Saarland).]

2nd June 2002

Co-organiser of an international workshop:
Beyond PARSEVAL - Towards Improved Evaluation Measures for Parsing Systems.
LREC 2002, Las Palmas.
[Other co-organisers: John Carroll, Anette Frank, Dekang Lin, Hans Uszkoreit.]

Jan 2001 - Dec 2002

Senior researcher in Hans Uszkoreit's EU project:
MUCHMORE - Cross-lingual Information Retrieval for the Medical Domain.
DFKI GmbH, Saarbrücken.
[Other DFKI people involved: Paul Buitelaar, Diana Raileanu, Bogdan Sacaleanu, Spela Vintar.]
[I also worked on a German treebank grammar, as well as on a PCFG approximation for an English HPSG at that time.
Other people involved in the treebank stuff: Sisay Fissaha, Karin Müller (both at CoLi, University of Saarland).
Other people involved in the approximation stuff: Bernd Kiefer, Hans-Ulrich Krieger (both at DFKI).]

Jun 1997 - Dec 2000

Researcher in Mats Rooth's meta project GRAMOTRON, as well as in two real-world projects:
 ♦ SFB340 B7 - Statistical Grammar Models and Lexicon Acquisition. (DFG project, from Jan 1999)
 ♦ SPARKLE - Shallow Parsing and Knowledge Extraction. (EU project, until Dec 1998)
IMS, University of Stuttgart.
[Other GRAMOTRON people: Franz Beil, Glenn Carroll, Marc Light, Stefan Riezler, Helmut Schmid, Sabine Schulte im Walde.]
Researcher in Christian Rohrer's package of the BMBF project Verbmobil:
AP3.4 - Resolving translation ambiguities in a mobile translation system.
IMS, University of Stuttgart.
[Other AP3.4 people: Andreas Eisele, Michael Schiehlen.]
[I also worked on a probabilistic grapheme-to-phoneme (G2P) converter, as well as on a statistical model for a German LFG at that time.
Other people involved in the G2P stuff: Bernd Möbius, Karin Müller (both at IMS).
Other people involved in the LFG stuff: Jonas Kuhn, Stefan Riezler (both at IMS), Mark Johnson (Brown University, Providence).]
