University of Stuttgart - Machine Translation Reading Group

Thomas Schoenemann - Regularizing Word Alignment


This talk presents improvements to the conditional word alignment models IBM1 and HMM, obtained by adding prior knowledge in the form of regularity terms. We explore $L_0$ and (weighted) $L_1$ norms to address two common defects: the garbage collection problem, and the tendency of words to be aligned to many more distinct words than desired.
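As a rough sketch of what such a regularized objective can look like (the notation is an assumption for illustration and is not taken from the talk): let $p(f \mid e)$ denote the dictionary probabilities, $(\mathbf{f}_s, \mathbf{e}_s)$ the sentence pairs, $\lambda \ge 0$ a hypothetical regularization weight and $w_{e,f} \ge 0$ hypothetical per-entry weights. One plausible form is

$$
\min_{p}\; -\sum_{s} \log p(\mathbf{f}_s \mid \mathbf{e}_s) \;+\; \lambda \sum_{e,f} w_{e,f}\, \lVert p(f \mid e) \rVert_0
\quad \text{s.t.} \quad p(f \mid e) \ge 0,\;\; \sum_{f} p(f \mid e) = 1 \;\; \text{for all } e,
$$

where $\lVert x \rVert_0$ is $1$ if $x \neq 0$ and $0$ otherwise. The weighted $L_1$ variant would replace $\lVert p(f \mid e) \rVert_0$ by $p(f \mid e)$. Note that with uniform weights the $L_1$ term is constant, since each distribution $p(\cdot \mid e)$ sums to one; this is presumably why the $L_1$ norm is weighted.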

The computational methods employed are quite diverse: for the $L_0$ norm, a discrete optimization approach derived from maximum approximations is used. In contrast, the $L_1$-regularized objective is optimized by EM with efficient projections onto simplices.
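To illustrate what a projection onto a simplex involves, here is a minimal sketch of the standard sort-based Euclidean projection onto the probability simplex. It is not necessarily the routine used in the talk; the function names `project_onto_simplex` and `projected_gradient_step` are hypothetical, and the step size and gradient are placeholders.

```python
def project_onto_simplex(v):
    """Euclidean projection of v onto {x : x >= 0, sum(x) = 1}.

    Standard sort-based method, O(n log n). Illustrative sketch only.
    """
    if not v:
        raise ValueError("v must be non-empty")
    u = sorted(v, reverse=True)
    cumsum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumsum += u_k
        t = (cumsum - 1.0) / k
        # The condition holds for a prefix of the sorted values;
        # the last k that satisfies it determines the shift theta.
        if u_k - t > 0:
            theta = t
    # Shift every entry by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


def projected_gradient_step(p, grad, step):
    """One gradient step followed by projection back onto the simplex.

    `grad` is assumed to already include any regularizer term,
    e.g. the weights of a weighted L1 penalty (an assumption here).
    """
    moved = [p_i - step * g_i for p_i, g_i in zip(p, grad)]
    return project_onto_simplex(moved)
```

In an EM-style scheme, a step like `projected_gradient_step` would be applied separately to each distribution $p(\cdot \mid e)$, so that every update remains a valid probability distribution.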


Thomas Schoenemann was born and grew up in Germany. He studied Computer Science at RWTH Aachen, Germany, where he received his diploma in 2005 with a thesis on confidence measures in machine translation, written in the group of Hermann Ney. He then moved to the University of Bonn, Germany, where he worked on his Ph.D. thesis in computer vision from 2006 to 2008. Until the end of March he was a postdoc in the vision group at Lund University, Sweden, where he also resumed his work on translation. He is currently looking for a new group while exploring different fields.

For scheduling information, please see the Stuttgart reading group page.