Seminar in Neural Machine Translation

NEW: Advanced NMT is offered in SS 2016, click here.


Neural Machine Translation (NMT) is a new paradigm in data-driven machine translation. Previous-generation Statistical Machine Translation (SMT) systems are built from a collection of heuristic models, typically combined in a log-linear model with a small number of parameters. In NMT, the entire translation process is posed as an end-to-end supervised classification problem, where the training data consists of sentence pairs. In SMT systems, word alignment is carried out first and then fixed, and the various sub-models are estimated from the word-aligned data. NMT does not use fixed word alignments; instead, the full sequence-to-sequence task is handled in a single model.
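As a rough illustration of this contrast, the two formulations can be written side by side. The notation is an assumption made for this sketch and does not come from the seminar materials: f is the source sentence, e the target sentence, h_m are the SMT feature functions with weights \lambda_m, \theta are the NMT parameters, and D is the set of training sentence pairs.

```latex
% SMT: decode with a log-linear combination of M separately estimated
% feature functions h_m (translation model, language model, ...),
% where only the few weights \lambda_m are tuned jointly.
\hat{e} = \arg\max_{e} \sum_{m=1}^{M} \lambda_m \, h_m(e, f)

% NMT: one model defines the conditional probability of the target
% sentence word by word; each factor is a classification over the
% target vocabulary.
p(e \mid f; \theta) = \prod_{t=1}^{T} p(e_t \mid e_{<t}, f; \theta)

% NMT training: maximize the log-likelihood of the sentence pairs,
% with no fixed word alignment as an intermediate step.
\hat{\theta} = \arg\max_{\theta} \sum_{(f, e) \in D} \log p(e \mid f; \theta)
```

In the first expression the feature functions are typically estimated separately from word-aligned data, whereas in the second and third all parameters are learned jointly from the sentence pairs.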

Here is a link to last semester's seminar.

NEW: David Kaumanns is also organizing a Munich interest group for Deep Learning, which has an associated mailing list. See the link here:


Alexander Fraser

Email Address:

CIS, LMU Munich


NEW TIME: 14:30 (was 14:00 before)!!!

Thursdays 14:30 s.t., location is C105 (CIS Besprechungsraum).

Click here for directions to CIS.

If this page appears to be out of date, use your browser's refresh button.

Date | Paper | Links | Discussion Leader
Thursday, November 5th | Y Bengio, R Ducharme, P Vincent (2003). A neural probabilistic language model. Journal of Machine Learning Research 3, 1137-1155 | pdf | Helmut Schmid
Thursday, November 12th | Sundermeyer, M.; Schlüter, R. & Ney, H. (2012). LSTM Neural Networks for Language Modeling. Interspeech | pdf | David Kaumanns
Thursday, November 19th | Graves, Alex (2014). Generating Sequences With Recurrent Neural Networks. Neural and Evolutionary Computing | link | Alex Fraser
Thursday, November 26th | Kalchbrenner, Nal, Phil Blunsom (2013). Recurrent Continuous Translation Models. EMNLP. | pdf | Usama Yaseen
Thursday, December 3rd | Kyunghyun Cho, Bart van Merrienboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio (2014). Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation. EMNLP | link | Ales Tamchyna
Thursday, December 10th | Sutskever, Ilya, Oriol Vinyals, and Quoc V Le (2014). Sequence to sequence learning with neural networks. Advances in Neural Information Processing Systems. | link | Stefan Gerdjikov
Thursday, December 17th | Presentation on Exploding Gradient (no reading but see: Hochreiter, Schmidhuber: Long Short-Term Memory, Neural Computation 9(8):1735-1780, 1997. Sections 3 and 4) | | Christian Meyer
Thursday, January 14th | Bahdanau, Dzmitry, Kyunghyun Cho, Yoshua Bengio (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. | link | Helmut Schmid
Thursday, January 28th | Yaming Sun et al (2015). Modeling Mention, Context and Entity with Neural Networks for Entity Disambiguation. IJCAI. | pdf | Yadollah Yaghoobzadeh
Thursday, February 4th | Jean, Sébastien, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio (2015). On Using Very Large Target Vocabulary for Neural Machine Translation. | link | Tsuyoshi Okita
Thursday, February 18th | Gulcehre, Caglar, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio (2015). On Using Monolingual Corpora in Neural Machine Translation. | link | Ben Roth
Thursday, March 3rd | Minh-Thang Luong and Christopher D. Manning. Stanford Neural Machine Translation Systems for Spoken Language Domain. IWSLT 2015 shared task. | paper, slides | Alex Fraser

Further literature:

Please click here for an NMT reading list, but also see the more general RNN reading list here (scroll down).