Intensive Introduction to Neural Machine Translation

Summary

Neural Machine Translation (NMT) is a new paradigm in data-driven machine translation. Previous-generation Statistical Machine Translation (SMT) systems are built from a collection of heuristic sub-models, typically combined in a log-linear model with a small number of parameters. In NMT, the entire translation process is instead posed as an end-to-end supervised learning problem, where the training data consists of pairs of sentences. Whereas an SMT system first computes a word alignment, fixes it, and then estimates its various sub-models from the word-aligned data, NMT uses no fixed word alignments: a single model handles the full sequence-to-sequence task.
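The end-to-end sequence-to-sequence idea can be sketched in a few lines of NumPy: an encoder RNN compresses the source sentence into a single vector, and a decoder RNN then emits target words one at a time, conditioned on that vector. All names, sizes, and parameters below are illustrative toy values; a real system (e.g. Sutskever et al. 2014, on the schedule below) learns the parameters end-to-end by backpropagation on sentence pairs.

```python
import numpy as np

rng = np.random.default_rng(0)
V_src, V_tgt, H = 10, 12, 8             # toy vocabulary and hidden sizes

# Randomly initialised parameters (in practice, learned from sentence pairs).
E_src = rng.normal(0, 0.1, (V_src, H))  # source word embeddings
E_tgt = rng.normal(0, 0.1, (V_tgt, H))  # target word embeddings
W_enc = rng.normal(0, 0.1, (H, H))      # encoder recurrence
W_dec = rng.normal(0, 0.1, (H, H))      # decoder recurrence
W_out = rng.normal(0, 0.1, (H, V_tgt))  # hidden state -> target-vocab logits

def encode(src_ids):
    """Run a simple tanh RNN over the source word ids; return the final state."""
    h = np.zeros(H)
    for i in src_ids:
        h = np.tanh(E_src[i] + W_enc @ h)
    return h

def decode_step(h, prev_id):
    """One decoder step: update the state, return it and the next-word distribution."""
    h = np.tanh(E_tgt[prev_id] + W_dec @ h)
    logits = h @ W_out
    p = np.exp(logits - logits.max())   # softmax over the target vocabulary
    return h, p / p.sum()

# Greedy decoding of a toy "sentence" of word ids, using id 0 as a start symbol.
h = encode([3, 1, 4])
out, prev = [], 0
for _ in range(5):
    h, p = decode_step(h, prev)
    prev = int(p.argmax())
    out.append(prev)
print(out)  # five target word ids (the model is untrained, so they are arbitrary)
```

Note that every decoding step is itself a classification over the target vocabulary, which is why the softmax over a very large vocabulary becomes the bottleneck addressed by Jean et al. (2015) below.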

The course will work backwards from the current state of the art in NMT: the ensemble system submitted by the Bengio group in Montreal to the 2015 shared task on machine translation (Jean et al. 2015, see the schedule below; some additional details are still to be published). Depending on the participants' backgrounds, some basics of SMT may also be covered.

Instructors

Alexander Fraser

Email Address: SubstituteLastName@cis.uni-muenchen.de

CIS, LMU Munich


Ryan Cotterell

Email Address: SubstituteFirstName.SubstituteLastName@gmail.com

JHU and LMU

Schedule

Tuesdays, 15:00 s.t., in room C105 (CIS Besprechungsraum).

August 11th, 2015 Concluding discussion, plans for next semester
August 4th, 2015 Gers, Felix A., Jürgen Schmidhuber, and Fred Cummins (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation. (continued) ftp://ftp.idsia.ch/pub/juergen/FgGates-NC.pdf
July 28th, 2015 Gers, Felix A., Jürgen Schmidhuber, and Fred Cummins (2000). Learning to Forget: Continual Prediction with LSTM. Neural Computation. ftp://ftp.idsia.ch/pub/juergen/FgGates-NC.pdf
July 21st, 2015 Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le (2014). Sequence to Sequence Learning with Neural Networks. Advances in Neural Information Processing Systems. http://arxiv.org/abs/1409.3215
July 14th, 2015 Gulcehre, Caglar, Orhan Firat, Kelvin Xu, Kyunghyun Cho, Loic Barrault, Huei-Chi Lin, Fethi Bougares, Holger Schwenk, Yoshua Bengio (2015). On Using Monolingual Corpora in Neural Machine Translation. http://arxiv.org/abs/1503.03535
July 7th, 2015 Bahdanau, Dzmitry, Kyunghyun Cho, Yoshua Bengio (2015). Neural Machine Translation by Jointly Learning to Align and Translate. ICLR. http://arxiv.org/abs/1409.0473
June 30th, 2015 Jean, Sébastien, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio (2015). On Using Very Large Target Vocabulary for Neural Machine Translation. http://arxiv.org/abs/1412.2007
June 23rd, 2015 Introduction to Neural Machine Translation
June 16th, 2015 Organizational Meeting


Further literature:

An NMT reading list and a short list of LSTM papers recommended by David Kaumanns are also available.