Statistical Machine Translation (SMT) was the dominant approach to online translation until 2015. Neural Machine Translation (NMT) has since replaced it as the dominant approach.

Neural Machine Translation (NMT) is a new paradigm in data-driven machine translation. Previous-generation Statistical Machine Translation (SMT) systems are built from a collection of heuristic sub-models, typically combined in a log-linear model with a small number of parameters. In NMT, the entire translation process is instead posed as an end-to-end supervised classification problem whose training data consists of pairs of sentences. SMT systems first compute word alignments, fix them, and then estimate their various sub-models from the word-aligned data; NMT dispenses with fixed word alignments and handles the full sequence-to-sequence task in a single model.
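The contrast can be made concrete with the standard formulations covered in the seminar. An SMT system scores a candidate translation $e$ of a source sentence $f$ by combining a small set of feature functions $h_m$ (translation model, language model, and so on) with weights $\lambda_m$ in a log-linear model, whereas NMT models the conditional probability of the target sentence directly with a single parameterized network (the notation below is the conventional one, not taken from the course materials):

```latex
% SMT: log-linear combination of M feature functions,
% weights lambda_m tuned e.g. by Minimum Error Rate Training
\hat{e} = \operatorname*{arg\,max}_{e} \sum_{m=1}^{M} \lambda_m \, h_m(e, f)

% NMT: one end-to-end model with parameters theta,
% predicting the target sentence word by word
p(e \mid f; \theta) = \prod_{t=1}^{T} p(e_t \mid e_{<t}, f; \theta)
```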

Content:

The seminar will begin with the basics of Statistical Machine Translation and then briefly introduce Deep Learning before covering the basics of Neural Machine Translation.

Goals:

The goal of the seminar is to understand the basics of SMT and NMT. A particular focus of study will be the varying role of the lexicon (and its representations) in these two approaches.

Email Address: SubstituteMyLastName@cis.uni-muenchen.de

Room U139, Tuesdays, 16:00 to 18:00 (c.t.)

| Date | Topic | Reading (DO BEFORE THE MEETING!) | Slides |
|------|-------|----------------------------------|--------|
| October 18th | Introduction to Statistical Machine Translation | | ppt pdf |
| October 25th | Bitext alignment (extracting lexical knowledge from parallel corpora) | | ppt pdf |
| November 8th | Many-to-many alignments and the phrase-based model | | ppt pdf |
| November 15th | Log-linear model and Minimum Error Rate Training (Referat) | | ppt pdf (Fraser; Braune/Huck) |
| November 22nd | Decoding (guest lecture by Tsuyoshi Okita) | | |
| November 29th | Introduction to Linear Models | | pptx pdf |
| December 6th | Neural Networks (and Word Embeddings), Fabienne Braune | | |
| December 13th | Recurrent Neural Networks, Tsuyoshi Okita | | |
| December 20th | SMT: Advanced Word Alignment, Morphology, Syntax | | ppt pdf |
| January 24th | Neural Machine Translation, Matthias Huck | | |

Presentation topics (Referate), listed as name: topic

| Date | Topic | Materials | Term paper (Hausarbeit) received |
|------|-------|-----------|----------------------------------|
| January 10th | Palchik: Word-Sense Disambiguation and WSD for SMT | | yes |
| January 10th | Deck: Computer-Aided Translation | | yes |
| January 17th | Bilan: Cross-Lingual Lexical Substitution | | yes |
| January 17th | Sedinkina: Wikification of Ambiguous Entities | | yes |
| January 24th | See above | | |
| January 31st | Poerner: System Combination | | yes |
| January 31st | Krachenfels: Neural Parsing with Gated Recursive Convolutional Networks | | yes |

**Literature**:

Philipp Koehn's book *Statistical Machine Translation*

Kevin Knight's tutorial on SMT (in particular, look at IBM Model 1)