Deep Munich is a collaborative group of deep learning and neural network researchers in Munich. Our members represent:

- Faculty of Informatics at TUM (http://www.in.tum.de)
- Institute for Informatics at LMU (https://www.ifi.lmu.de)
- Center for Information and Language Processing (CIS) at LMU (http://www.cis.lmu.de)

Ask questions and discuss ideas in our *forum*: https://groups.google.com/forum/#!forum/deep-munich

Sign up for our *mailing list* and stay up-to-date: https://lists.lrz.de/mailman/listinfo/deep

Join us in our *weekly meeting* (see below).

- David Kaumanns (CIS) - group organizer
- Alexander Fraser (CIS)
- Helmut Schmid (CIS)
- Evgeniy Faerman (LMU)
- Yadollah Yaghoobzadeh
- TBA

For questions, suggestions etc., please contact the group admin.

Seminar website: http://www.cis.uni-muenchen.de/~fraser/nmt_seminar_2015_WS/

Thursdays 14:30 s.t., room C105

Center for Information and Language Processing (CIS), University of Munich, Oettingenstraße 67, 80538 Munich

- Character-aware neural language models - Yoon Kim et al. - 2015 [1]
- LSTM: A search space odyssey - Klaus Greff et al. - 2015 [2]
- An empirical exploration of recurrent network architectures - Rafal Jozefowicz et al. - 2015 [3]
- Teaching machines to read and comprehend - Karl Moritz Hermann et al. - 2015 [4]
- Gated feedback recurrent neural networks - Junyoung Chung et al. - 2015 [5]

- Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition - Haşim Sak et al. - 2014 [6]
- Show and tell: A neural image caption generator - Oriol Vinyals et al. - 2014 [7]
- Recurrent neural network regularization - Wojciech Zaremba et al. - 2014 [8]
- Modeling compositionality with multiplicative recurrent neural networks - Ozan İrsoy et al. - 2014 [9]
- Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch - Xie Chen et al. - 2014 [10]
- Learning phrase representations using RNN encoder-decoder for statistical machine translation - Kyunghyun Cho et al. - 2014 [11]
- Deep captioning with multimodal recurrent neural networks (m-RNN) - Junhua Mao et al. - 2014 [12]
- Learning longer memory in recurrent neural networks - Tomas Mikolov et al. - 2014 [13]
- Learning sparse recurrent neural networks in language modeling - Yuanlong Shao - 2014 [14]
- Recurrent deep neural networks for robust speech recognition - Chao Weng et al. - 2014 [15]

- On the difficulty of training recurrent neural networks - Razvan Pascanu et al. - 2013 [16]
- Speech recognition with deep recurrent neural networks - Alex Graves et al. - 2013 [17]
- Hybrid speech recognition with deep bidirectional LSTM - Alex Graves et al. - 2013 [18]
- Generating sequences with recurrent neural networks - Alex Graves - 2013 [19]
- High-performance OCR for printed English and Fraktur using LSTM networks - Thomas M Breuel et al. - 2013 [20]
- Recurrent convolutional neural networks for discourse compositionality - Nal Kalchbrenner et al. - 2013 [21]
- Comparison of feedforward and recurrent neural network language models - Martin Sundermeyer et al. - 2013 [22]
- RNN language model with word clustering and class-based output layer - Yongzhe Shi et al. - 2013 [23]
- Context dependent recurrent neural network language model - Tomas Mikolov et al. - 2012 [24]

- A generalized LSTM-like training algorithm for second-order recurrent neural networks - Derek Monner et al. - 2012 [25]
- Long-short term memory neural networks language modeling for handwriting recognition - Volkmar Frinken et al. - 2012 [26]
- LSTM neural networks for language modeling - Martin Sundermeyer et al. - 2012 [27]

- Generating text with recurrent neural networks - Ilya Sutskever et al. - 2011 [28]
- Named entity recognition with long short-term memory - James Hammerton - 2003 [29]
- Gradient flow in recurrent nets: The difficulty of learning long-term dependencies - Sepp Hochreiter et al. - 2001 [30]
- Long short-term memory - Sepp Hochreiter et al. - 1997 [31]
- Learning to forget: Continual prediction with LSTM - Felix A Gers et al. - 2000 [32]
- Learning precise timing with LSTM recurrent networks - Felix A Gers et al. - 2003 [33]
- Long short-term memory in recurrent neural networks - Felix Gers - 2001 [34]

Torch is an open-source machine learning library, a scientific computing framework, and a scripting language based on the Lua programming language. It provides a wide range of algorithms for deep learning and uses the extremely fast scripting language LuaJIT with an underlying C implementation. ~ Wikipedia
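
To give a first impression of the Torch API, here is a minimal sketch (assuming Torch 7 with the standard `nn` package installed) that builds a small feedforward network and runs one forward pass. The layer sizes are arbitrary illustration values, not a recommendation:

```lua
require 'torch'
require 'nn'

-- A small two-layer network: 10 inputs -> 20 hidden units -> 2 outputs.
local net = nn.Sequential()
net:add(nn.Linear(10, 20))
net:add(nn.Tanh())
net:add(nn.Linear(20, 2))
net:add(nn.LogSoftMax())

local input  = torch.randn(10)     -- a random 10-dimensional input vector
local output = net:forward(input)  -- log-probabilities over the 2 output classes
print(output)
```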

- Lua in 15 minutes
- Getting started with Torch
- Torch 7 Google group
- Deep Learning with Torch: the 60-minute blitz

- Multi-layer character-level Recurrent Neural Network: https://github.com/karpathy/char-rnn
- Fork for word-level RNN (a little outdated): https://github.com/Graydyn/char-rnn

- RNN module for Torch nn: https://github.com/Element-Research/rnn (see the usage sketch after this list)
- Character-Aware Neural Language Models: https://github.com/yoonkim/lstm-char-cnn
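
The Element-Research rnn package extends Torch's `nn` with recurrent modules. Below is a minimal usage sketch, assuming the package is installed (e.g. via `luarocks install rnn`); the sizes, batch dimension, and sequence length are arbitrary illustration values. An `nn.LSTM` is wrapped in an `nn.Sequencer`, which applies it step by step over a table of inputs while carrying the hidden state across time steps:

```lua
require 'rnn'  -- Element-Research rnn package; also loads torch and nn

local inputSize, hiddenSize = 10, 20

-- Sequencer feeds each element of an input table through the LSTM in order,
-- preserving the hidden state between time steps.
local lstm = nn.Sequencer(nn.LSTM(inputSize, hiddenSize))

-- A toy sequence of 5 time steps, each a mini-batch of 3 input vectors.
local sequence = {}
for t = 1, 5 do
  sequence[t] = torch.randn(3, inputSize)
end

local outputs = lstm:forward(sequence)  -- a table of 5 tensors, each 3x20
print(#outputs, outputs[1]:size())
```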

- Neural Networks and Deep Learning (Michael Nielsen)
- Deep Learning (Yoshua Bengio, Ian Goodfellow and Aaron Courville)
- Python Machine Learning Book
- A Tutorial on Deep Learning (Quoc V. Le)

- Neural Networks for Machine Learning, Hinton (Coursera)
- Machine Learning, Andrew Ng (Coursera)
- Machine Learning, Pedro Domingos (Coursera)
- Machine Learning Summer School 2014
- TechTalks from ACL-IJCNLP 2015
- CS224d: Deep Learning for Natural Language Processing

- A brief history of word embeddings
- The AI Revolution: The Road to Superintelligence (Tim Urban)
- Backpropagation Tutorial (Manfred Zabarauskas)
- Thoughts on Machine Learning and Natural Language Processing (Marek Rei)
- Colah’s blog
- LSTM implementation explained (Adam Paszke)
- Deep Learning News
- WildML (Denny Britz)

[1] Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush, “Character-aware neural language models,” *arXiv preprint arXiv:1508.06615*, 2015.

[2] K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, “LSTM: A search space odyssey,” *arXiv preprint arXiv:1503.04069*, 2015.

[3] R. Jozefowicz, W. Zaremba, and I. Sutskever, “An empirical exploration of recurrent network architectures,” in *Proceedings of the 32nd international conference on machine learning*, 2015, pp. 2342–2350.

[4] K. M. Hermann, T. Kočisky, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman, and P. Blunsom, “Teaching machines to read and comprehend,” *arXiv preprint arXiv:1506.03340*, 2015.

[5] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Gated feedback recurrent neural networks,” *arXiv preprint arXiv:1502.02367*, 2015.

[6] H. Sak, A. Senior, and F. Beaufays, “Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition,” *arXiv preprint arXiv:1402.1128*, 2014.

[7] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” *arXiv preprint arXiv:1411.4555*, 2014.

[8] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization,” *arXiv preprint arXiv:1409.2329*, 2014.

[9] O. İrsoy and C. Cardie, “Modeling compositionality with multiplicative recurrent neural networks,” *arXiv preprint arXiv:1412.6577*, 2014.

[10] X. Chen, Y. Wang, X. Liu, M. J. Gales, and P. C. Woodland, “Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch,” in *Proc. Interspeech*, 2014.

[11] K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using RNN encoder-decoder for statistical machine translation,” *arXiv preprint arXiv:1406.1078*, 2014.

[12] J. Mao, W. Xu, Y. Yang, J. Wang, and A. Yuille, “Deep captioning with multimodal recurrent neural networks (m-RNN),” *arXiv preprint arXiv:1412.6632*, 2014.

[13] T. Mikolov, A. Joulin, S. Chopra, M. Mathieu, and M. Ranzato, “Learning longer memory in recurrent neural networks,” *arXiv preprint arXiv:1412.7753*, 2014.

[14] Y. Shao, “Learning sparse recurrent neural networks in language modeling,” PhD thesis, The Ohio State University, 2014.

[15] C. Weng, D. Yu, S. Watanabe, and B.-H. F. Juang, “Recurrent deep neural networks for robust speech recognition,” in *Acoustics, speech and signal processing (ICASSP), 2014 IEEE international conference on*, 2014, pp. 5532–5536.

[16] R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training recurrent neural networks,” in *ICML (3)*, 2013, vol. 28, pp. 1310–1318.

[17] A. Graves, A.-r. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” *arXiv preprint arXiv:1303.5778*, 2013.

[18] A. Graves, N. Jaitly, and A.-r. Mohamed, “Hybrid speech recognition with deep bidirectional LSTM,” in *Automatic speech recognition and understanding (ASRU), 2013 IEEE workshop on*, 2013, pp. 273–278.

[19] A. Graves, “Generating sequences with recurrent neural networks,” *arXiv preprint arXiv:1308.0850*, 2013.

[20] T. M. Breuel, A. Ul-Hasan, M. A. Al-Azawi, and F. Shafait, “High-performance OCR for printed English and Fraktur using LSTM networks,” in *Document analysis and recognition (ICDAR), 2013 12th international conference on*, 2013, pp. 683–687.

[21] N. Kalchbrenner and P. Blunsom, “Recurrent convolutional neural networks for discourse compositionality,” *arXiv preprint arXiv:1306.3584*, 2013.

[22] M. Sundermeyer, I. Oparin, J.-L. Gauvain, B. Freiberg, R. Schluter, and H. Ney, “Comparison of feedforward and recurrent neural network language models,” in *Acoustics, speech and signal processing (ICASSP), 2013 IEEE international conference on*, 2013, pp. 8430–8434.

[23] Y. Shi, W.-Q. Zhang, J. Liu, and M. T. Johnson, “RNN language model with word clustering and class-based output layer,” *EURASIP Journal on Audio, Speech, and Music Processing*, vol. 2013, no. 1, pp. 1–7, 2013.

[24] T. Mikolov and G. Zweig, “Context dependent recurrent neural network language model,” in *SLT*, 2012, pp. 234–239.

[25] D. Monner and J. A. Reggia, “A generalized LSTM-like training algorithm for second-order recurrent neural networks,” *Neural Networks*, vol. 25, pp. 70–83, 2012.

[26] V. Frinken, F. Zamora-Martínez, S. España-Boquera, M. J. Castro-Bleda, A. Fischer, and H. Bunke, “Long-short term memory neural networks language modeling for handwriting recognition,” in *ICPR*, 2012, pp. 701–704.

[27] M. Sundermeyer, R. Schlüter, and H. Ney, “LSTM neural networks for language modeling,” in *INTERSPEECH*, 2012.

[28] I. Sutskever, J. Martens, and G. E. Hinton, “Generating text with recurrent neural networks,” in *Proceedings of the 28th international conference on machine learning (ICML-11)*, 2011, pp. 1017–1024.

[29] J. Hammerton, “Named entity recognition with long short-term memory,” in *Proceedings of CoNLL-2003*, 2003, pp. 172–175.

[30] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, “Gradient flow in recurrent nets: The difficulty of learning long-term dependencies,” in *A field guide to dynamical recurrent neural networks*. IEEE Press, 2001.

[31] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” *Neural computation*, vol. 9, no. 8, pp. 1735–1780, 1997.

[32] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” *Neural computation*, vol. 12, no. 10, pp. 2451–2471, 2000.

[33] F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, “Learning precise timing with LSTM recurrent networks,” *The Journal of Machine Learning Research*, vol. 3, pp. 115–143, 2003.

[34] F. Gers, “Long short-term memory in recurrent neural networks,” PhD thesis, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2001.