Deep Munich is a collaborative group of Deep Learning and Neural Network researchers in Munich. Our members represent:

Faculty for Informatics at TU (http://www.in.tum.de)
Institute for Informatics at LMU (https://www.ifi.lmu.de)
Center for Information and Language Processing (CIS) at LMU (http://www.cis.lmu.de)

Ask questions and discuss ideas in our forum: https://groups.google.com/forum/#!forum/deep-munich

Join us in our weekly meeting (see below).

Members

For questions, suggestions etc., please contact the group admin.

Meetups

Seminar in Neural Machine Translation

http://www.cis.uni-muenchen.de/~fraser/nmt_seminar_2015_WS/

Thursdays 14:30 s.t., room C105 (directions)

Center for Information and Language Processing
University of Munich
Oettingenstraße 67
80538 Munich

Resources

Andrej Karpathy’s blog, notably:
- The Unreasonable Effectiveness of Recurrent Neural Networks

Reading list

2015

Character-aware neural language models - Yoon Kim et al. - 2015 [1]
LSTM: A search space odyssey - Klaus Greff et al. - 2015 [2]
An empirical exploration of recurrent network architectures - Rafal Jozefowicz et al. - 2015 [3]
Teaching machines to read and comprehend - Karl Moritz Hermann et al. - 2015 [4]
Gated feedback recurrent neural networks - Junyoung Chung et al. - 2015 [5]

2014

Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition - Haşim Sak et al. - 2014 [6]
Show and tell: A neural image caption generator - Oriol Vinyals et al. - 2014 [7]
Recurrent neural network regularization - Wojciech Zaremba et al. - 2014 [8]
Modeling compositionality with multiplicative recurrent neural networks - Ozan İrsoy et al. - 2014 [9]
Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch - Xie Chen et al. - 2014 [10]
Learning phrase representations using rnn encoder-decoder for statistical machine translation - Kyunghyun Cho et al. - 2014 [11]
Deep captioning with multimodal recurrent neural networks (m-rNN) - Junhua Mao et al. - 2014 [12]
Learning longer memory in recurrent neural networks - Tomas Mikolov et al. - 2014 [13]
Learning sparse recurrent neural networks in language modeling - Yuanlong Shao - 2014 [14]
Recurrent deep neural networks for robust speech recognition - Chao Weng et al. - 2014 [15]
Recurrent neural network regularization - Wojciech Zaremba et al. - 2014 [8]

2013

On the difficulty of training recurrent neural networks. - Razvan Pascanu et al. - 2013 [16]
Speech recognition with deep recurrent neural networks - Alex Graves et al. - 2013 [17]
Hybrid speech recognition with deep bidirectional LSTM - Alex Graves et al. - 2013 [18]
Generating sequences with recurrent neural networks - Alex Graves - 2013 [19]
High-performance OCR for printed english and fraktur using LSTM networks - Thomas M Breuel et al. - 2013 [20]
Recurrent convolutional neural networks for discourse compositionality - Nal Kalchbrenner et al. - 2013 [21]
Comparison of feedforward and recurrent neural network language models - Martin Sundermeyer et al. - 2013 [22]
RNN language model with word clustering and class-based output layer - Yongzhe Shi et al. - 2013 [23]
Context dependent recurrent neural network language model. - Tomas Mikolov et al. - 2012 [24]

2012

A generalized LSTM-like training algorithm for second-order recurrent neural networks - Derek Monner et al. - 2012 [25]
Long-short term memory neural networks language modeling for handwriting recognition. - Volkmar Frinken et al. - 2012 [26]
LSTM neural networks for language modeling. - Martin Sundermeyer et al. - 2012 [27]

Pre-2012

Generating text with recurrent neural networks - Ilya Sutskever et al. - 2011 [28]
Named entity recognition with long short-term memory - James Hammerton - 2003 [29]
Gradient flow in recurrent nets: The difficulty of learning long-term dependencies - Sepp Hochreiter et al. - 2001 [30]
Long short-term memory - Sepp Hochreiter et al. - 1997 [31]
Learning to forget: Continual prediction with LSTM - Felix A Gers et al. - 2000 [32]
Learning precise timing with LSTM recurrent networks - Felix A Gers et al. - 2003 [33]
Generating sequences with recurrent neural networks - Alex Graves - 2013 [19]
Long short-term memory in recurrent neural networks - Felix Gers - 2001 [34]

Tools

Torch

Torch is an open source machine learning library, a scientific computing framework, and a script language based on the Lua programming language. It provides a wide range of algorithms for deep machine learning, and uses an extremely fast scripting language LuaJIT, and an underlying C implementation. ~ Wikipedia

Code bases for Torch

Multi-layer character-level Recurrent Neural Network: https://github.com/karpathy/char-rnn
- Fork for word-level RNN (a little outdated): https://github.com/Graydyn/char-rnn
RNN module for Torch nn: https://github.com/Element-Research/rnn
Character-Aware Neural Language Models: https://github.com/yoonkim/lstm-char-cnn

General NN Resources

Online textbooks

Video courses

Blogs & Articles

Presentations

Deep Learning Summer School, Montreal 2015

References

[1] Y. Kim, Y. Jernite, D. Sontag, and A. M. Rush, “Character-aware neural language models,” arXiv preprint arXiv:1508.06615, 2015.

[2] K. Greff, R. K. Srivastava, J. Koutník, B. R. Steunebrink, and J. Schmidhuber, “LSTM: A search space odyssey,” arXiv preprint arXiv:1503.04069, 2015.

[3] R. Jozefowicz, W. Zaremba, and I. Sutskever, “An empirical exploration of recurrent network architectures,” in Proceedings of the 32^nd international conference on machine learning, 2015, pp. 2342–2350.

[4] K. M. Hermann, T. Kočisky, E. Grefenstette, L. Espeholt, W. Kay, M. Suleyman, and P. Blunsom, “Teaching machines to read and comprehend,” arXiv preprint arXiv:1506.03340, 2015.

[5] J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, “Gated feedback recurrent neural networks,” arXiv preprint arXiv:1502.02367, 2015.

[6] H. Sak, A. Senior, and F. Beaufays, “Long short-term memory based recurrent neural network architectures for large vocabulary speech recognition,” arXiv preprint arXiv:1402.1128, 2014.

[7] O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, “Show and tell: A neural image caption generator,” arXiv preprint arXiv:1411.4555, 2014.

[8] W. Zaremba, I. Sutskever, and O. Vinyals, “Recurrent neural network regularization,” arXiv preprint arXiv:1409.2329, 2014.

[9] O. İrsoy and C. Cardie, “Modeling compositionality with multiplicative recurrent neural networks,” arXiv preprint arXiv:1412.6577, 2014.

[10] X. Chen, Y. Wang, X. Liu, M. J. Gales, and P. C. Woodland, “Efficient GPU-based training of recurrent neural network language models using spliced sentence bunch,” submitted to Proc. ISCA Interspeech, 2014.

[11] K. Cho, B. van Merrienboer, C. Gulcehre, F. Bougares, H. Schwenk, and Y. Bengio, “Learning phrase representations using rnn encoder-decoder for statistical machine translation,” arXiv preprint arXiv:1406.1078, 2014.

[12] J. Mao, W. Xu, Y. Yang, J. Wang, and A. Yuille, “Deep captioning with multimodal recurrent neural networks (m-rNN),” arXiv preprint arXiv:1412.6632, 2014.

[13] T. Mikolov, A. Joulin, S. Chopra, M. Mathieu, and M. Ranzato, “Learning longer memory in recurrent neural networks,” arXiv preprint arXiv:1412.7753, 2014.

[14] Y. Shao, “Learning sparse recurrent neural networks in language modeling,” PhD thesis, The Ohio State University, 2014.

[15] C. Weng, D. Yu, S. Watanabe, and B.-H. F. Juang, “Recurrent deep neural networks for robust speech recognition,” in Acoustics, speech and signal processing (iCASSP), 2014 iEEE international conference on, 2014, pp. 5532–5536.

[16] R. Pascanu, T. Mikolov, and Y. Bengio, “On the difficulty of training recurrent neural networks.” in ICML (3), 2013, vol. 28, pp. 1310–1318.

[17] A. Graves, A.-r. Mohamed, and G. Hinton, “Speech recognition with deep recurrent neural networks,” arXiv preprint arXiv:1303.5778, 2013.

[18] A. Graves, N. Jaitly, and A.-r. Mohamed, “Hybrid speech recognition with deep bidirectional LSTM,” in Automatic speech recognition and understanding (aSRU), 2013 iEEE workshop on, 2013, pp. 273–278.

[19] A. Graves, “Generating sequences with recurrent neural networks,” arXiv preprint arXiv:1308.0850, 2013.

[20] T. M. Breuel, A. Ul-Hasan, M. A. Al-Azawi, and F. Shafait, “High-performance OCR for printed english and fraktur using LSTM networks,” in Document analysis and recognition (iCDAR), 2013 12^th international conference on, 2013, pp. 683–687.

[21] N. Kalchbrenner and P. Blunsom, “Recurrent convolutional neural networks for discourse compositionality,” arXiv preprint arXiv:1306.3584, 2013.

[22] M. Sundermeyer, I. Oparin, J.-L. Gauvain, B. Freiberg, R. Schluter, and H. Ney, “Comparison of feedforward and recurrent neural network language models,” in Acoustics, speech and signal processing (iCASSP), 2013 iEEE international conference on, 2013, pp. 8430–8434.

[23] Y. Shi, W.-Q. Zhang, J. Liu, and M. T. Johnson, “RNN language model with word clustering and class-based output layer,” EURASIP Journal on Audio, Speech, and Music Processing, vol. 2013, no. 1, pp. 1–7, 2013.

[24] T. Mikolov and G. Zweig, “Context dependent recurrent neural network language model.” in SLT, 2012, pp. 234–239.

[25] D. Monner and J. A. Reggia, “A generalized LSTM-like training algorithm for second-order recurrent neural networks,” Neural Networks, vol. 25, pp. 70–83, 2012.

[26] V. Frinken, F. Zamora-Martínez, S. E. Boquera, M. J. C. Bleda, A. Fischer, and H. Bunke, “Long-short term memory neural networks language modeling for handwriting recognition.” in ICPR, 2012, pp. 701–704.

[27] M. Sundermeyer, R. Schlüter, and H. Ney, “LSTM neural networks for language modeling.” in INTERSPEECH, 2012.

[28] I. Sutskever, J. Martens, and G. E. Hinton, “Generating text with recurrent neural networks,” in Proceedings of the 28th international conference on machine learning (iCML-11), 2011, pp. 1017–1024.

[29] J. Hammerton, “Named entity recognition with long short-term memory,” in Proceedings of coNLL-2003, 2003, pp. 172–175.

[30] S. Hochreiter, Y. Bengio, P. Frasconi, and J. Schmidhuber, “Gradient flow in recurrent nets: The difficulty of learning long-term dependencies.” Citeseer, 2001.

[31] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” vol. 9, no. 8, pp. 1735–1780, 1997.

[32] F. A. Gers, J. Schmidhuber, and F. Cummins, “Learning to forget: Continual prediction with LSTM,” Neural computation, vol. 12, no. 10, pp. 2451–2471, 2000.

[33] F. A. Gers, N. N. Schraudolph, and J. Schmidhuber, “Learning precise timing with LSTM recurrent networks,” The Journal of Machine Learning Research, vol. 3, pp. 115–143, 2003.

[34] F. Gers, “Long short-term memory in recurrent neural networks,” Unpublished PhD dissertation, École Polytechnique Fédérale de Lausanne, Lausanne, Switzerland, 2001.