RNNTagger - a Neural Part-of-Speech Tagger


RNNTagger is a tool for annotating text with part-of-speech and lemma information. It comes with pretrained parameter files for many languages. RNNTagger is implemented in Python using the deep learning library PyTorch.

Compared to TreeTagger, the pros of RNNTagger are:

The cons are:


Download

This software is freely available for research, education and evaluation. For commercial and other licenses, please contact the developer via the email address at the bottom of the page.

Please read the license terms before you download the software. By downloading the software, you agree to the terms stated there.

The following steps are required to install RNNTagger on Linux:

Now, you can open a command-line shell, change to the newly created directory RNNTagger, and enter the commands:

> echo "This is a test." > test.txt
> cmd/rnn-tagger-english.sh test.txt

This will produce the output:

This DT this
is VBZ be
a DT a
test NN test
. SENT .
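
For downstream processing, the three-column output shown above (token, part-of-speech tag, lemma) can be read into Python records. The following is a minimal sketch, not part of RNNTagger itself: the names TaggedToken, parse_tagger_output and tag_file are hypothetical, and it assumes that the script writes its result to standard output, that columns are separated by whitespace, and that the script is started from inside the RNNTagger directory.

```python
import subprocess
from typing import List, NamedTuple


class TaggedToken(NamedTuple):
    """One output line: the token, its part-of-speech tag, and its lemma."""
    token: str
    tag: str
    lemma: str


def parse_tagger_output(text: str) -> List[TaggedToken]:
    """Parse lines of the form 'token tag lemma' into TaggedToken records.

    Assumption: columns are separated by whitespace (tabs or spaces).
    Blank lines are skipped; any other line must have exactly three columns.
    """
    tokens = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 3:
            raise ValueError(f"expected 3 columns, got {len(fields)}: {line!r}")
        tokens.append(TaggedToken(*fields))
    return tokens


def tag_file(path: str, script: str = "cmd/rnn-tagger-english.sh") -> List[TaggedToken]:
    """Run the tagger script on a text file and parse what it prints.

    Assumptions: the current working directory is the RNNTagger directory,
    and the script prints the tagged text to standard output.
    """
    result = subprocess.run(
        [script, path], capture_output=True, text=True, check=True
    )
    return parse_tagger_output(result.stdout)


if __name__ == "__main__":
    # Sample lines copied from the example output; no tagger run is needed here.
    sample = "This DT this\nis VBZ be\ntest NN test\n"
    for tok in parse_tagger_output(sample):
        print(tok.token, tok.tag, tok.lemma)
```

With RNNTagger installed, calling tag_file("test.txt") from the RNNTagger directory would return one record per output line. Passing a different script path selects a different language, provided that script exists in the cmd directory.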


Currently supported modern languages: Bulgarian, Catalan, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hindi, Hungarian, Icelandic, Italian, Korean, Norwegian, Polish, Portuguese, Romanian, Russian, Slovak, Slovenian, Spanish, Swahili

Currently supported ancient languages: Latin, Middle Dutch, Middle English, Middle High German, Old French, Old Greek, Old Icelandic, Old Italian

The tagger package contains a README file with further information on the tagger and the parameter files.


Please send questions, comments, suggestions and bug reports to Helmut Schmid at LastName@cis.lmu.de.