An evaluation of different NLP toolkits, with a focus on processing German texts.
- NLTK, Python
- Stanford CoreNLP, Java + many bindings (GPL 3.0)
- spaCy, Python
- TextBlob, Python
- Pattern, Python 2
- MBSP, Python 2
- Apache OpenNLP, Java (Apache License)
The NLP code examples are written in Python and can be executed in interactive Jupyter notebooks.
Basic Python packages and the necessary NLP libraries are provided via a Docker image (https://hub.docker.com/r/jupyter/minimal-notebook/).
The following tasks are covered:
- Sentence Splitting
- POS Tagging
- Named Entity Recognition
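
The three tasks listed above can be sketched with spaCy, one of the toolkits in this evaluation. This is a minimal sketch, not the evaluation code itself. It assumes that spaCy and a German pipeline are installed; the pipeline name `de_core_news_sm` is an assumption and may differ depending on the spaCy version. `analyze` and `main` are hypothetical helper names introduced only for illustration.

```python
def analyze(nlp, text):
    """Run a loaded spaCy pipeline on `text` and collect the results
    for the three tasks: sentence splitting, POS tagging, and NER."""
    doc = nlp(text)
    return {
        # Sentence Splitting: one string per detected sentence
        "sentences": [sent.text for sent in doc.sents],
        # POS Tagging: (token, coarse-grained part-of-speech tag) pairs
        "pos_tags": [(token.text, token.pos_) for token in doc],
        # Named Entity Recognition: (entity text, entity label) pairs
        "entities": [(ent.text, ent.label_) for ent in doc.ents],
    }


def main():
    # Imported here so that `analyze` can be defined without spaCy installed.
    import spacy

    # Assumption: the German pipeline was installed beforehand, e.g. with
    #   python -m spacy download de_core_news_sm
    nlp = spacy.load("de_core_news_sm")

    text = "Angela Merkel wurde in Hamburg geboren. Sie studierte in Leipzig."
    result = analyze(nlp, text)

    for key, values in result.items():
        print(key)
        for value in values:
            print("  ", value)
```

Calling `main()` in a notebook cell prints the sentences, tagged tokens, and entities. Because `analyze` only relies on the `sents`, `ents`, `pos_`, and `label_` attributes, the same function can be reused with any other German spaCy pipeline.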