The Poio Corpus is a freely available collection of language resources for lesser-used languages. The data is extracted from free sources such as Wikipedia, dictionaries, documents, and websites.
The official Poio Corpus website is: https://www.poio.eu
Poio Corpus is part of the Poio project: https://github.com/Poio-NLP
The documentation site of Poio is here:
The Poio Corpus source code is distributed under the Apache 2.0 License.