All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
- Fix encoding issue from the previous version when reading a file using the CLI.
- Add functionality to normalize a single file using cucco CLI.
- Renamed 'remove_extra_whitespaces' to 'remove_extra_white_spaces'.
- New language argument in remove_stop_words function.
- Lazy loading of the stop words file now always loads the default language specified in the Config class.
- Command line interface.
- Config class to manage cucco configuration and handle normalizations. This class allows loading the normalizations to apply from a YAML file.
- Debug log messages to see what -the cucco- is happening behind the scenes.
- Order of default normalizations to remove extra white spaces after removing punctuation.
- New stop words files. Cucco is hungry for words and can now deal with 50 languages.