
Commit

Merge pull request #80 from mozilla/whitespace
common.py --- collapse whitespace for all langs
JRMeyer authored Jan 31, 2019
2 parents b0cbeb8 + ba2b52e commit 0eafd2a
Showing 1 changed file with 2 additions and 0 deletions.
2 changes: 2 additions & 0 deletions src/corporacreator/preprocessors/common.py
@@ -75,5 +75,7 @@ def common(sentence):
     sentence = _strip_tags(sentence)
     # Remove non-printable characters
     sentence = _strip_string(sentence)
+    # collapse all whitespace and replace with single space
+    sentence = (' ').join(sentence.split())
     # TODO: Clean up data in a language independent manner
     return sentence
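
The behavior of the added line can be checked in isolation. The sketch below is illustrative only: `collapse_whitespace` is a hypothetical standalone helper, not a function in `common.py`, and it assumes Python 3, where `str.split()` with no arguments splits on any run of Unicode whitespace and discards leading and trailing whitespace.

```python
def collapse_whitespace(sentence: str) -> str:
    """Hypothetical helper mirroring the line added in this commit.

    str.split() with no separator splits on runs of whitespace and drops
    empty strings, so joining the pieces with a single space both trims
    the ends and collapses internal runs.
    """
    return ' '.join(sentence.split())


if __name__ == "__main__":
    # Tabs, newlines, and repeated spaces all become one ASCII space.
    print(repr(collapse_whitespace("  hello \t\n world  ")))
    # Non-ASCII whitespace such as the ideographic space (U+3000) is
    # also treated as a separator in Python 3.
    print(repr(collapse_whitespace("你好\u3000\u3000世界")))
    # A whitespace-only sentence collapses to the empty string.
    print(repr(collapse_whitespace(" \t \n ")))
```

Because the split relies on Python's own definition of whitespace rather than a per-language character list, the same line applies to every locale that passes through `common`, which fits the stated intent of the commit message.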
