Fixed #14 (Some sentences contain URL encoded text)
kdavis-mozilla committed Dec 13, 2018
1 parent d6cae0f commit a729459
Showing 1 changed file with 4 additions and 0 deletions.
4 changes: 4 additions & 0 deletions src/corporacreator/preprocessors/common.py
@@ -1,3 +1,5 @@
from urllib.parse import unquote

def common(sentence):
    """Cleans up the passed sentence in a language independent manner, removing or reformatting invalid data.
@@ -7,5 +9,7 @@ def common(sentence):
    Returns:
        (str): Cleaned up sentence.
    """
    # Decode any URL encoded elements of sentence
    sentence = unquote(sentence)
    # TODO: Clean up data in a language independent manner
    return sentence
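
For illustration, here is a minimal sketch of what the added `unquote` call does to sentences containing percent-encoded text. The sample sentences are made up for this example and are not taken from the corpus or from issue #14. The function body is reduced to the one line this commit adds.

```python
from urllib.parse import unquote


def common(sentence):
    """Decode any URL encoded (percent-encoded) elements of sentence."""
    return unquote(sentence)


# Hypothetical sample inputs, not taken from the dataset.
samples = [
    "Hello%20world",           # %20 is an encoded space
    "caf%C3%A9",               # %C3%A9 is the UTF-8 encoding of "é"
    "plain text, no escapes",  # nothing to decode, returned unchanged
    "100% sure",               # "%" not followed by two hex digits is left as-is
    "a+b",                     # unquote (unlike unquote_plus) keeps "+" literal
]

for s in samples:
    print(repr(s), "->", repr(common(s)))
```

One caveat, stated as an observation about `unquote` rather than about this project's data: a literal `%` that happens to be followed by two hexadecimal digits is also decoded, so a sentence that legitimately contains such a sequence would be altered.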
