Commit
Switch LGPL'd chardet for MIT licensed charset_normalizer (#5797)
Using the (non-vendored) chardet library is fine for requests itself, but an LGPL dependency makes the story a lot less clear for downstream projects, particularly ones that might like to bundle requests (and thus chardet) into a single binary; think of something similar to what docker-compose is doing. Once an LGPL'd module is included, it is no longer clear whether the resulting artefact must also be LGPL'd.

By changing out this dependency for one under MIT we remove all license ambiguity.

As an "escape hatch" I have made the code use chardet first if it is installed, but we no longer depend upon it directly, although there is a new extra added, `requests[lgpl]`. This should minimize the impact on users and give them a way back if charset_normalizer turns out to be not as good. (In my non-exhaustive tests it detected the same encoding as chardet in every case I threw at it.)

Co-authored-by: Jarek Potiuk <[email protected]>
Co-authored-by: Jarek Potiuk <[email protected]>
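The "use chardet first if it is installed" behaviour described above is a preferred-then-fallback import. A minimal sketch of that pattern follows; `first_importable` is a hypothetical helper written for illustration and is not part of requests, which uses a plain `try`/`except ImportError` around the two imports.

```python
import importlib


def first_importable(candidates):
    """Return the first module named in `candidates` that can be imported.

    Hypothetical helper for illustration only.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    raise ImportError("none of %r could be imported" % (candidates,))


if __name__ == "__main__":
    # Prefer chardet when the user has installed it; otherwise fall back
    # to charset_normalizer. Either one may be missing on a given system.
    try:
        detector = first_importable(("chardet", "charset_normalizer"))
    except ImportError:
        print("no encoding detector installed")
    else:
        print(detector.__name__, detector.detect(b"hello world"))
```

Both libraries expose a module-level `detect()` function, which is why the rest of the code can keep calling `chardet.detect(...)` regardless of which one was imported.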
Showing 10 changed files with 119 additions and 27 deletions.
@@ -23,6 +23,12 @@ env/
 
 .workon
 
+# in case you work with IntelliJ/PyCharm
+.idea
+*.iml
+.python-version
+
+
 t.py
 
 t2.py
@@ -1,14 +1,26 @@
 import sys
 
+try:
+    import chardet
+except ImportError:
+    import charset_normalizer as chardet
+    import warnings
+
+    warnings.filterwarnings('ignore', 'Trying to detect', module='charset_normalizer')
+
 # This code exists for backwards compatibility reasons.
 # I don't like it either. Just look the other way. :)
 
-for package in ('urllib3', 'idna', 'chardet'):
+for package in ('urllib3', 'idna'):
     locals()[package] = __import__(package)
     # This traversal is apparently necessary such that the identities are
     # preserved (requests.packages.urllib3.* is urllib3.*)
     for mod in list(sys.modules):
         if mod == package or mod.startswith(package + '.'):
             sys.modules['requests.packages.' + mod] = sys.modules[mod]
 
+target = chardet.__name__
+for mod in list(sys.modules):
+    if mod == target or mod.startswith(target + '.'):
+        sys.modules['requests.packages.' + target.replace(target, 'chardet')] = sys.modules[mod]
 # Kinda cool, though, right?
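One detail in the last added loop is worth a second look: `target.replace(target, 'chardet')` always evaluates to `'chardet'`, so every matching module is written to the single key `requests.packages.chardet` and submodule paths are not preserved. A minimal sketch of an alternative that rewrites only the leading package name is shown below. It is not the code in this commit; `alias_package`, the `demo.packages.` prefix, and the `fake_normalizer` stand-in modules are all hypothetical names chosen so the example runs without chardet or charset_normalizer installed.

```python
import sys
import types


def alias_package(target, alias, prefix="requests.packages."):
    """Register `target` and its submodules in sys.modules under `alias`.

    Hypothetical helper: replaces only the leading package name, so that
    `target.sub` becomes `prefix + alias + '.sub'`.
    """
    for mod in list(sys.modules):
        if mod == target or mod.startswith(target + "."):
            aliased = alias + mod[len(target):]
            sys.modules[prefix + aliased] = sys.modules[mod]


# Stand-in modules so the sketch does not need any third-party package.
sys.modules["fake_normalizer"] = types.ModuleType("fake_normalizer")
sys.modules["fake_normalizer.api"] = types.ModuleType("fake_normalizer.api")

# A demo prefix keeps the example away from the real requests.packages names.
alias_package("fake_normalizer", "chardet", prefix="demo.packages.")
```

After the call, `sys.modules["demo.packages.chardet"]` and `sys.modules["demo.packages.chardet.api"]` refer to the same objects as the two stand-in modules, which is the identity-preserving property the original comment about `requests.packages.urllib3.*` describes.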
@@ -1,7 +1,18 @@
 [tox]
-envlist = py27,py35,py36,py37,py38
+envlist = py{27,35,36,37,38}-{default,use_chardet_on_py3}
 
 [testenv]
-
+deps = -rrequirements-dev.txt
+extras =
+    security
+    socks
 commands =
-    python setup.py test
+    pytest tests
+
+[testenv:default]
+
+[testenv:use_chardet_on_py3]
+extras =
+    security
+    socks
+    use_chardet_on_py3
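
The new `envlist` uses tox's generative syntax, where each brace group contributes one factor and the groups combine as a Cartesian product. The sketch below expands a single spec to show which environment names result. `expand_envlist` is a hypothetical, simplified helper, not tox's own parser: it assumes one spec with no nested braces and no whitespace handling.

```python
import itertools
import re


def expand_envlist(spec):
    """Expand one generative env spec such as 'py{27,35}-{a,b}'.

    Simplified illustration; real tox parsing handles more cases.
    """
    # re.split with a capture group alternates literal text (even indices)
    # and the contents of each brace group (odd indices).
    parts = re.split(r"\{([^{}]*)\}", spec)
    choices = [
        part.split(",") if i % 2 else [part]
        for i, part in enumerate(parts)
    ]
    return ["".join(combo) for combo in itertools.product(*choices)]


envs = expand_envlist("py{27,35,36,37,38}-{default,use_chardet_on_py3}")
```

For the spec in this diff, `envs` holds ten names, from `py27-default` through `py38-use_chardet_on_py3`: five interpreters times two dependency variants.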