[Windows] Could not install packages due to an EnvironmentError: [Errno 42] Illegal byte sequence #5665
Comments
pip/src/pip/_internal/utils/ui.py Line 272 in e818039
So "ascii" is the Unicode IncrementalBar ... 😔 That's another bug, no?
Anyhow, this doesn't happen when you just print the Unicode character; in fact, it works in the Python interpreter. It seems to be triggered by calling

    #!/usr/bin/env python2
    from __future__ import print_function
    import locale
    locale.setlocale(locale.LC_ALL, '')
    print(u' |\u2588')

WAT
Found while debugging pypa#5665
It seems I have stumbled upon "FUN"... 😢 Apparently, msvcrt does charset conversion when writing to its file descriptors, based on the set locale! And it's even special-cased to handle the OEM console code page (You can see this in
When the "C" locale is set, no conversion is done. Python encodes to the OEM code page, and the bytes pass through to the console unscathed. But once you do
Later I found this, which describes it: Why printf can display non-ASCII characters when “C” locale is used?
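The mismatch described above can be illustrated without a Windows console. This is only a rough sketch: it does not reproduce what msvcrt does, and the code pages `cp437` (an OEM code page) and `cp1252` (an ANSI code page) are assumptions chosen for illustration, not the ones from this report. It shows that a byte produced for one code page means something else, or nothing at all, in another.

```python
# -*- coding: utf-8 -*-
# Illustration only: cp437 and cp1252 are assumed code pages, not the
# reporter's actual configuration.
FULL_BLOCK = u"\u2588"  # the fill character printed in the script above

# Encoding for the OEM code page succeeds and yields a single byte.
oem_bytes = FULL_BLOCK.encode("cp437")

# If that byte is later interpreted using a different code page,
# it turns into a different character.
reinterpreted = oem_bytes.decode("cp1252")

# The ANSI code page assumed here has no full-block character at all.
try:
    FULL_BLOCK.encode("cp1252")
    ansi_can_encode = True
except UnicodeEncodeError:
    ansi_can_encode = False

print(repr(oem_bytes), repr(reinterpreted), ansi_can_encode)
```

So a check made against one encoding says little about what happens once a second conversion, using another code page, is applied on the way to the console.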
Some possible workarounds (there may be other or better ways); if any of them looks acceptable, I can try to make a PR:

    import ctypes
    import locale
    import sys

    import six

    # Not accessible you say? 😈
    if sys.platform == "win32" and six.PY2:
        ctypes.pythonapi.PyFile_SetEncoding(ctypes.py_object(sys.stdin), locale.getpreferredencoding(False))
        ctypes.pythonapi.PyFile_SetEncoding(ctypes.py_object(sys.stdout), locale.getpreferredencoding(False))
        ctypes.pythonapi.PyFile_SetEncoding(ctypes.py_object(sys.stderr), locale.getpreferredencoding(False))

    # In _select_progress_class. This makes the check for the characters use the correct
    # encoding. Actually writing anything will still use the wrong encoding, of course.
    # We can also just skip the check and always use ascii if this condition holds.
    if sys.platform == "win32" and six.PY2:
        encoding = locale.getpreferredencoding(False)

    # This should actually make the Unicode chars work. It is an external library that
    # would need to be vendored and initialized before colorama.
    if sys.version_info < (3, 6):
        import win_unicode_console
        win_unicode_console.enable()
So you're saying this is a core Python issue that the Python core devs aren't willing to fix? (FWIW, as a Python core dev myself, I'm in agreement with Steve on the matter). If so, I don't see that there's any point in pip trying to work around it. Does #5671 fix the issue (at least in cases where the user hasn't overridden the default style)? If so, that seems to be a sufficient workaround. I'm -1 on the various workarounds you suggest. Of them all, the only one I'd remotely consider is the use of |
Essentially yes. In my opinion, it's a core issue in the way Python 2.7 (and below, most likely) handles the console code page if you call
#5671 doesn't fix this. It only fixes the option for the user to override pip's automatic choice and use an ASCII-only progress bar. Users will still be required to apply it manually, and placing it in a config file will make it take effect for Python 3 too, which the user likely doesn't want. Without a workaround, pip will be broken by default for many new users on Python 2.7 on Windows (those who use code pages that trigger this; I'm not sure it happens with all of them), which I don't think is particularly nice. Things should really work out of the box. Users shouldn't need to apply workarounds themselves to get things to work with supported Python versions. Python 2.7 is still supported. For now... 😉
One workaround that worked for me: use |
I also get this error on an updated Windows 10 with pip 18.0 and 18.1; I didn't have it in 9.0.1 or 10.0.1. I use
pip install <package_name> -q works around it for me. |
Thanks to @segevfiner for fixing
Other than that, I'm not sure it'd be worth anyone's time to fix a Unicode bug that is Python 2 only. A better option might be to detect "invalid byte sequence" and tell the user, in the error message, that they might want to use
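The detection idea in the comment above could look roughly like the following. This is a minimal sketch, not pip's actual error handling: `hint_for_error` is a hypothetical helper, and the hint text is an assumption that points at `--progress-bar off`, the one option this issue reports as working.

```python
import errno


def hint_for_error(exc):
    """Hypothetical helper: return extra advice for an "Illegal byte sequence" error.

    Returns None when the exception is not an EILSEQ error.
    """
    # errno.EILSEQ is the symbolic name for "Illegal byte sequence";
    # its numeric value differs between platforms.
    if getattr(exc, "errno", None) == errno.EILSEQ:
        # Assumed wording; the option itself is mentioned in this issue.
        return ("The console could not display the progress bar. "
                "Consider re-running with --progress-bar off.")
    return None


# Usage: build an error like the one in the issue title and ask for a hint.
error = EnvironmentError(errno.EILSEQ, "Illegal byte sequence")
print(hint_for_error(error))
```

Comparing against `errno.EILSEQ` rather than a literal number keeps the check independent of the platform's numeric value.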
pip 19.2 has been released with the fix for the progress-bar issue: ascii is now really ascii. :) We've also documented how Python 2 support will be maintained going forward at: https://pip.pypa.io/en/latest/development/release-process/#python-2-support
This issue is marked as "python 2 only". pip 21.0 dropped support for Python 2. Should this be closed? |
I believe so. It is indeed a Python 2 only issue. |
Thanks for the heads up! |
Environment
cp872
consolas
Description
pip throws an exception when trying to display a progress bar for the download of any package on Windows.
Expected behavior
pip should work and install the package correctly.
How to Reproduce
Note that this is environment dependent.
pip install -I --no-cache-dir --verbose wheel
Using --progress-bar ascii doesn't help. --progress-bar off works, but is sub-optimal as you get no progress.
Analysis
The following seems to succeed, while later attempts to actually write the characters to the console fail. 😖
pip/src/pip/_internal/utils/ui.py
Lines 52 to 60 in 3e81d8c
It will work on Python 3, since it has Unicode support for the console. There is win-unicode-console, which could be used to get similar support for Python 2. But the check should probably be fixed anyhow.
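The kind of check discussed in this analysis can be sketched as follows. This is not the code from `ui.py`; `can_encode` and `select_bar_style` are hypothetical names, and the `force_ascii` flag is an assumption standing in for a condition such as "Windows and Python 2". The sketch only tests whether the characters are encodable in the encoding the stream reports, which is exactly why it can pass while the real write to the console still fails.

```python
def can_encode(characters, encoding):
    """Return True if all characters can be encoded with the given encoding."""
    if not encoding:
        return False
    try:
        u"".join(characters).encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # LookupError covers an unknown encoding name.
        return False
    return True


def select_bar_style(stream_encoding, unicode_chars=(u" ", u"\u2588"),
                     force_ascii=False):
    """Hypothetical helper: choose between a "unicode" and an "ascii" bar."""
    # force_ascii models skipping the check and always using ASCII
    # on platforms where the reported encoding cannot be trusted.
    if force_ascii or not can_encode(unicode_chars, stream_encoding):
        return "ascii"
    return "unicode"


print(select_bar_style("cp437"))                    # encodable in this code page
print(select_bar_style("ascii"))                    # not encodable, falls back
print(select_bar_style("cp437", force_ascii=True))  # check skipped
```

The check is only as good as the encoding passed in: if the stream reports one code page and the bytes are converted again before reaching the console, the result can be wrong even though no exception is raised here.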
Output
P.S. The second exception is an entirely different bug... This happens when you use --no-cache-dir. Not sure if it has been reported; if not, a separate issue needs to be opened for that.