pip fails without error message on a particular package when chcp 65001 (UTF-8) is set on Windows 7 RUS #5878

Closed
kiwi0fruit opened this issue Oct 12, 2018 · 21 comments
Labels
auto-locked Outdated issues that have been locked by automation C: encoding Related to text encoding and likely, UnicodeErrors OS: windows Windows specific type: bug A confirmed bug or unintended behavior

Comments

@kiwi0fruit

Environment

  • pip version: 10.0.1 (anaconda defaults) or 18.1 (conda-forge)
  • Python version: 3.6.6 (Miniconda x64)
  • OS: Windows 7 x64 RUS (Russian)

Description

pip fails without error message on a particular package when chcp 65001 (UTF-8) is set on Windows 7 RUS

The bug occurs when running pip install pandoctools in an Anaconda environment on Windows 7 RUS with chcp 65001 (UTF-8) set. Everything works when chcp 65001 is not set.

After pip install pandoctools, pip downloads the package, then exits without any message and changes the color of the console text.

  • With pip 10.0.1 the text turns yellow.
  • With pip 18.1 the text turns red.

Interestingly, on Windows 7 x64 ENG I didn't see this behaviour: I got a Unicode error that disappeared when I updated Python from 3.6.5 to 3.6.6. Updating Python didn't help on Windows 7 x64 RUS, though.

How to Reproduce

  1. Install the latest Miniconda 3.6 on Windows 7 RUS
  2. Create and activate an environment in the terminal
  3. Type chcp 65001 in the terminal
  4. Run pip install pandoctools

No output; only the console text color changes.

@kiwi0fruit
Author

kiwi0fruit commented Oct 12, 2018

UPD

  • If I download the package from PyPI and install it from local disk, it installs without problems.
  • The package is distributed as an sdist.
  • There is another package with almost the same setup.py, sugartex, and it installs OK.
  • pip download pandoctools fails in the same way as pip install.

This whole bug is rather strange...

Output of pip install -v pandoctools (on pip 10.0.1, 18.1 looks the same):

(research) C:\Users\User>pip install -v "pandoctools==0.4.17"
Config variable 'Py_DEBUG' is unset, Python ABI tag may be incorrect
Config variable 'WITH_PYMALLOC' is unset, Python ABI tag may be incorrect
Created temporary directory: C:\Users\User\AppData\Local\Temp\pip-ephem-wheel-cache-6m22r9i5
Created temporary directory: C:\Users\User\AppData\Local\Temp\pip-install-rk4uex8r
Collecting pandoctools==0.4.17
  1 location(s) to search for versions of pandoctools:
  * https://pypi.org/simple/pandoctools/
  Getting page https://pypi.org/simple/pandoctools/
  Looking up "https://pypi.org/simple/pandoctools/" in the cache
  Current age based on date: 154
  Freshness lifetime from max-age: 600
  Freshness lifetime from request max-age: 600
  The response is "fresh", returning cached response
  600 > 154
  Analyzing links from page https://pypi.org/simple/pandoctools/
    Found link https://files.pythonhosted.org/packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz#sha256=45c3689218e859f231980470060186de8c9544a2790f68edc56b92f315a4058f (from https://pypi.org/simple/pandoctools/), version: 0.4.17
  Using version 0.4.17 (newest of versions: 0.4.17)
  Created temporary directory: C:\Users\User\AppData\Local\Temp\pip-unpack-66npvcv1
  Looking up "https://files.pythonhosted.org/packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz" in the cache
  Ignoring unknown cache-control directive:
  No cache entry available
  Starting new HTTPS connection (1): files.pythonhosted.org
  https://files.pythonhosted.org:443 "GET /packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz HTTP/1.1" 200 71584
  Downloading https://files.pythonhosted.org/packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz (71kB)
  Downloading from URL https://files.pythonhosted.org/packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz#sha256=45c3689218e859f231980470060186de8c9544a2790f68edc56b92f315a4058f (from https://pypi.org/simple/pandoctools/)

@kiwi0fruit
Author

I guess this means that pip (and twine) UTF-8 support on Windows is buggy.

By the way: twine also doesn't work with chcp 65001.

@uranusjr
Member

Does pip work if you supply --progress-bar off? I suspect the downloading progress bar (non-ASCII) could be the culprit. (See #5671)

@kiwi0fruit
Author

Today I wanted to test the issue with --progress-bar off, but the issue itself disappeared. I didn't change Python or any modules, so it was either a Windows 7 update or an update on PyPI (or something else, but that's unlikely).

@kiwi0fruit
Author

@mareknino95 do you still have an issue?

@kiwi0fruit
Author

C:\Users\User>pip install pandoctools
Collecting pandoctools
  Downloading https://files.pythonhosted.org/packages/57/72/8057720308f8262ffa41710a5bd8e642d3970113a87241a10da0ae86393a/pandoctools-0.4.17.tar.gz (71kB)
Could not install packages due to an EnvironmentError: [WinError 31] A device attached to the system is not functioning
Consider using the `--user` option or check the permissions.

You are using pip version 10.0.1, however version 18.1 is available.
You should consider upgrading via the 'python -m pip install --upgrade pip' command.
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='utf-8'>
PermissionError: [WinError 31] A device attached to the system is not functioning

After the error disappeared from Windows 7 RUS, it re-appeared on Windows 7 ENG.

@kiwi0fruit
Author

kiwi0fruit commented Oct 17, 2018

On Windows 7 ENG: after running chcp 1252 and then chcp 65001 again, the bug disappeared and didn't re-appear after a Windows restart.

UPD: Then it re-appeared the next day...

It's a rather dispiriting bug...

@kiwi0fruit
Author

kiwi0fruit commented Oct 20, 2018

@uranusjr I confirm that --progress-bar off removes the bug on Windows 7 ENG.

@kiwi0fruit
Author

@uranusjr I confirm that --progress-bar off also removes the bug on Windows 7 RUS.

@kiwi0fruit
Author

@uranusjr But if the non-ASCII progress bar is at fault, why does the bug happen in a console switched to UTF-8 and not in a console with its default code page?

@uranusjr
Member

This is most definitely the same issue then. The progress bar contains non-ASCII characters, and causes decoding error if the console uses certain code pages. #5671 makes the progress bar fully ASCII-compatible, and should fix this problem.
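
For illustration only, here is a minimal Python sketch of the kind of fallback being discussed: use ASCII bar characters whenever the output encoding cannot represent the fancy ones. The names pick_bar_chars and render_bar are hypothetical; this is not pip's code and not the actual change in #5671. Note that such a check would not by itself catch the chcp 65001 case in this issue, where the stream reports UTF-8 and the write itself fails, so an always-ASCII bar is the more conservative option.

```python
def pick_bar_chars(encoding, fancy=" ▏▎▍▌▋▊▉█", plain=" #"):
    """Return `fancy` if `encoding` can represent every character in it, else `plain`."""
    try:
        fancy.encode(encoding)
    except (UnicodeEncodeError, LookupError):
        # Unencodable characters or an unknown codec name: stay with ASCII.
        return plain
    return fancy


def render_bar(fraction, width, chars):
    """Draw a bar of `width` cells using the last char as 'filled' and the first as 'empty'."""
    filled = int(fraction * width)
    return chars[-1] * filled + chars[0] * (width - filled)


# Hypothetical usage: a cp1252 console cannot encode the block characters.
bar = render_bar(0.5, 10, pick_bar_chars("cp1252"))
```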

@kiwi0fruit
Author

@uranusjr It may also mean that DefaultDownloadProgressBar from here doesn't work well with the Windows console under chcp 65001 for some reason.

@uranusjr
Member

That is a good question, and I don’t know the answer :( Windows code pages are known to be difficult to handle. It could be some problem in the interface between Python IO and the Windows API. I am not very knowledgeable about Windows console encoding; I only know that mysterious things happen all the time when you try to feed non-ASCII things to it.

@kiwi0fruit
Author

I guess the simple solution of using = for progress and not showing off would be the best.

@pfmoore
Member

pfmoore commented Oct 24, 2018

> Windows code pages are known to be difficult to handle.

I've hesitated to say this, as I don't have any direct evidence, but my understanding is that chcp 65001 (which is in theory UTF-8) has some weird misbehaviours, and is not in general recommended. So that may be what's triggering these issues. Can they be demonstrated with any other codepage?

In my experience, other codepages behave well enough - most Unicode problems on Windows (outside of chcp 65001) tend to boil down to programs not keeping careful enough track of which encoding is used in different places (because that's messy, with multiple codepages potentially being involved).
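
As a small illustration of that "losing track of the encoding" failure mode, the sketch below writes text with one code page and reads it back with another. It assumes the usual Russian Windows pairing of OEM code page 866 (console) and ANSI code page 1251; the helper name roundtrip is hypothetical.

```python
def roundtrip(text, written_as, read_as):
    """Encode `text` with one code page and decode the bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


greeting = "Привет"

# Same code page on both sides: the text survives.
consistent = roundtrip(greeting, "cp866", "cp866")

# Producer used the console (OEM) code page, consumer assumed the ANSI one:
# every byte still decodes, but to the wrong characters.
mismatched = roundtrip(greeting, "cp866", "cp1251")

# Pure ASCII is identical in both code pages, which is why such bugs
# often stay hidden until non-ASCII output appears.
ascii_only = roundtrip("pip", "cp866", "cp1251")
```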

@kiwi0fruit
Author

But chcp 65001 is the only UTF-8 support we have in the console. Or is there another way to get UTF-8 in the Windows console? For example, I tested pandoc pipelining on Windows, and it only works with chcp 65001 (pandoc doesn't track code pages and always assumes it gets UTF-8).

@uranusjr
Member

No, there is no other way to have UTF-8 in a Windows console. 65001 is also special in the sense that it is the only variable-length encoding supported by Windows, and my understanding (not very well!) is that the Windows API has some weird cases regarding this, since it is not designed with it in mind.

This is drifting off topic, but I think your best bet would be to avoid console piping altogether. It is much more reliable to output to a (temporary) file, and read it into another process.
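
A rough Python sketch of that approach, under stated assumptions: the text is written to a temporary file with an explicit UTF-8 encoding, and a second process reads the file instead of a pipe. The child here is just another Python interpreter standing in for a real consumer such as pandoc, and the function name pass_text_via_file is hypothetical.

```python
import os
import subprocess
import sys
import tempfile


def pass_text_via_file(text):
    """Hand `text` to a child process through a UTF-8 temp file; return the character count it saw."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        # The encoding is chosen explicitly, so the console code page never matters.
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write(text)
        # Stand-in consumer: reads the file as UTF-8 and prints an ASCII-only result.
        child = "import sys; print(len(open(sys.argv[1], encoding='utf-8').read()))"
        result = subprocess.run(
            [sys.executable, "-c", child, path],
            stdout=subprocess.PIPE,
            check=True,
        )
        return int(result.stdout.decode("ascii").strip())
    finally:
        os.remove(path)
```

Only ASCII digits travel back over the child's stdout, so the one remaining pipe does not depend on any code page.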

@kiwi0fruit
Author

Yeah... Thanks all for new information!

@pfmoore
Member

pfmoore commented Oct 24, 2018

Well, the console is natively Unicode, and under Python 3.7 (and 3.6, IIRC) Python can print Unicode direct to sys.stdout without needing any codepage games. But (a) that doesn't help with older versions of Python, and (b) the progress bar library may well not take advantage of this (again, because it needs to support older versions).
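
For code that cannot rely on that behaviour, one defensive option is to degrade unencodable characters to escapes rather than let the write raise. The sketch below is only an illustration with a hypothetical helper name (safe_write); it is not what pip or its progress bar library does, and it only addresses encoding failures, not errors raised by the console write itself.

```python
import io
import sys


def safe_write(text, stream=None):
    """Write `text` to `stream`, replacing characters its encoding cannot represent."""
    stream = sys.stdout if stream is None else stream
    encoding = getattr(stream, "encoding", None) or "ascii"
    # Round-trip through the stream's own encoding: characters it cannot
    # represent come back as backslash escapes instead of raising.
    printable = text.encode(encoding, errors="backslashreplace").decode(encoding)
    stream.write(printable)
    return printable


# Hypothetical usage with an in-memory stream that behaves like a cp1252 console.
legacy = io.TextIOWrapper(io.BytesIO(), encoding="cp1252")
shown = safe_write("done \u2588", legacy)
```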

As usual, backward compatibility is the real issue here - if we could ditch anything other than Windows 10 and Python 3.7+, the code we'd need would be a lot cleaner and less error prone :-)

@pfmoore
Member

pfmoore commented Oct 24, 2018

(Just as an example):

image

Windows 7, console code page 437, Python 3.6. Although if I do chcp 65001, the above still works fine for me, so it's not quite as simple as "it's codepage 65001's fault" 😉

@pradyunsg pradyunsg added the S: needs triage Issues/PRs that need to be triaged label Dec 14, 2018
@pradyunsg
Member

#5671 fixed this.

@pradyunsg pradyunsg added C: encoding Related to text encoding and likely, UnicodeErrors OS: windows Windows specific type: bug A confirmed bug or unintended behavior labels May 18, 2020
@triage-new-issues triage-new-issues bot removed the S: needs triage Issues/PRs that need to be triaged label May 18, 2020
@lock lock bot added the auto-locked Outdated issues that have been locked by automation label Jun 24, 2020
@lock lock bot locked as resolved and limited conversation to collaborators Jun 24, 2020