Support unicode characters in authentication header #212
Comments
There was a similar issue raised against requests (https://github.com/kennethreitz/requests/issues/1926) 2 months ago. The important part from that is: @Lukasa's comment. In short: RFC 2616 only allows for characters in the Latin-1 encoding, if you want to pass unicode characters as part of a header value there are two options:
In short, this is not actually a bug in the implementation as we are being 100% compliant with the RFC. |
Actually, there's a third option that If anyone's interested in contributing to this effort, please continue the discussion over there. |
Hm, and what about simply using UTF8? That seems to work for Opera. http://stackoverflow.com/questions/702629/utf-8-characters-mangled-in-http-basic-auth-username // Btw, thank you @sigmavirus24 for so often providing useful upstream context for HTTPie issues. It's very helpful 👍 |
Opera does it, but no-one else does. From the same question:
The real fix here was pointed out in IRC, which is this draft RFC coming out of the HTTPbis. When this draft becomes a standard, I'll happily implement support for it in requests. |
@jkbr I agree. There is another user-agent that allows you to use UTF-8 (as can be discovered in the requests issue I linked): cURL. The problem as I see it is that if you just read the introduction to the draft RFC that @Lukasa linked, this is not really universally supported behaviour. cURL does the following:
If you decode the parameter (using Python's |
@sigmavirus24 It looks like using UTF-8 & printing a warning message is the most pragmatic way to go. HTTPie users are likely to have previously used cURL. |
@jkbr I'm afraid that likely will not work:
>>> auth
('foobar', 'abc\xc3\xb62')
>>> ('%s:%s' % auth).encode('latin1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 10: ordinal not in range(128)
That's roughly what
>>> auth
('foobar', 'abc\xc3\xb62')
>>> base64.b64encode('%s:%s' % auth)
'Zm9vYmFyOmFiY8O2Mg=='
Given the vagueness of the specification around the basic authentication header, I wonder if the username/password actually have to be |
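For readers on Python 3, here is a rough sketch of why the session above behaves the way it does. It assumes the credentials are raw UTF-8 bytes, as they are in a Python 2 `str`. In Python 2, calling `.encode()` on a byte string first decodes it implicitly as ASCII, and that hidden step is what raises `UnicodeDecodeError`; the explicit `decode("ascii")` below stands in for it. The variable names are illustrative only.

```python
import base64

# The credentials from the session above, as raw UTF-8 bytes (a Python 2 `str`).
user, password = b"foobar", b"abc\xc3\xb62"
raw = user + b":" + password

# Python 2 implicitly decodes a byte string as ASCII before .encode() runs;
# this explicit decode reproduces that hidden step and its failure.
try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = exc

# Base64 works on bytes directly, so no text encoding is involved here.
token = base64.b64encode(raw)

print(ascii_error)
print(token)
```

Base64 never looks at the text encoding, which is why the second call in the session succeeds with the same input.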
@sigmavirus24 We've already covered this in this discussion repeatedly: the specification provides no guidance as to text encoding because it was written by Americans at a time where text encoding was not a concern. The only thing that's safe is There is no "have to" here. Requests can absolutely decide to use UTF-8 if we wanted to, but I guarantee we'll break someone's running code where their webserver assumes that they'll be getting ISO 8859-1 but now start getting multibyte sequences from UTF-8. Requests has made a choice and I'm pretty happy with it at the moment. Users such as |
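To make the compatibility concern concrete, here is a small Python 3 sketch. It assumes a server that decodes the credential bytes as ISO 8859-1 (Latin-1), which is the scenario described above; the password value is the example from this thread.

```python
password = "abc\u00f62"  # "abcö2"

# One byte per character in Latin-1, two bytes for "ö" in UTF-8.
as_latin1 = password.encode("latin-1")
as_utf8 = password.encode("utf-8")

# What a server assuming ISO 8859-1 would reconstruct from UTF-8 bytes.
server_view = as_utf8.decode("latin-1")

print(as_latin1, as_utf8)
print(server_view == password)
```

The decoded value no longer matches the original password, so a client that silently switches encodings can fail authentication against such a server.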
@jkbr looks like you have your solution above then ;) |
* Immediately convert all args from `bytes` to `str`.
* Added `Environment.stdin_encoding` and `Environment.stdout_encoding`
* Allow unicode characters in HTTP headers and basic auth credentials by encoding them using UTF8 instead of latin1 (#212).
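As a minimal illustration of the approach discussed in this thread (encode the credentials as UTF-8 and warn when they are not plain ASCII), the sketch below builds a Basic `Authorization` header value in Python 3. `basic_auth_header` is a hypothetical helper written for this example; it is not HTTPie's or requests' actual code.

```python
import base64
import warnings


def basic_auth_header(username, password, encoding="utf-8"):
    """Hypothetical helper: build a Basic auth header value."""
    credentials = "{}:{}".format(username, password)

    # Warn when the result depends on the chosen encoding, since servers
    # may assume ISO 8859-1 rather than UTF-8.
    try:
        credentials.encode("ascii")
    except UnicodeEncodeError:
        warnings.warn(
            "Non-ASCII credentials encoded as {}".format(encoding)
        )

    token = base64.b64encode(credentials.encode(encoding))
    return "Basic " + token.decode("ascii")


print(basic_auth_header("foobar", "abc\u00f62"))
```

Passing `encoding="latin-1"` yields a different header for the same credentials, which is the interoperability risk raised earlier in the thread.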
It turns out |
If a Unicode character (for example, the German umlaut ö, which is 0xc3 0xb6 in UTF-8) is part of the authentication header, an error is thrown:
I am using Python 2.7.3 on a plain Ubuntu 12.04.4 LTS system.