403 Client Error: Forbidden for url: https://readthedocs.org/api/v3/projects/?limit=100 #7

Closed
joverlee521 opened this issue Nov 27, 2024 · 17 comments · Fixed by #9
Labels
bug Something isn't working

Comments

@joverlee521
Contributor

joverlee521 commented Nov 27, 2024

Context

First seen in https://github.com/nextstrain/docs.nextstrain.org/actions/runs/12056254066/job/33618372098?pr=238

Run rtd projects "nextstrain" redirects sync -f "redirects.yml" --dry-run
Traceback (most recent call last):
  File "/home/runner/.local/bin/rtd", line 8, in <module>
    sys.exit(rtd())
  File "/usr/lib/python3/dist-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/lib/python3/dist-packages/click/core.py", line 1656, in invoke
    super().invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/runner/.local/lib/python3.10/site-packages/readthedocs_cli/__init__.py", line 80, in rtd_projects
    projects = api.v3.projects()
  File "/home/runner/.local/lib/python3.10/site-packages/readthedocs_cli/api/v3.py", line 20, in projects
    return GET("projects/", {"limit": 100})
  File "/home/runner/.local/lib/python3.10/site-packages/readthedocs_cli/api/v3.py", line 45, in GET
    res.raise_for_status()
  File "/usr/lib/python3/dist-packages/requests/models.py", line 943, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://readthedocs.org/api/v3/projects/?limit=100

However, I cannot reproduce locally, so it must be specific to the GH Action environment.

Possible solution

I was able to resolve a 403 Client Error in the docs build by adding a User-Agent to the request session (nextstrain/docs.nextstrain.org@c561b61), but it is unclear to me whether this is the same issue.

@joverlee521 joverlee521 added the bug Something isn't working label Nov 27, 2024
@joverlee521
Contributor Author

joverlee521 commented Nov 27, 2024

Hrmm. I added the User-Agent, updated the sync-redirects.yaml in docs.nextstrain.org, and the test run still failed with the 403 Client Error.

Will revisit this later next week.

@dmundra

dmundra commented Dec 2, 2024

I am getting this as well. Probably related to readthedocs/readthedocs.org#11763 (which you did find in the PR)

@dmundra

dmundra commented Dec 2, 2024

How do I add a User-Agent to the Read the Docs CLI?

@tsibley
Member

tsibley commented Dec 3, 2024

The restriction on the default User-Agent of requests was a change for GitHub URLs. This 403 is from Read the Docs, not GitHub, so I suspect it is unrelated to User-Agent.

@dmundra

dmundra commented Dec 3, 2024

Cannot reproduce it locally so I wonder what the error is. Is there a way to add more logging?
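
One generic way to get more detail out of a requests/urllib3 client is to turn on the standard library's HTTP debug output and raise the urllib3 logger to DEBUG. This is a minimal sketch, not a feature of the rtd CLI: the helper name `enable_http_debug_logging` is made up for illustration, and it assumes the code runs in the same Python process that makes the requests.

```python
import http.client
import logging


def enable_http_debug_logging() -> logging.Logger:
    """Turn on verbose output for HTTP traffic made through urllib3/requests."""
    # The stdlib HTTP layer (which urllib3 builds on) prints request and
    # response header lines to stdout when debuglevel is non-zero.
    http.client.HTTPConnection.debuglevel = 1

    # Attach a stderr handler to the root logger if none is configured yet.
    logging.basicConfig(format="%(asctime)s %(name)s %(levelname)s %(message)s")

    # urllib3 reports connection setup and retries under this logger name.
    urllib3_logger = logging.getLogger("urllib3")
    urllib3_logger.setLevel(logging.DEBUG)
    return urllib3_logger


logger = enable_http_debug_logging()
```

Note that this shows status lines and headers only. It would not reveal differences in the TLS handshake itself.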

@joverlee521
Contributor Author

I'm testing just using requests to ping "https://readthedocs.org/api/v3/" in GitHub Actions.

It returns 403 unless I specifically install urllib3==2.2.3 (successful run). This led me to find urllib3/urllib3#2915, which makes me believe we need urllib3>=2.0.0.
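
To check which versions an environment actually resolved, something like the following could be run in the same interpreter as the CLI. It is a rough sketch using only the standard library. The helper names are hypothetical, the comparison ignores pre-release suffixes, and the requests minimum of 2.30.0 is the one named in the fix commit for this issue.

```python
import re
from importlib import metadata


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release of a version string as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare numeric releases, padding with zeros so '2' equals '2.0.0'."""
    a, b = release_tuple(installed), release_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_version(package: str):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Minimums discussed in this thread.
for package, minimum in {"urllib3": "2.0.0", "requests": "2.30.0"}.items():
    found = installed_version(package)
    if found is None:
        print(f"{package}: not installed")
    else:
        status = "ok" if meets_minimum(found, minimum) else f"older than {minimum}"
        print(f"{package} {found}: {status}")
```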

@joverlee521
Contributor Author

joverlee521 commented Dec 4, 2024

Pinned urllib3>=2.0.0 in a8771b1 and the test dry run was successful.

(Although it includes a requests dependency warning, so we might need to pin requests as well?)

@dmundra

dmundra commented Dec 4, 2024

RTD support staff pointed me to readthedocs/readthedocs.org#11753 as a potentially related issue.

@dmundra

dmundra commented Dec 4, 2024

Support staff also shared this:

The problem we see there is that some requests are more likely to fail Cloudflare's browser integrity check. If you can confirm any of this, we might be able to figure out how to avoid the check for these requests. You may need to alter the request somehow -- sorry I don't have concrete solutions here, the information on the check is a little limited.

@tsibley
Member

tsibley commented Dec 4, 2024

Thanks for that info, @dmundra. Pretty disappointed that they'd implement checks they don't understand. I guess we get to throw stuff at the wall and see what sticks.

@tsibley
Member

tsibley commented Dec 4, 2024

I wonder if the urllib3 upgrade @joverlee521 found to work is because Cloudflare's "browser integrity checks" include things like TLS cipher suites advertised by the client (us) and/or OpenSSL version and/or some other fingerprinting.

joverlee521 added a commit that referenced this issue Dec 4, 2024
It's not entirely clear _why_ this works, but using urllib3>=2.0.0
avoids the 403 Client error.¹

Adding a minimum version for `requests`: the first version that officially
supports urllib3 v2.²

¹ <#7 (comment)>
² <https://github.com/psf/requests/releases/tag/v2.30.0>
@tsibley
Member

tsibley commented Dec 4, 2024

We're getting an old version of urllib3 because we're getting an old version of requests, and that's because we're using the GitHub Actions runner's system Python, which comes pre-installed with requests 2.25.1, so pip doesn't bother upgrading it.

I can reproduce the 403 locally by using that same requests version (it pulls in an old urllib3 version). Using a newer requests version pulls in a newer urllib3 and the 403 goes away. This is all using OpenSSL 3.0.13 locally.

I captured a failing request and a successful request as generated by rtd projects nextstrain and diffed the initial TLS exchange to spot differences. Sure enough, the successful request does not offer a bunch of older/insecure TLS ciphers. And that's exactly what urllib3/urllib3#2915 (identified by @joverlee521 above) changes: it uses the system's default ciphers instead of its own list, and it drops TLS 1.0 and 1.1 along with their insecure ciphers.
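
For a rough look at which cipher suites a Python client would enable, the standard `ssl` module can list them. This sketch is only an approximation of the comparison described above: it inspects Python's own default client context, not the exact context either urllib3 version builds, and it does not capture a real handshake. The helper names `enabled_ciphers` and `extra_ciphers` are made up for illustration.

```python
import ssl


def enabled_ciphers(cipher_string=None):
    """Return the cipher suite names enabled on a default client context.

    If cipher_string is given (OpenSSL cipher list format), it is applied
    with SSLContext.set_ciphers() before the list is read back.
    """
    context = ssl.create_default_context()
    if cipher_string is not None:
        context.set_ciphers(cipher_string)
    return [cipher["name"] for cipher in context.get_ciphers()]


def extra_ciphers(offered, baseline):
    """Ciphers present in `offered` but absent from `baseline`, order preserved."""
    baseline_set = set(baseline)
    return [name for name in offered if name not in baseline_set]


defaults = enabled_ciphers()
print(f"OpenSSL build: {ssl.OPENSSL_VERSION}")
print(f"{len(defaults)} cipher suites enabled by default:")
for name in defaults:
    print(" ", name)
```

Passing a custom cipher string to `enabled_ciphers` and feeding both lists to `extra_ciphers` gives a quick diff of what a non-default configuration would add on a given machine.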

I also decrypted the HTTP response for the failing request and sure enough, it's a Cloudflare "browser challenge" page that requires JS to execute.
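
A client could try to recognize this situation instead of surfacing a bare 403. The check below is a heuristic sketch under stated assumptions: to my understanding Cloudflare sets a `cf-mitigated: challenge` header on challenge responses, and the fallback (Cloudflare `Server` header, HTML body that mentions JavaScript) is my own guess at reasonable markers, not something confirmed in this thread. The function name is hypothetical.

```python
def looks_like_cloudflare_challenge(status_code, headers, body=""):
    """Heuristically decide whether a response is a Cloudflare challenge page.

    `headers` is any mapping of header names to string values; matching is
    case-insensitive. Only 403 responses are considered, as seen in this issue.
    """
    if status_code != 403:
        return False

    lowered = {key.lower(): value.lower() for key, value in headers.items()}

    # Assumed primary signal: header Cloudflare attaches to challenge responses.
    if lowered.get("cf-mitigated") == "challenge":
        return True

    # Assumed fallback: an HTML page from Cloudflare that asks for JavaScript.
    served_by_cloudflare = "cloudflare" in lowered.get("server", "")
    is_html = "text/html" in lowered.get("content-type", "")
    mentions_javascript = "javascript" in body.lower()
    return served_by_cloudflare and is_html and mentions_javascript


# Made-up example inputs, not captured from Read the Docs.
print(looks_like_cloudflare_challenge(403, {"CF-Mitigated": "challenge"}))
print(looks_like_cloudflare_challenge(403, {"Server": "nginx"}, "Forbidden"))
```

With a check like this, a CLI could print a hint about the TLS/urllib3 version when the 403 is a challenge page instead of an API permission error.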

joverlee521 added a commit that referenced this issue Dec 4, 2024
Using urllib3>=2.0.0 to avoid the 403 Client error.¹

Adding a minimum version for `requests`: the first version that officially
supports urllib3 v2.²

See @tsibley's comment on _why_ upgrading `urllib3` resolved the 403
Client error.³

¹ <#7 (comment)>
² <https://github.com/psf/requests/releases/tag/v2.30.0>
³ <#7 (comment)>
@dmundra

dmundra commented Dec 4, 2024

Thank you @tsibley and @joverlee521. Will there be a new release for the cli for this?

@tsibley
Member

tsibley commented Dec 4, 2024

@dmundra Should be!

@joverlee521 I'm happy to cut a new release if you're not, or happy to let you do it if you'd like.

@joverlee521
Contributor Author

@tsibley I just pushed up the v5 tag, but I don't have twine/PyPI set up. Could you build and push?

@tsibley
Member

tsibley commented Dec 4, 2024

@joverlee521 Done!

@dmundra This is released in version 5, available on PyPI as of now.

@dmundra

dmundra commented Dec 4, 2024

Awesome. Thank you!
