[RTD] Client Error 403 #558

Closed
amotl opened this issue Dec 9, 2024 · 5 comments
Labels
bug Something isn't working

Comments

@amotl
Member

amotl commented Dec 9, 2024

Problem

On our Jenkins, the "rebuild all docs on RTD" job started failing two weeks ago.

fab -f fab/tasks.py --set=rtd_token={redacted} rtd_rebuild_all
Rebuilding all projects
slumber.exceptions.HttpClientError: Client Error 403: https://readthedocs.org/api/v3/projects/crate/versions/


Evaluation

The program to rebuild all docs uses RTD's API heavily, as it iterates through all the projects and all their versions, and triggers a rebuild on each of them.
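To illustrate the request pattern described above, here is a minimal sketch of such a loop against the RTD API v3. It is not the actual `fab/rtd.py` code: `get_json` and `post` are hypothetical callables standing in for whatever HTTP client is used, and the list of project slugs is assumed to be known in advance.

```python
API_BASE = "https://readthedocs.org/api/v3"


def iter_results(get_json, url):
    """Yield items from a paginated listing, following the `next` links."""
    while url:
        page = get_json(url)
        yield from page.get("results", [])
        url = page.get("next")


def rebuild_all(get_json, post, project_slugs):
    """Trigger a build for every active version of every given project.

    `get_json(url)` must return the decoded JSON body of a GET request,
    `post(url)` must issue a POST request. Both are hypothetical helpers.
    """
    triggered = []
    for project in project_slugs:
        versions_url = f"{API_BASE}/projects/{project}/versions/?active=true"
        for version in iter_results(get_json, versions_url):
            build_url = (
                f"{API_BASE}/projects/{project}"
                f"/versions/{version['slug']}/builds/"
            )
            post(build_url)
            triggered.append(build_url)
    return triggered
```

Each project costs one listing request per page plus one POST per active version, which is why the request count adds up quickly.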

Observations

Up until recently, that regularly tripped the maximum number of concurrent builds at RTD, but those requests were queued internally, and would run and resolve after a while. Now, however, this procedure yields a hard error.

Traceback (most recent call last):
  File "/path/to/rebuild_all_docs/env/lib/python3.8/site-packages/fabric/main.py", line 758, in main
    execute(
  File "/path/to/rebuild_all_docs/env/lib/python3.8/site-packages/fabric/tasks.py", line 427, in execute
    results['<local-only>'] = task.run(*args, **new_kwargs)
  File "/path/to/rebuild_all_docs/env/lib/python3.8/site-packages/fabric/tasks.py", line 174, in run
    return self.wrapped(*args, **kwargs)
  File "/path/to/rebuild_all_docs/fab/tasks.py", line 63, in rtd_rebuild_all
    rtd.rebuild_all(api)
  File "/path/to/rebuild_all_docs/fab/rtd.py", line 87, in rebuild_all
    rebuild_project(project, api)
  File "/path/to/rebuild_all_docs/fab/rtd.py", line 77, in rebuild_project
    results = api.projects(project_slug).versions.get(active=True).get("results", [])
  File "/path/to/rebuild_all_docs/env/lib/python3.8/site-packages/slumber/__init__.py", line 155, in get
    resp = self._request("GET", params=kwargs)
  File "/path/to/rebuild_all_docs/env/lib/python3.8/site-packages/slumber/__init__.py", line 101, in _request
    raise exception_class("Client Error %s: %s" % (resp.status_code, url), response=resp, content=resp.content)
slumber.exceptions.HttpClientError: Client Error 403: https://readthedocs.org/api/v3/projects/crate/versions/
Build step 'Execute shell' marked build as failure

Tip

Please note: It works well when invoked on an engineer's workstation, so it is most likely some rate limiting that the Jenkins machine has been flagged for.

Research

@BaurzhanSakhariev suggested:

Guessing this is GH getting hammered by AI bots, and restricting requests without agents, like the rest of us.
It looks like the issue is the lack of a user agent. When I updated the example to pass a user agent, it works.

Evaluation

The program that orchestrates the RTD API is indeed doing some amount of "hammering", and it also probably doesn't supply any user agent.
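For reference, supplying a user agent could look roughly like the sketch below. It uses only the standard library rather than the slumber client from the traceback, and the `USER_AGENT` value and the `build_request` helper are made-up names for illustration. Note that the later comments in this thread report that the user agent alone did not make the 403 go away.

```python
import urllib.request

# Hypothetical identifier; any descriptive value naming the client would do.
USER_AGENT = "rebuild-all-docs (hypothetical identifier)"


def build_request(url, token):
    """Prepare a request carrying the RTD token and an explicit User-Agent."""
    headers = {
        "Authorization": f"Token {token}",
        "User-Agent": USER_AGENT,
    }
    return urllib.request.Request(url, headers=headers)
```

The prepared request can then be passed to `urllib.request.urlopen()`.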

References

@amotl amotl added the bug label Dec 9, 2024
@amotl
Member Author

amotl commented Dec 9, 2024

The program [...] also probably doesn't supply any user agent.

@BaurzhanSakhariev added code to supply a user agent, but the program still fails with the 403 response. We need to compensate and improve.
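One possible reading of "compensate" is that a single failing project should not abort the whole run. The sketch below shows that idea only. `rebuild_project` is a hypothetical callable taking a project slug, and `error_type` would be something like `slumber.exceptions.HttpClientError` in the real program.

```python
def rebuild_projects(rebuild_project, project_slugs, error_type=Exception):
    """Rebuild each project and collect failures instead of stopping early.

    Returns a dict mapping the slug of each failed project to its exception.
    """
    failures = {}
    for slug in project_slugs:
        try:
            rebuild_project(slug)
        except error_type as exc:
            # Remember the failure and carry on with the next project.
            failures[slug] = exc
    return failures
```

The caller can still mark the job as failed when the returned dict is not empty, so errors are reported without being fatal midway.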

@amotl amotl changed the title from "[RTD] Client Error 403: https://readthedocs.org/api/v3/projects/crate/versions/" to "[RTD] Client Error 403" Dec 9, 2024
@amotl
Member Author

amotl commented Dec 9, 2024

Others observe the same: the problem still occurs, even after adding the User-Agent header.

Hrmm. I added the User-Agent, but the test run still failed with the 403 Client Error.

@amotl
Member Author

amotl commented Dec 9, 2024

omg.

I wonder if the urllib3 upgrade joverlee521 found to work is because Cloudflare's "browser integrity checks" include things like TLS cipher suites advertised by the client (us) and/or OpenSSL version and/or some other fingerprinting.

-- nextstrain/readthedocs-cli#7 (comment)

I can reproduce the 403 locally by using that same requests version (it pulls in an old urllib3 version). Using a newer requests version pulls in a newer urllib3 and the 403 goes away. This is all using OpenSSL 3.0.13 locally.

-- nextstrain/readthedocs-cli#7 (comment)
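To compare the Jenkins environment with a working workstation, it can help to print which `requests` and `urllib3` versions are actually installed. The sketch below is one way to do that with the standard library. The helper names are made up here, and since the quoted comments name no exact version threshold, `minimum` is left for the caller to choose.

```python
from importlib import metadata


def parse_version(text):
    """Turn '1.26.5' or '2.0.0a1' into a comparable (major, minor, patch) tuple."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def installed_versions(packages=("requests", "urllib3")):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


def is_older_than(version_text, minimum):
    """Return True if `version_text` is strictly older than `minimum`."""
    return parse_version(version_text) < parse_version(minimum)
```

Running `print(installed_versions())` inside the job's virtualenv and on the workstation shows whether the two environments differ.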

@amotl
Member Author

amotl commented Dec 9, 2024

Following the insights outlined above, there is a patch implementing a fix, which we are currently testing.

@amotl
Member Author

amotl commented Dec 9, 2024

Fixing it like this apparently worked well.

@amotl amotl closed this as completed Dec 9, 2024