This repository has been archived by the owner on Mar 12, 2024. It is now read-only.
create_merge_pr_job fails transiently #687
Labels
pipeline-failure
Related to deployment pipeline failures.
Comments
Noting how often this has been happening:
More comprehensive occurrence information, from looking at the job history in GoCD:
timmc-edx added a commit that referenced this issue on Aug 21, 2023:
Try to fix #687 by allowing retries on `delete_branch`. This may prevent the issue we've seen in `create_private_to_public_pr.py` where the deletion of a newly created branch occasionally fails with a 404.
This might fix it, and if it doesn't, we should at least get more information: #691 allows retries when deleting a branch fails with a 404.
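As a rough illustration of the retry idea described above (not the actual change in #691), here is a minimal sketch. The names `call_with_retries` and `is_not_found` are hypothetical; the only assumption about the real error is that it exposes a numeric `status` attribute, as PyGithub's `GithubException` does.

```python
import time


def call_with_retries(func, should_retry, attempts=3, delay_seconds=2.0, sleep=time.sleep):
    """Call func(); retry only when it raises an exception that should_retry accepts.

    Hypothetical helper for illustration. The final failure is re-raised
    unchanged so the job still fails loudly once the attempts are used up.
    """
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt, or on errors we do not consider transient.
            if attempt == attempts or not should_retry(exc):
                raise
            sleep(delay_seconds)


def is_not_found(exc):
    """Treat any exception carrying status == 404 as retryable (an assumption)."""
    return getattr(exc, "status", None) == 404
```

A call site might look like `call_with_retries(lambda: api.delete_branch(name), is_not_found)`, where `api` and `name` stand in for whatever the script already has in scope. Only the 404 case is retried, so unrelated errors still surface immediately.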
If we see another failure, we should also check whether the branch was successfully pushed to GitHub before we re-run the job.
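One way to do that check from a shell-less context is to ask the remote directly with `git ls-remote --heads`. The sketch below is only an illustration; `parse_ls_remote_heads` and `remote_branch_sha` are hypothetical helper names, and the remote and branch values would come from the failing job.

```python
import subprocess


def parse_ls_remote_heads(output):
    """Map branch name -> commit SHA from `git ls-remote --heads` output.

    Each output line has the form "<sha>\trefs/heads/<branch>".
    """
    heads = {}
    prefix = "refs/heads/"
    for line in output.splitlines():
        sha, _, ref = line.partition("\t")
        if ref.startswith(prefix):
            heads[ref[len(prefix):]] = sha
    return heads


def remote_branch_sha(remote, branch, cwd=None):
    """Return the SHA of `branch` on `remote`, or None if it is not there."""
    result = subprocess.run(
        ["git", "ls-remote", "--heads", remote, branch],
        cwd=cwd,
        capture_output=True,
        text=True,
        check=True,
    )
    # ls-remote patterns can match more than one ref, so look up the exact name.
    return parse_ls_remote_heads(result.stdout).get(branch)
```

A `None` result before re-running would suggest the push never landed; a SHA would suggest the branch exists and the 404 came from somewhere else.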
Closing on the presumption that it worked; can reopen if that's not the case.
@timmc-edx: Related comments regarding our private runbook for GoCD:
Updated runbook. Not sure what we want to do re: status:Done, but at least it's in a comment there now so that the next person consulting the runbook will consider that.
AC
Implementation Details:
Between July 26 and Aug 2, 2023, the `create_merge_pr_job` of the `edxapp_private_public_merge_sync` GoCD pipeline has failed with the same unclear error at least 5 times. On re-run, it has passed.

We will likely need to add debug logging, especially of the API response, stack traces, and git state. It might also work to just add retries, if we end up not making progress on this and just want to put a band-aid on it.
Notes
The script backing this job is create_private_to_public_pr.py in the tubular repo.
This job is supposed to convey any merges in edx-platform-private into the public repo. (The following jobs then convey public changes into the private repo.) However, no such merges have happened for quite some time, as we are following a new process that involves GitHub Security Advisories instead. One interesting bit of timing, though: a private PR was closed (not merged) on July 25. It is unclear why this would have any effect.
Here's an example of a failing run:
The traceback includes chained exceptions, with the last one being caused by the `delete_branch` call:
`github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/git/refs#get-all-references-in-a-namespace"}`
Here's an example of a subsequent passing run, showing a different error that nonetheless does not cause a job failure:
This may indicate some kind of race condition with GitHub and branch creation.