Inscrutable "Actions workflow run is stale" error #234

blast-hardcheese · 2021-02-25T05:10:57Z

I'm getting a lot of sporadic failures in reporting, possibly due to the number of parallel builds that are attempting to submit coverage reports.

My project is configured to build the core tests first, which takes about four minutes, and then to build over twenty integration tests, each of which takes five or more minutes. It seems as though we may be dancing right on the edge of some sort of limit, possibly due to my naive understanding of the after_n_builds option.

Unfortunately, Googling anything about {'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')} turns up nothing, so hopefully from now on people will at least find this issue.

Would you kindly explain how to either increase the timeout for when codecov is waiting for coverage segments, or if this is not the case, instruct on how to resolve this error?

Thank you for your assistance, as well as for an excellent product!

==> Uploading reports
    url: https://codecov.io
    query: branch=update%2Fjackson-core-2.12.1&commit=5e16535e81483a6a07612ba10cfe32c328469103&build=598338763&build_url=http%3A%2F%2Fgithub.com%2Ftwilio%2Fguardrail%2Factions%2Fruns%2F598338763&name=&tag=&slug=twilio%2Fguardrail&service=github-actions&flags=&pr=927&job=CI&cmd_args=n,F,Q,Z,f
->  Pinging Codecov
https://codecov.io/upload/v4?package=github-action-20210129-7c25fce&token=secret&branch=update%2Fjackson-core-2.12.1&commit=5e16535e81483a6a07612ba10cfe32c328469103&build=598338763&build_url=http%3A%2F%2Fgithub.com%2Ftwilio%2Fguardrail%2Factions%2Fruns%2F598338763&name=&tag=&slug=twilio%2Fguardrail&service=github-actions&flags=&pr=927&job=CI&cmd_args=n,F,Q,Z,f
{'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')}
404
==> Uploading to Codecov
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  182k  100    81  100  182k    400   904k --:--:-- --:--:-- --:--:--  904k
    {'detail': ErrorDetail(string='Actions workflow run is stale', code='not_found')}
Error: Codecov failed with the following error: The process '/usr/bin/bash' failed with exit code 1

thomasrockhu · 2021-02-26T15:51:46Z

Hi @blast-hardcheese, we are working to understand the issue here, but I think for now as a workaround, you can supply the Codecov upload token. Do you have a GitHub Actions CI link that we can take a look at btw?
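For readers looking for the shape of that workaround, here is a minimal sketch of an upload step that passes a token. The secret name `CODECOV_TOKEN` and the `@v1` version tag are assumptions chosen for illustration, not values taken from this thread. Note that GitHub does not expose repository secrets to workflows triggered by pull requests from forks, so this may not help for outside contributions.

```yaml
# Fragment of a workflow job's `steps:` list (not a complete workflow).
- name: Upload coverage to Codecov
  uses: codecov/codecov-action@v1          # version tag is an assumption
  with:
    # Assumes a repository secret named CODECOV_TOKEN has been created
    # from the upload token shown in the Codecov repository settings.
    token: ${{ secrets.CODECOV_TOKEN }}
```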

blast-hardcheese · 2021-02-26T18:59:04Z

> Do you have a GitHub Actions CI link that we can take a look at btw?

Sure -- you can take a look at many of the recent failures on https://github.com/guardrail-dev/guardrail/ , one example is https://github.com/guardrail-dev/guardrail/pull/1000/checks?check_run_id=1976440163 .

I've just been re-running all the checks and usually the subsequent run is successful.

blast-hardcheese · 2021-02-26T19:02:40Z

Additionally, I've moved this repo out from where it was previously hosted, https://github.com/twilio/guardrail/ , within the past 24 hours -- that may impact your investigation. If you need more samples from after the repo was moved over, I can submit them as they come in -- library upgrade PRs are the most likely to trigger this, due to the rate of submission.

blast-hardcheese · 2021-02-27T17:39:11Z

More recent example after moving the repo to a new org and re-authorizing: https://github.com/guardrail-dev/guardrail/pull/1004/checks?sha=ff99a5dfa20d69e2f8519ca7d6569f5a6ebb63a8

thomasrockhu · 2021-03-02T00:45:06Z

@blast-hardcheese, unless I'm missing something, I couldn't find the above error in that latest link. Apologies if it's really blatant and I missed it, but would you mind sharing the name of the job that failed?

blast-hardcheese · 2021-03-02T05:21:43Z

@thomasrockhu Ack! I didn't realize that re-running the workflow erased the failure, I thought links were stable.

I was able to reproduce the error on an already merged PR, so this should not change:

https://github.com/guardrail-dev/guardrail/pull/1004/checks?check_run_id=2009728188

Sorry about that!

blast-hardcheese · 2021-03-02T21:03:57Z

I don't know if this is related, but if this is a race condition, it very well may be -- we're also experiencing the exact opposite problem, where we successfully report all after_n_builds runs (22 runs) asynchronously to codecov.io for a PR, but the callback never fires, so we never get a response to the required codecov build phase.

A normal run looks like this:

In this example, it was just hung like this (I've since merged the PR, but you can still see that Codecov is not in the reported checks for that PR, meaning the callback didn't fire):

thomasrockhu · 2021-03-03T14:52:30Z

@blast-hardcheese, I think I resolved most of the "Actions workflow run is stale" errors. Let me know if that's not the case.

As for the most recent example, it didn't fire because we had only received 16 builds (and not 22). It's a little challenging to see which build didn't upload properly. Do you happen to know the names of the jobs?

thomasrockhu · 2021-03-03T14:54:40Z

In that particular example, it looks like some/all of the Scala 15 builds didn't run tests or try to upload to Codecov

Yoshanuikabundi · 2021-03-11T07:10:25Z

Hi! Let me know if I should open a new issue for this, but we're having an identical problem. We're planning to reduce the size of our testing matrix in the near future. Will this alleviate the problem? Otherwise, if you could take a look, that'd be great! Thanks :)

blast-hardcheese · 2021-03-16T05:03:23Z

> In that particular example, it looks like some/all of the Scala 15 builds didn't run tests or try to upload to Codecov

You're completely correct. I didn't realize that I had excluded some coverage uploads while also using after_n_builds -- sorry for confusing the issue here.
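To make the mismatch concrete: after_n_builds is set in the repository's codecov.yml, and Codecov holds its notification until it has received that many uploads for the commit. A minimal sketch, assuming the value 22 mentioned earlier in this thread, looks like this:

```yaml
# codecov.yml (sketch; the value 22 comes from the run count mentioned above)
codecov:
  notify:
    # Wait for this many uploads before sending the status/notification.
    # If fewer jobs actually upload (16 in the example above), the
    # notification is never sent and the required check stays pending.
    after_n_builds: 22
```

The number has to match the count of jobs that really upload coverage, so excluding uploads from some matrix entries means lowering it accordingly.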

I haven't seen the "Actions workflow run is stale" error for more than a week at this point, so may I ask what you did on your end? Is this something I could have done via the Codecov UI somehow, and is there a possibility of this resurfacing? I've noticed some other 👍s on the initial issue, so presumably others are running into this as well.

blast-hardcheese · 2021-03-16T05:11:42Z

(Also, thank you again for all your help here!)

laurynas-biveinis · 2021-03-16T06:20:31Z

FWIW I have been running into this as recently as yesterday in my project too - https://github.com/laurynas-biveinis/unodb/runs/2109776375?check_suite_focus=true

In my case there are two flag-separated configurations, which get uploaded in parallel. Perhaps they should be serialized?

thomasrockhu · 2021-03-21T19:21:26Z

@laurynas-biveinis I'm looking into making a patch for this. We should hopefully have that particular edge case fixed this week.

briansmith · 2021-03-23T07:03:31Z

I was having this problem and I found that adding the Codecov token as a GitHub Actions secret helped. However, I'm now getting this error on every merge to my main branch, after the jobs for the same commit on its feature branch (pre-merge) succeed.

briansmith · 2021-03-23T07:05:28Z

Here's my log of the failure: https://github.com/briansmith/ring/runs/2172862556?check_suite_focus=true

ChristophWurst · 2021-03-23T07:39:26Z

> I was having this problem and I found adding the Codecov token as a GitHub Actions secret helped.

Unfortunately, for any GitHub organization with a wider community this risks leaking an access token, so we at Nextcloud dropped our Codecov tokens from the action because the readme says they are not required for public repositories.

Our current mitigation is to report coverage only for a few CI runs, though that can potentially lower the reported coverage as some paths are only triggered by certain tests in our matrix.

briansmith · 2021-04-29T22:05:16Z

> I was having this problem and I found adding the Codecov token as a GitHub Actions secret helped. However, I'm now getting this error on every merge to my main branch, after the jobs for the same commit on its feature branch (pre-merge) succeed.

I was mistaken. Although I did start the process of adding a Codecov token as a secret within my GitHub Actions workflow, I never got around to hooking it up to my use of this action, so it was never used. Thus it had no effect. It seems like Codecov must have addressed the issue here on its end.

In issue #300 I suggest a different solution that doesn't require using a Codecov access token: move the coverage upload out of the jobs that collect the coverage. If only one job submits coverage data to Codecov, then you can avoid the timeout issue described above, AFAICT, and you can also properly minimize permissions on the GitHub token. You'd need to upload the coverage data as an artifact in each job that collects coverage information, download those artifacts in the job that submits the coverage information, and use "needs:" to tell GitHub Actions about the dependency between the jobs.
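A rough sketch of that layout follows, under stated assumptions: the job names, matrix entries, the `./ci/run-tests.sh` script, the `coverage.xml` filename, and all action version tags are hypothetical placeholders, not taken from any project in this thread. Each test job stores its report as an artifact, and a single downstream job collects them and performs the only Codecov upload.

```yaml
name: CI
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        suite: [core, integration-a, integration-b]  # hypothetical suite names
    steps:
      - uses: actions/checkout@v2
      - name: Run tests and write a coverage report
        # Hypothetical script; assumed to write coverage.xml in the workspace.
        run: ./ci/run-tests.sh "${{ matrix.suite }}"
      - name: Store the coverage report as an artifact
        uses: actions/upload-artifact@v2
        with:
          name: coverage-${{ matrix.suite }}
          path: coverage.xml

  upload-coverage:
    needs: test            # runs only after every matrix job in `test` succeeds
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Fetch all coverage artifacts
        # With no `name`, every artifact is downloaded into its own
        # subdirectory under `path`.
        uses: actions/download-artifact@v2
        with:
          path: coverage
      - name: Single upload to Codecov
        uses: codecov/codecov-action@v1
        with:
          directory: coverage   # search this directory for reports
```

With only one upload per commit, an after_n_builds setting should no longer be needed, though that is an inference from the description above rather than something confirmed in this thread.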

thomasrockhu-codecov · 2023-02-28T15:11:15Z

Closing as this no longer seems to be an issue.
