Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI: Increase timeout for test-install test suite. #4697

Closed
wants to merge 2 commits into from

Conversation

csadorf
Copy link
Contributor

@csadorf csadorf commented Feb 1, 2021

The current timeout of 35min is regularly exceeded resulting in test failures. This change increases it to 55min.

The current timeout of 35min is regularly exceeded resulting in test failures. This change increases it to 55min.
@codecov
Copy link

codecov bot commented Feb 1, 2021

Codecov Report

Merging #4697 (2b438ce) into develop (48fa475) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #4697      +/-   ##
===========================================
+ Coverage    79.62%   79.63%   +0.01%     
===========================================
  Files          484      484              
  Lines        35831    35831              
===========================================
+ Hits         28527    28529       +2     
+ Misses        7304     7302       -2     
Flag Coverage Δ
django 73.87% <ø> (ø)
sqlalchemy 73.06% <ø> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiida/engine/daemon/client.py 75.65% <0.00%> (+1.04%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 48fa475...2b438ce. Read the comment docs.

@csadorf
Copy link
Contributor Author

csadorf commented Feb 1, 2021

There appears to be some issue with the pre-commit upstream. I'll just rebase and rerun this later today.

@chrisjsewell
Copy link
Member

Hopefully this helps, but this is not really the underlying cause of the issue. It is because we have too many test jobs, which ends up causing some to go very slow (or even hang)

@csadorf
Copy link
Contributor Author

csadorf commented Feb 1, 2021

Hopefully this helps, but this is not really the underlying cause of the issue. It is because we have too many test jobs, which ends up causing some to go very slow (or even hang)

This is primarily to address the failure of scheduled nightly runs which should not really be affected by our general workload.

@csadorf
Copy link
Contributor Author

csadorf commented Feb 1, 2021

As a related note, we could host our own runner which might alleviate the situation.

@chrisjsewell
Copy link
Member

chrisjsewell commented Feb 1, 2021

which should not really be affected by our general workload.

Oh it is definitely to do with the work load; the pytest suite only took < 12 minutes to run for all other jobs in the matrix, but the failing job got killed at 31 minutes: https://github.com/aiidateam/aiida-core/runs/1803637702?check_suite_focus=true. I have had this happen numerous times in the last few weeks.

No problem to increase the timeout, but I would also suggest for this workflow, you can now drop the python 3.6 jobs, inline with #4691

As a related note, we could host our own runner which might alleviate the situation.

I was looking into that for replacing Jenkins, but it has a number of security implications, and in the documentation it basically suggests to only use your own runner on private repositories

@csadorf
Copy link
Contributor Author

csadorf commented Feb 1, 2021

which should not really be affected by our general workload.

Oh it is definitely to do with the work load; the pytest suite only took < 12 minutes to run for all other jobs in the matrix, but the failing job got killed at 31 minutes: https://github.com/aiidateam/aiida-core/runs/1803637702?check_suite_focus=true. I have had this happen numerous times in the last few weeks.

What I mean is that the runners reserved for this repository should be exclusively used for the scheduled test-install workflow in the middle night and it shouldn't be affected by any other CI work.

No problem to increase the timeout, but I would also suggest for this workflow, you can now drop the python 3.6 jobs, inline with #4691

I am planning on tackling that as part of the upcoming coding week.

As a related note, we could host our own runner which might alleviate the situation.

I was looking into that for replacing Jenkins, but it has a number of security implications, and in the documentation it basically suggests to only use your own runner on private repositories

Oh wow, that's really bad. I hope that they will work on that.

@csadorf csadorf marked this pull request as ready for review February 1, 2021 15:12
@csadorf csadorf requested a review from chrisjsewell February 1, 2021 15:12
Copy link
Member

@chrisjsewell chrisjsewell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me know when you have fixed the pre-commit 😄

@chrisjsewell
Copy link
Member

aiida/tools/graph/graph_traversers.py:212: error: Incompatible types in assignment (expression has type "float", variable has type "Optional[int]") [assignment]
Found 1 error in 1 file (checked 55 source files)

Started seeing this in other PRs as well (cc @ltalirz), possible dependency version change?

@chrisjsewell
Copy link
Member

It appears numpy.inf has changed from being identified as an int to a float 🤷

@chrisjsewell
Copy link
Member

chrisjsewell commented Feb 1, 2021

Fixing in #4700

@ltalirz
Copy link
Member

ltalirz commented Feb 1, 2021

yes, numpy 1.20 has been released https://numpy.org/doc/1.20/release/1.20.0-notes.html

already broke pymatgen ;-)

@chrisjsewell
Copy link
Member

already broke pymatgen ;-)

No surprise there 😆

@csadorf
Copy link
Contributor Author

csadorf commented Feb 12, 2021

I believe this problem is actually related to #4731 which I would like to prioritize to resolve.

@csadorf csadorf closed this Feb 12, 2021
@chrisjsewell chrisjsewell deleted the ci/increase-ga-test-install-tests-timeout branch April 27, 2021 11:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants