Skip ErrorHandler cycle if agent configuration fails to be retrieved #10817

Merged
merged 1 commit into from
Sep 23, 2021

Conversation

amaltaro
Contributor

Fixes #10778

Status

ready

Description

If maxRetries is not a dictionary, the component failed to retrieve the agent configuration from the central couch reqmgr_auxiliary db. In such cases, which have been happening sporadically, we should skip the component cycle to avoid comparing values of different data types (Python3 no longer allows such ordering comparisons and raises a TypeError).
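A minimal sketch of the guard described above, not the actual ErrorHandler code. The function names, the job tuples, and the `"default"` key are hypothetical, and the sketch assumes the non-retrieved value is something like `None`:

```python
def should_skip_cycle(max_retries):
    """Return True when maxRetries is not a dict, i.e. the agent
    configuration was presumably not retrieved."""
    return not isinstance(max_retries, dict)


def exceeded_retries(retry_count, job_type, max_retries):
    """Hypothetical helper: compare a retry count with the per-type limit."""
    limit = max_retries.get(job_type, max_retries.get("default", 0))
    return retry_count >= limit


def run_cycle(max_retries, jobs):
    """Hypothetical component cycle.

    Returns the names of jobs that exhausted their retries,
    or None when the cycle is skipped.
    """
    if should_skip_cycle(max_retries):
        # Skip instead of comparing an int against a non-dict value,
        # which raises TypeError in Python3.
        return None
    return [name for name, job_type, count in jobs
            if exceeded_retries(count, job_type, max_retries)]
```

Without the guard, an ordering comparison such as `3 >= None` raises a TypeError in Python3, which is the kind of crash this change avoids.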

We still need to think about an alarm system for all these components, and which conditions should trigger the alarms, but that is future work to be tracked in separate GH issues.

Is it backward compatible (if not, which system it affects?)

YES

Related PRs

None

External dependencies / deployment changes

None

@cmsdmwmbot

Jenkins results:

  • Python2 Unit tests: failed
    • 5 new failures
    • 1 changes in unstable tests
  • Python3 Unit tests: failed
    • 5 new failures
    • 2 changes in unstable tests
  • Python2 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 26 comments to review
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 1 warnings
    • 30 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 16 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12469/artifact/artifacts/PullRequestReport.html

add back setupComponentParam to the init method
@cmsdmwmbot

Jenkins results:

  • Python2 Unit tests: succeeded
  • Python3 Unit tests: succeeded
    • 1 changes in unstable tests
  • Python2 Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 1 warnings
    • 26 comments to review
  • Python3 Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 1 warnings
    • 30 comments to review
  • Pylint py3k check: succeeded
  • Pycodestyle check: succeeded
    • 16 comments to review

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/12470/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor Author

I'm merging it so that it goes into a new tag, but comments are still welcome!

@amaltaro amaltaro merged commit 9767206 into dmwm:master Sep 23, 2021
Contributor

@todor-ivanov todor-ivanov left a comment

looks good to me

Projects
None yet
Development

Successfully merging this pull request may close these issues.

ErrorHandler crash when it fails to pull the agent configuration from reqmgr_aux
3 participants