Add CreationFailure exit code #9533

Merged
merged 1 commit into from
Jun 17, 2020

goughes
Contributor

@goughes goughes commented Feb 6, 2020

Fixes #9531

Status

Not tested

Description

Add a missing exit code to WMExceptions.py:

99305: "Found single input file with too many events to be processed in a pilot lifetime"
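As a rough illustration of what such an entry looks like, here is a minimal sketch of an exit-code table with a lookup helper. The dictionary name `JOB_ERROR_CODES` and the helper `describe_exit_code` are hypothetical and are not taken from WMExceptions.py; only the code 99305 and its message come from this PR description.

```python
# Hypothetical sketch: the table name and the helper are assumptions made
# for illustration. Only the 99305 entry is taken from this PR description.
JOB_ERROR_CODES = {
    99305: "Found single input file with too many events "
           "to be processed in a pilot lifetime",
}


def describe_exit_code(code, table=JOB_ERROR_CODES):
    """Return the message registered for an exit code, or a generic fallback."""
    return table.get(code, "Unknown exit code %s" % code)


if __name__ == "__main__":
    print(describe_exit_code(99305))
    print(describe_exit_code(12345))
```

Keeping the message in one table means every component that reports the code shows the same text.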

Is it backward compatible (if not, which system does it affect)?

YES

Related PRs

NA

External dependencies / deployment changes

NA

@goughes
Contributor Author

goughes commented Feb 6, 2020

@amaltaro Any need to update the defaultMsg here to reflect the new message I put in WMExceptions?

defaultMsg = "There is a condition which assures that this job will fail if it's submitted"

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: succeeded
    • 3 comments to review
  • Pycodestyle check: succeeded
    • 8 comments to review
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/9733/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

amaltaro commented Feb 6, 2020

Interesting, I'm curious to know what that condition would be. Could you please try to find where it's supposed to get triggered? Otherwise it makes sense to me to also update that default msg.

@goughes
Contributor Author

goughes commented Jun 17, 2020

@amaltaro I looked around for a case where failedReason would not be set when creating a failed job. No guarantees my search was exhaustive, but 30 mins seemed like enough time to spend tracing where this can be set.

Hence, I updated defaultMsg to reflect the error message from the exception.
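For readers following the thread, the fallback being discussed can be sketched roughly as below. This is not the WMCore code: the function name `creation_failure_reason` and the dict-shaped job are assumptions for illustration, and the default text is the original message quoted earlier in this conversation.

```python
# Hypothetical sketch of the fallback under discussion. A job is modelled
# as a plain dict; the real job object and call site may differ.
DEFAULT_MSG = ("There is a condition which assures that this job "
               "will fail if it's submitted")


def creation_failure_reason(job, default_msg=DEFAULT_MSG):
    """Prefer the job's own failedReason; use the default only if it is missing or empty."""
    reason = job.get("failedReason")
    return reason if reason else default_msg


if __name__ == "__main__":
    print(creation_failure_reason({"failedReason": "Too many events in one file"}))
    print(creation_failure_reason({}))
```

If failedReason is always set on jobs that failed at creation, the default is never shown in practice, which is why its exact wording is a low-risk change.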

@cmsdmwmbot

Jenkins results:

  • Unit tests: failed
    • 1 new failures
  • Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 4 warnings
    • 11 comments to review
  • Pycodestyle check: succeeded
    • 28 comments to review
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10131/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 1 warnings and errors that must be fixed
    • 4 warnings
    • 11 comments to review
  • Pycodestyle check: succeeded
    • 28 comments to review
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10132/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor

Yes, it looks like failedReason is always present on those jobs that failed to be created:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/JobSplitting/JobFactory.py#L139

Erik, please request a review via GH Reviewers once you're happy with your changes.

Contributor

@todor-ivanov todor-ivanov left a comment

Looks good to me

Contributor

@amaltaro amaltaro left a comment

Thanks Erik! Final test will come before we release the next production WMA release.

@amaltaro amaltaro merged commit 6db0176 into dmwm:master Jun 17, 2020

Successfully merging this pull request may close these issues.

Add CreationFailure exit code to the list of exceptions