MSOutput: support custodial/tape data placement #9911

Merged (1 commit) on Sep 13, 2020

Conversation

amaltaro
Contributor

@amaltaro amaltaro commented Sep 12, 2020

Fixes #9842

Status

ready

Description

This PR enables MSOutput to also perform custodial (tape) data placement. The features in place are:

  • tape dataset placement supports both PhEDEx and Rucio;
  • dataset placement is skipped if the tier is blacklisted in the MicroService configuration;
  • dataset placement is skipped for tiers blacklisted in the Unified configuration;
  • transfer request / rule creation is always made against a single RSE/node (with a single copy);
    • the same RSE endpoint is used by the other datasets within the same workflow;
  • a workflow is only marked as completely done once all the Disk + Tape placements have succeeded;
  • dry-run mode is supported as well;
  • the service summary report is unified (the data structures were the same anyway);
  • the local caches are updated for both MSOutput threads (Consumer and Producer);
  • removed "&cms_type=real&rse_type=DISK" from the RelVal RSE expression. We always set the specific RSE name anyway, so that expression makes no difference;
  • FIXME: a tape destination placeholder was created (and hardcoded to FNAL). To be fixed in "MSOutput: decide which custodial site to use for data placement" #9841.
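
As a rough illustration of the placement rules listed above (tier blacklists, a single tape RSE shared by all datasets of a workflow, single copy, dry-run flag, and completion only after Disk + Tape succeed), here is a minimal Python sketch. It is not the code in this PR: the function names, the dictionary layout, and the RSE name used in the usage note are hypothetical and chosen only for illustration.

```python
def tier_of(dataset):
    """Return the data tier, taken here as the last path component of the dataset name."""
    return dataset.rstrip("/").split("/")[-1]


def plan_tape_placement(datasets, tape_rse, ms_blacklist, unified_blacklist, dry_run=False):
    """Build one single-copy tape request per dataset whose tier is not blacklisted.

    Hypothetical helper: the same tape_rse is reused for every dataset
    in the workflow, mirroring the "single RSE/node" rule in the description.
    """
    skipped_tiers = set(ms_blacklist) | set(unified_blacklist)
    plan = []
    for dataset in datasets:
        if tier_of(dataset) in skipped_tiers:
            continue  # tier blacklisted in either configuration: skip placement
        plan.append({"dataset": dataset, "rse": tape_rse, "copies": 1, "dry_run": dry_run})
    return plan


def workflow_is_done(disk_ok, tape_ok):
    """A workflow counts as done only when both Disk and Tape placement succeeded."""
    return disk_ok and tape_ok
```

For example, calling `plan_tape_placement(["/Prim/Proc-v1/AODSIM", "/Prim/Proc-v1/DQMIO"], "T1_US_FNAL_Tape", ["DQMIO"], [])` would yield a single request for the AODSIM dataset; the RSE name here is only a stand-in for the hardcoded FNAL placeholder mentioned in the FIXME.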

Is it backward compatible (if not, which system it affects?)

no, new feature

Related PRs

none

External dependencies / deployment changes

none

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 10 warnings
    • 16 comments to review
  • Pycodestyle check: succeeded
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10423/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 10 warnings
    • 16 comments to review
  • Pycodestyle check: succeeded
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10424/artifact/artifacts/PullRequestReport.html

@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 4 warnings and errors that must be fixed
    • 11 warnings
    • 19 comments to review
  • Pycodestyle check: succeeded
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10425/artifact/artifacts/PullRequestReport.html

assign tape RSE before looping over output datasets

use custodial flag for tape subscriptions; exact RSEs for RelVals

fix logging
@cmsdmwmbot

Jenkins results:

  • Unit tests: succeeded
  • Pylint check: failed
    • 3 warnings and errors that must be fixed
    • 11 warnings
    • 19 comments to review
  • Pycodestyle check: succeeded
  • Python3 compatibility checks: succeeded

Details at https://cmssdt.cern.ch/dmwm-jenkins/view/All/job/DMWM-WMCore-PR-test/10426/artifact/artifacts/PullRequestReport.html

@amaltaro
Contributor Author

Basic tests are done, including some real calls against DDM and PhEDEx (the Rucio account still needs to be sorted out). So, it's good enough to go.

@amaltaro amaltaro merged commit 02c4af2 into dmwm:master Sep 13, 2020

@nsmith- nsmith- left a comment

I took a look and it makes sense to me

@@ -301,11 +288,8 @@ def makeSubscriptions(self, workflow):

# overwrite the current DiskDestination by something that will work for DDM
So unlike Unified, the output disk site choice is resolved here? As I understand it, Unified just forwards the request to dynamo (with a generic or null list of sites) in https://github.com/CMSCompOps/WmAgentScripts/blob/master/Unified/closor.py#L585 (perhaps @dr-stringfellow can confirm). If so, we should review how the disk site is chosen.

Contributor Author

Actually, the disk destination is decided in this block:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/Unified/MSOutput.py#L755-L791
Note, though, that the destination is already in the format of an RSE expression (Rucio compliant).

In this section of the code, we are making data transfers against DDM, so we need to massage the list of sites a bit (for RelVals, it does not change). For standard workflows, we simply pass T1 Disk and T2s, and leave the final decision to DDM.
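
To make the distinction concrete, here is a minimal sketch of how the site list could be prepared for DDM. It is not the MSOutput implementation: the function name, its arguments, and the assumption that sites can be classified by `T1_..._Disk` and `T2_...` name patterns are all hypothetical.

```python
def sites_for_ddm(destination, is_relval, known_sites):
    """Return the list of candidate sites to hand over to DDM.

    Hypothetical helper. RelVal workflows keep their exact destination;
    standard workflows get every T1 Disk and T2 site, so that DDM makes
    the final choice. Site classification by name is an assumption.
    """
    if is_relval:
        return [destination]  # unchanged for RelVals
    return sorted(
        site
        for site in known_sites
        if (site.startswith("T1_") and site.endswith("_Disk")) or site.startswith("T2_")
    )
```

With a known-site list containing both tape and T3 endpoints, only the T1 Disk and T2 entries would be passed on for a standard workflow, while a RelVal destination is returned as-is.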

Development

Successfully merging this pull request may close these issues.

MSOutput: perform custodial data placement
3 participants