
Update PileupFetcher to use Rucio to build the list of replicas and block location #9894

Closed
amaltaro opened this issue Sep 7, 2020 · 6 comments · Fixed by #9930

Comments

@amaltaro
Contributor

amaltaro commented Sep 7, 2020

Impact of the new feature
WMAgent

Is your feature request related to a problem? Please describe.
PileupFetcher uses PhEDEx to build a dictionary of:

  • blocks and their current location
  • blocks and a list of all their files/replicas

As we decommission PhEDEx, the same functionality needs to be ported to Rucio.

Describe the solution you'd like
Support Rucio within PileupFetcher, such that the pileup dataset location lookup can be performed through Rucio.
This means we need to implement:

  • given a container name, return all its blocks and their current location (where data available and locked by WMCore)
  • given a container name, return all its blocks and their files/replicas

The piece of code building the list of files needs to be reviewed, though. We fetch the list from DBS, which then gets overwritten by the list of files retrieved from PhEDEx (likely to take care of invalid/no-longer-existent files).
It's also not clear to me why we need to know the number of events in each block.
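The two lookups above can be sketched against the Rucio client. This is a minimal illustration, not the WMCore implementation: `list_content()` and `list_files()` are real `rucio.client.Client` methods, but the wrapper function, the `"cms"` scope default, and the fake client are assumptions for the example.

```python
# Hedged sketch: given a container name, return each block (a Rucio
# "dataset" in CMS terms) and the LFNs of its files. Only the Rucio client
# method names are real; everything else is illustrative.

def blocks_and_files(client, container, scope="cms"):
    """Map each block in `container` to the list of its file LFNs."""
    blocks = {}
    for did in client.list_content(scope, container):
        # a CMS block maps to a Rucio dataset; skip files/sub-containers
        if did.get("type", "").upper() != "DATASET":
            continue
        blocks[did["name"]] = [f["name"]
                               for f in client.list_files(scope, did["name"])]
    return blocks

# Minimal stand-in for a Rucio client, for illustration only
class FakeRucio:
    def list_content(self, scope, name):
        return [{"type": "DATASET", "name": "/PU/GEN-SIM#block1"}]

    def list_files(self, scope, name):
        return [{"name": "/store/mc/f1.root"}, {"name": "/store/mc/f2.root"}]

result = blocks_and_files(FakeRucio(), "/PU/GEN-SIM")
```

With a real `rucio.client.Client()` in place of the fake, the same loop would resolve an actual container; `list_files()` replies also carry an `events` field per file, which may help with the per-block event count question above.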

Describe alternatives you've considered
Not resolving the list of files in Rucio, relying instead on that information from DBS (and maybe filtering by validOnly).
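This alternative can be sketched as a single DBS call. `DbsApi.listFileArray(block_name=..., validFileOnly=1)` is the DBS3 client call that drops invalid files server-side; the helper function and the fake client are illustrative assumptions.

```python
# Hedged sketch of the alternative: skip Rucio for file resolution and ask
# DBS for valid files only. Only listFileArray(..., validFileOnly=1) is the
# real DBS3 client API; the rest is illustration.

def valid_files_from_dbs(dbs, block):
    """Return the LFNs of valid files in `block`, straight from DBS."""
    return [f["logical_file_name"]
            for f in dbs.listFileArray(block_name=block, validFileOnly=1)]

# Stand-in for dbs.apis.dbsClient.DbsApi, for illustration only
class FakeDbs:
    def listFileArray(self, block_name, validFileOnly):
        # a real DbsApi would filter invalid files when validFileOnly=1
        return [{"logical_file_name": "/store/mc/good.root"}]

valid = valid_files_from_dbs(FakeDbs(), "/PU/GEN-SIM#block1")
```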

Additional context
none

@amaltaro
Contributor Author

I'm starting to think that it would be easier (and compatible with our wishes for the future) to resolve the pileup location differently than how we resolve all the other data used in central production.

So, instead of

  • listing all the replication rules for the container and all its blocks (list_replication_rules)
  • resolving the rse expressions
  • resolving the multi RSE rules with single copy
  • and finally listing the final dataset locks (and comparing them against the rules), via get_dataset_locks

I think we could simply resolve the current block location (no matter who is responsible for locking the data) with a list_dataset_replicas_bulk call, as done in:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMSpec/Steps/Fetchers/PileupFetcher.py#L96
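The simpler option boils down to one bulk call. In this sketch, `list_dataset_replicas_bulk()` is the real Rucio client method; the reply keys used here (`name`, `rse`, `available_length`, `length`), the completeness check, and the fake client are assumptions for illustration.

```python
# Hedged sketch: resolve current block locations in one bulk call,
# regardless of which account (if any) locked the data.

def current_block_locations(client, dids):
    """Map block name -> set of RSEs holding a complete replica."""
    locations = {}
    for rep in client.list_dataset_replicas_bulk(dids):
        # assumption: a replica is complete when all files are available
        if rep.get("available_length") == rep.get("length"):
            locations.setdefault(rep["name"], set()).add(rep["rse"])
    return locations

# Stand-in for a Rucio client, for illustration only
class FakeRucio:
    def list_dataset_replicas_bulk(self, dids):
        return [
            {"name": "/PU#b1", "rse": "T1_US_FNAL_Disk",
             "available_length": 10, "length": 10},
            {"name": "/PU#b1", "rse": "T2_CH_CERN",
             "available_length": 7, "length": 10},
        ]

locs = current_block_locations(FakeRucio(),
                               [{"scope": "cms", "name": "/PU#b1"}])
```

Incomplete replicas (here the partially transferred copy at T2_CH_CERN) are dropped, so only RSEs with the full block survive.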

I guess pileup won't be placed by anyone other than the DM team (together with CompOps) or MSTransferor (well, WMAgent as well, via the block-level locks created while data is being produced).

@nsmith- would you have any preference here? Or perhaps you can see a reason not to do something that I fail to?

@amaltaro
Contributor Author

Actually, I implemented the first option (checking locks and availability).
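The lock-based option can be sketched as follows: keep only RSEs where the block transfer has completed (lock state "OK") and the lock belongs to an account we trust to manage pileup. `get_dataset_locks()` is the real Rucio client call; the account name, the lock dictionary keys, and the fake client are illustrative assumptions.

```python
# Hedged sketch of the lock-based approach (checking locks and
# availability). Only get_dataset_locks() is the real Rucio client API;
# the account list and reply keys are assumptions.

def locked_block_locations(client, scope, block,
                           accounts=("wmcore_transferor",)):
    """RSEs where `block` is fully available and locked by `accounts`."""
    return sorted({lock["rse"]
                   for lock in client.get_dataset_locks(scope, block)
                   if lock["state"] == "OK"
                   and lock.get("account") in accounts})

# Stand-in for a Rucio client, for illustration only
class FakeRucio:
    def get_dataset_locks(self, scope, name):
        return [
            {"rse": "T1_US_FNAL_Disk", "state": "OK",
             "account": "wmcore_transferor"},
            {"rse": "T2_DE_DESY", "state": "REPLICATING",
             "account": "wmcore_transferor"},
            {"rse": "T2_CH_CERN", "state": "OK",
             "account": "someone_else"},
        ]

rses = locked_block_locations(FakeRucio(), "cms", "/PU#b1")
```

Both the still-replicating copy and the copy locked by an unrelated account are filtered out, which is the difference from the plain replica lookup.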

@nsmith-

nsmith- commented Sep 22, 2020

I think at some point in the future we would like to not have any manual placement of pileup required. In that sense, probably the original option is better long-term.

@amaltaro
Contributor Author

@nsmith- sorry Nick, are you referring to a simple block location lookup as the original option? Or to the one which considers locks made by the transferor account?

@amaltaro
Contributor Author

If we do not want any manual placement, I understand that MSTransferor would be in charge of the pileup data placement as well. Given that the PR has been tested, I'm proceeding with the lock-based approach. I'm happy to change it in the near future, though, if we find it wasn't the best choice. Thanks

@nsmith-

nsmith- commented Sep 23, 2020

Yeah, I mean lock-based; sorry it was unclear from the comment. I think you already have what is best in the code, then.
