Consider rules in INJECT/REPLICATING state for pileup location at MSTransferor level #9969

Closed
amaltaro opened this issue Oct 8, 2020 · 5 comments · Fixed by #10041

@amaltaro
Contributor

amaltaro commented Oct 8, 2020

Impact of the new feature
ReqMgr2MS - MSTransferor

Is your feature request related to a problem? Please describe.
To avoid over-replicating the pileup dataset (where the campaign configuration allows it), we should treat pileup container rules in state INJECT (rule created but not yet considered by Rucio) or REPLICATING as valid locations for the pileup container.

Describe the solution you'd like
The current MSTransferor logic is here:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/MicroService/Tools/PycurlRucio.py#L137

and we should add the RSEs from rules in state INJECT or REPLICATING (are there any other states to consider?). Of course, we should log such cases as well.

Besides accepting those as data locations, we would have to keep their rule id so that MSMonitor can keep evaluating it and only release the workflow once that rule has been satisfied. Passing this rule id from the listReplicationRules function all the way down to where transferIDs are persisted is going to be a challenge...
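
A rough sketch of the idea, not the actual PycurlRucio code: it assumes each rule is a dict with `id`, `state` and `rse_expression` keys, and that `rse_expression` is a plain RSE name. The function name `pileup_locations` and the `PENDING_STATES` constant are made up for illustration.

```python
import logging

logger = logging.getLogger(__name__)

# States that this issue proposes to accept as a (pending) pileup location.
PENDING_STATES = {"INJECT", "REPLICATING"}


def pileup_locations(rules):
    """Return (rses, pending_rule_ids) for a list of rule dicts.

    Assumption: every rule has "id", "state" and "rse_expression" keys,
    and "rse_expression" is a single RSE name rather than an expression.
    """
    rses = set()
    pending_rule_ids = []
    for rule in rules:
        state = rule["state"]
        if state == "OK":
            rses.add(rule["rse_expression"])
        elif state in PENDING_STATES:
            # Accept the RSE, but remember the rule id so that it can
            # still be monitored until the rule is satisfied.
            rses.add(rule["rse_expression"])
            pending_rule_ids.append(rule["id"])
            logger.info("Accepting rule %s in state %s at %s",
                        rule["id"], state, rule["rse_expression"])
        # Any other state is ignored in this sketch.
    return rses, pending_rule_ids
```

Returning the pending rule ids next to the RSEs is one way to make them available to the caller; how they would travel further down in MSTransferor is left open here.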

Describe alternatives you've considered
If we could implement it assuming MSTransferor will only run in Rucio mode, it would be much simpler...

Additional context
none

@klannon

klannon commented Oct 8, 2020

Why wouldn't we assume Rucio only? PhEDEx should be irrelevant in 2 weeks or less, right?

@amaltaro
Contributor Author

amaltaro commented Oct 8, 2020

I hope so. However, I fear that we might find a BIG problem which would make us switch back to PhEDEx. If you think I'm too paranoid, I'd be very pleased to implement this considering only Rucio ;)

@klannon

klannon commented Oct 8, 2020

Well, if implementing it Rucio only gets it done quicker and makes it available sooner, then that seems like a win. If something really bad happens, then you just need to go back and add something to cover PhEDEx, but even that would be just a temporary situation. I'd say let's plan for success, but be prepared to act if a problem comes up.

@amaltaro
Contributor Author

Just an update here: I noticed that we already consider rules in INJECT/REPLICATING state. However, we do not keep the rule_id or persist it in the transfer document (to allow MSMonitor to monitor its progress).

So, what needs to be provided here is a way to propagate such rule_ids through the MSTransferor code so that they can be persisted in the transfer document, which isn't straightforward, but can be done...
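
To make the goal concrete, here is a minimal sketch of what persisting rule ids and checking them later could look like. The transfer document layout (a `transfers` list of records with `dataset` and `transferIDs`) is an assumption for illustration, as are the helper names `attach_rule_ids` and `all_rules_ok` and the `get_rule_state` callable; none of this is the real MSTransferor or MSMonitor code.

```python
def attach_rule_ids(transfer_doc, dataset, rule_ids):
    """Record rule ids for `dataset` in the transfer document (in place).

    Assumed layout (hypothetical):
    {"transfers": [{"dataset": str, "transferIDs": [str, ...]}, ...]}
    """
    for record in transfer_doc["transfers"]:
        if record["dataset"] == dataset:
            # Keep existing ids, append only the new ones, preserve order.
            for rule_id in rule_ids:
                if rule_id not in record["transferIDs"]:
                    record["transferIDs"].append(rule_id)
            return transfer_doc
    # No record for this dataset yet: create one.
    transfer_doc["transfers"].append(
        {"dataset": dataset, "transferIDs": list(rule_ids)})
    return transfer_doc


def all_rules_ok(transfer_doc, get_rule_state):
    """True when every persisted rule id is reported in state "OK".

    `get_rule_state` is a hypothetical callable: rule_id -> state string.
    """
    return all(get_rule_state(rule_id) == "OK"
               for record in transfer_doc["transfers"]
               for rule_id in record["transferIDs"])
```

With something like this, the monitoring side would only need the transfer document and a way to look up a rule state; the hard part described above, carrying the ids through the MSTransferor call chain, is not addressed by the sketch.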

@amaltaro
Contributor Author

After having a closer look at the code, this would add excessive complication to the code in its current state. We won't release GQE anyway unless the pileup has container-level rules in state OK. The only difference is that we no longer keep track of the pileup transfer progress (in the transfer doc), but its location is updated every x hours in global workqueue.

To implement what is requested in this GH issue, significant changes must be made to the MSTransferor logic, taking into account the Rucio model for input data placement (the current logic is mostly PhEDEx-driven).
