-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clean input data rules #10224
Clean input data rules #10224
Conversation
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todor, even though this PR is in an initial phase, I have made a couple of comments to be considered along the code.
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
7174d9d
to
d052a2a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todor, as we discussed over slack, I'm concerned with the apparent complexity of the workflow data structure object and how complex it became to parse it. I have made a few comments along the code for your consideration. I'll try to provide you with another suggestion (source code) on how that could be accomplish in a cleaner and more maintainable way, at least from my perspective.
src/python/WMCore/MicroService/DataStructs/MSRuleCleanerWflow.py
Outdated
Show resolved
Hide resolved
src/python/WMCore/MicroService/DataStructs/MSRuleCleanerWflow.py
Outdated
Show resolved
Hide resolved
src/python/WMCore/MicroService/DataStructs/MSRuleCleanerWflow.py
Outdated
Show resolved
Hide resolved
Jenkins results:
|
src/python/WMCore/MicroService/DataStructs/MSRuleCleanerWflow.py
Outdated
Show resolved
Hide resolved
Todor, I failed to find our discussion regarding the workflow parsing, so I make my comment here. Yes, the current code looks better IMO. There were other useful comments there that might have been lost though, and further improvements could be done (like, not traversing the workflow description for keys that belong only to the MSRuleCleaner object). Please make representative unit tests to make sure it will work as expected. |
Thanks for commenting @amaltaro , I am working on those useful comments in this very moment. I will continue developing on top of the so proposed change then. |
@amaltaro Alan, I think I addressed all your comments. Please take a look. I expect some unit tests to break and I am about to deal with them in the meantime. |
Jenkins results:
|
43fe9cc
to
e52ac38
Compare
Jenkins results:
|
Jenkins results:
|
Jenkins results:
|
0ec64d3
to
a3e1e6b
Compare
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todor, I have made some more comments along the code.
Regarding MSRuleCleanerWflow module, I have not yet fully reviewed it because I'd like to see first a few unit tests testing those method's functionality. The two already existent are certainly not enough. I'd advice to make method-based unit tests as well, in addition to those testing the whole parsing in one shot.
@todor-ivanov let me try to be very specific on what I think we should have implemented in the unit tests here. So far we only have 2 tests for the MSRuleCleanerWflow module:
if we want to make sure the workflow parsing code works as expected, we should test at least a few more use cases, such as:
can you please implement those tests? Please avoid using those json templates, because whenever we change them, we have to update the test code as well. Instead, simply use the relevalent key/value pairs to exercise that code. |
('RequestTransition', [], list)] | ||
('RequestTransition', [], list), | ||
('IncludeParents', False, bool), | ||
('DataPileup', None, (str, unicode)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And I just noticed these default values don't match the default values in the unit test testTaskChainDefaults
.
DataPileup, MCPileup and ParentDataset are inconsistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I have not yet started paying attention to the final version of the unit tests. I was about to do that in a later stage. Once we agree on the implementation of all the rest. Because we kind of depend on that for the unit tests step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the DataPileup/MCPileup/ParentDataset default value supposed to be a list instead of None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think everything is as it should be, you just forget about this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's precisely my point. This data structure defines the schema of the workflow object used throughout MSRuleCleaner. By just looking at this, I'd expect those keys to have a value equal to None
, not an empty list.
Does the whole logic - besides that explicit code you just referred to - change if you set default value to []
instead of None? If not and we get that for free, then I'd suggest to update the default value to make it consistent with the outcome of this workflow parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should work, but what would happen is that this conversion from a string to list will happen silently in the workflow parser class here: [1] instead of explicitly in the three lines I previously mentioned. I have tested it both ways, and now I do not remember which exactly but there was a corner case in which the silent conversion of the data structure was not preferable. So I preferred in the -wfParser()
method to make the best effort to revert the structure back to the original one as described in the document template (which reflects the values as we expect them from reqmgr), and only then make the explicit conversion to lists only for those I really want and make them obvious in the init()
method of MSRuleCleanerWflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I just tested it again.... And not even a corner case but a rather general one. It tries to create a list of all letters present in all the MCPilup dataset names it can find and explodes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let it be then. Thanks for checking this out Todor.
src/python/WMCore/MicroService/DataStructs/MSRuleCleanerWflow.py
Outdated
Show resolved
Hide resolved
c42aaa6
to
a71ba1b
Compare
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todor, I have made some further comments. I guess they are all minor and I think you can proceed and finish this PR off.
5276ba3
to
d61185e
Compare
Jenkins results:
|
Jenkins results:
|
d61185e
to
19266a7
Compare
Jenkins results:
|
19266a7
to
a8db39d
Compare
Jenkins results:
|
Jenkins results:
|
Hi @amaltaro, Is there anything else you thing needs to be taken care of regarding this PR? |
Hi Todor, I haven't seen another review request, so I assumed you were still working on this PR. I will have another look into it soon then, meanwhile, can you please update the PR description and create those deployment changes everywhere it's needed? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Todor, I think this code is almost ready to go. I made two comments in the code though which I think we need to update the code for.
('RequestTransition', [], list)] | ||
('RequestTransition', [], list), | ||
('IncludeParents', False, bool), | ||
('DataPileup', None, (str, unicode)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the DataPileup/MCPileup/ParentDataset default value supposed to be a list instead of None?
Jenkins results:
|
Fetch globalLocks from wmstatsServer and reqmgr. Add more fileds to MSRuleCleanerWflow && Recursively parsing the workflow description. Filter repeated values for str objects in wfParser. Include ParentDataset field && Convert back to lists && Extend getRucioRules logic to include input datasets. Resolve parents && Execute plineMSTrBlock Replace closure with class in MSRuleclanerWorkflow. Code review changes: Private methods for globalLocks && changes in data structures at MSRulecleanerWorkflow Clean FIXME comments. Code review changes. Suggested Rucioacct to contList map. Suggested Rucioacct to contList map 2. Introduce Resolve Parent Exception. Handle Resolve Parent Exception. Code review changes. Code review changes. Code review changes. Handle None value returned per dataset from findParents().
Unit tests Unit tests
b841790
to
362f576
Compare
Jenkins results:
|
wflow['ParentDataset'] = [parentDataset[childDataset]] | ||
msg = "Found parent %s for input dataset %s in workflow: %s " | ||
self.logger.info(msg, parentDataset, wflow['InputDataset'], wflow['RequestName']) | ||
else: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This else block is no longer needed. But that's a minor and I won't ask you to update this code once again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me. Thanks
@todor-ivanov Todor, given that this PR was merged without the relevant deployment changes in place. I would really appreciate if you can update this PR description pointing to all the deployment/configuration related changes in other repositories. Thanks |
Hi @amaltaro, You have them. |
Fixes #10017
Status
Ready
Description
Fetches the global and parent locks lists from both WMStatsServer and ReqMgr and implements the rest of the logic explained in the issue itself
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
dmwm/deployment#1035
https://gitlab.cern.ch/cmsweb-k8s/services_config/-/merge_requests/72
https://gitlab.cern.ch/cmsweb-k8s/services_config/-/merge_requests/73
External dependencies / deployment changes
No