-
Notifications
You must be signed in to change notification settings - Fork 108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enforce dataset_lifetime column to be integer and not null #11115
Conversation
Jenkins results:
|
Hi @amaltaro, Alan please consider the following changes in this PR in order to fix an issue in the deployment of a T0 agent during the creation of the oracle tables as can been seen below
I deployed a replay with WMCore 2.0.2.patch2 + PR11115 + Commit63d99e and the replay finished without issues. I checked the container rules created by T0 for subscriptions to disk and the rules have the proper lifetime. |
'phedex_group': phedex_group, | ||
'delete_blocks': delete_blocks, | ||
'dataset_lifetime': subscriptionInfo['DatasetLifetime']}) | ||
bind = {'id': datasetID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be more protective if you explicitly cast data-types for each binding value, e.g.
bind = {'id': int(datasetID), 'site': str(site), ...}
Doing this way you'll ensure that proper data-type is passed to bind dictionary which you later pass to ORACLE DB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good suggestion indeed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm updating this code with a few more safe checks for None. But these suggestions might actually cause more problems than resolve them. For instance, casting None to Integer will crash, casting a byte string to string will make things even worse.
In addition to that, dataset_id
is already protected by the database schema and this code is in use for many years. Different than dataset_lifetime
, which is a new feature.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way how you implemented those additional checks do look sound to me. Thanks @amaltaro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing in it the proper way @amaltaro. As of the current implementation I basically did second the previous two comments. But I read the PR status as not-tested
so I believe you would have those fixed during testing anyway.
'phedex_group': phedex_group, | ||
'delete_blocks': delete_blocks, | ||
'dataset_lifetime': subscriptionInfo['DatasetLifetime']}) | ||
bind = {'id': datasetID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good suggestion indeed.
@@ -49,7 +49,7 @@ def __init__(self, logger = None, dbi = None, params = None): | |||
subscribed INTEGER DEFAULT 0, | |||
phedex_group VARCHAR(100), | |||
delete_blocks INTEGER, | |||
dataset_lifetime INTEGER DEFAULT 0, | |||
dataset_lifetime INTEGER NOT NULL DEFAULT 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@amaltaro I think @jhonatanamado's comment needs to be addressed here too.
@jhonatanamado thank you very much for testing this in a replay! I fixed the order of those constraints in my second commit. It also has a fix further data type checks, as requested by Valentin and Todor. Please have another look at it. |
Jenkins results:
|
I confirm that Oracle-based Jenkins tests are happier now, while the previous order of constraints was causing 92 new tests to fail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the follwo up @amaltaro !
The code looks good.
'phedex_group': phedex_group, | ||
'delete_blocks': delete_blocks, | ||
'dataset_lifetime': subscriptionInfo['DatasetLifetime']}) | ||
bind = {'id': datasetID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way how you implemented those additional checks do look sound to me. Thanks @amaltaro
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you want to fix it right, please add validation to all passed values, I provided concrete examples to each individual columns.
'delete_blocks': delete_blocks, | ||
'dataset_lifetime': subscriptionInfo['DatasetLifetime']}) | ||
bind = {'id': datasetID, | ||
'site': site, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you protect site to satisfy to correct regexp. The schema only require it to be not null, so someone can pass "ABC" and it will be fine, but this is not what our sites are. Please put protection to validate this input value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already validated upstream, at the request spec level, e.g.:
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMSpec/StdSpecs/StdBase.py#L1135
'dataset_lifetime': subscriptionInfo['DatasetLifetime']}) | ||
bind = {'id': datasetID, | ||
'site': site, | ||
'custodial': 0 if custodialFlag is None else custodialFlag, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
custodialFlag here can be passed as -1
, it is still and integer but I doubt it is correct value, please validate it appropriately to fall to a specific range, e.g. be only 1 or 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like the custodialFlag
is actually either True or False, as defined in this very same code. Unsure how was/is getting casted to integer though...
'site': site, | ||
'custodial': 0 if custodialFlag is None else custodialFlag, | ||
'auto_approve': 1 if site in subscriptionInfo['AutoApproveSites'] else 0, | ||
'move': 0 if isMove is None else isMove, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isMove requires validation too, someone can pass -1
and code will not spot it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm actually removing this check on isMove
because it's already validated a few lines above.
'custodial': 0 if custodialFlag is None else custodialFlag, | ||
'auto_approve': 1 if site in subscriptionInfo['AutoApproveSites'] else 0, | ||
'move': 0 if isMove is None else isMove, | ||
'priority': subscriptionInfo['Priority'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you make proper default here, e.g. subscriptionInfo.get('Priority', "some-default-value")
. And, the value should be validated too to satisfy allowed values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Request priority default comes from the upper spec file, here (and data type is also validated in there):
https://github.com/dmwm/WMCore/blob/master/src/python/WMCore/WMSpec/StdSpecs/StdBase.py#L1025
Having default values defined in multiple places is just source for a future error.
'auto_approve': 1 if site in subscriptionInfo['AutoApproveSites'] else 0, | ||
'move': 0 if isMove is None else isMove, | ||
'priority': subscriptionInfo['Priority'], | ||
'phedex_group': phedex_group, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if you really need it anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it's no longer relevant. Actually, a good chunk of this is no longer needed and we need to re-purpose it according to the Rucio convention. To be addressed here: #9639
'move': 0 if isMove is None else isMove, | ||
'priority': subscriptionInfo['Priority'], | ||
'phedex_group': phedex_group, | ||
'delete_blocks': delete_blocks} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
requires validation too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already covered a few lines above and in accordance with the database schema. I could be wrong, but AFAIK it's more performant to have a column with None/null than with an integer 0. So I'm in favor of actually not changing this 1 or None.
@vkuznet in addition to the follow up comments I made along the code, I also pushed another commit with further changes. I will squash it once review is over. Please have another look at your convenience, but keep in mind that much of this actually comes from upstream - request level - and most of that is already validated and data type is checked as well. I'd like to avoid adding multiple layers of data validation whenever possible. |
Jenkins results:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you mentioned several times that some values are validated upstream please put appropriate comment to the code about these checks. It would be useful to know why some values you check and others do not.
I understand your request, but bear in mind that EVERY single workflow parameter is actually validated by WMSpec/StdSpecs specs. If we do so, then we should do it for 100s of files that are using/setting workflow related parameters. Instead, just give it some time and you will get to know what those parameters are, but in short everything is defined in this super class: |
BTW, these latest changes have also been tested with oracle jenkins. |
ensure move and custodial are not None; fix order of db constraints address some of Valentins concerns add docstring and extra comment for the param/validation
@vkuznet Valentin, I just added a docstring string and further comments on the parameters/validation. However, as stated above, every workflow-related information that goes to the relational database is validated beforehand and IMO adding such comments to 1 DAO out of hundreds won't be of much help. Please have another look at your convenience. |
Jenkins results:
|
Thank you Valentin and Todor. Given that it doesn't change anything in the central services, I'm merging it now. |
Fixes #11108
Alternative to #11111
Status
ready
Description
This change will ensure that:
None
as input for thedataset_lifetime
columnIs it backward compatible (if not, which system it affects?)
NO (database schema change)
Related PRs
If this proposal is acceptable and correct, we might want to merge this in master only; backport and merge #11111 to the 2.0.2_wmagent branch and create a new patch release.
External dependencies / deployment changes
None