import DBS3SetDatasetStatus and DBS3SetFileStatus into CRAB #5204

Closed
3 of 5 tasks
belforte opened this issue Jul 13, 2023 · 8 comments

belforte commented Jul 13, 2023

@belforte

@novicecpp @mapellidario while I am the default volunteer/victim here, this is something that you should be able to do as well. Feel free to pick it up (just change the assignee), and ask me questions as needed!

belforte added a commit to belforte/CRABClient that referenced this issue Oct 23, 2023

belforte commented Oct 23, 2023

crab setdataset is done. I will look at setfiles before worrying about the --recursive option, which I am not sure is that useful for USER datasets.


belforte@lxplus805/TC3> crab setdataset --dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER --status DELETED            
looking up Dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER in DBS prod/phys03
Dataset status in DBS is VALID
Will set it to DELETED
Dataset status changed successfully
Dataset status in DBS now is DELETED
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> 

belforte@lxplus805/TC3> crab setdataset --dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER --status VALID   
looking up Dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER in DBS prod/phys03
Dataset status in DBS is DELETED
Will set it to VALID
Dataset status changed successfully
Dataset status in DBS now is VALID
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> 
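
The two transcripts above follow a read, change, re-read pattern. A minimal sketch of that flow is given below, using an in-memory stand-in instead of the real DBS REST calls. The `FakeDBS` class, its method names, the `set_dataset_status` helper and the restriction to the two statuses seen in the transcript are all assumptions made for illustration, not the CRABClient code.

```python
class FakeDBS:
    """In-memory stand-in for DBS, for illustration only (hypothetical)."""

    def __init__(self, statuses):
        self._statuses = dict(statuses)

    def get_status(self, dataset):
        return self._statuses[dataset]

    def set_status(self, dataset, status):
        self._statuses[dataset] = status


def set_dataset_status(dbs, dataset, new_status, allowed=("VALID", "DELETED")):
    """Read the current status, apply the change, then re-read to confirm.

    `allowed` only lists the two values that appear in the transcript;
    the real command may accept more.
    """
    if new_status not in allowed:
        raise ValueError("unsupported status: %s" % new_status)
    messages = []
    current = dbs.get_status(dataset)
    messages.append("Dataset status in DBS is %s" % current)
    messages.append("Will set it to %s" % new_status)
    dbs.set_status(dataset, new_status)
    # Re-read instead of trusting the write call, as the transcript does.
    final = dbs.get_status(dataset)
    if final != new_status:
        raise RuntimeError("status change was not applied")
    messages.append("Dataset status in DBS now is %s" % final)
    return messages
```

Re-reading the status after the write keeps the final message honest even if the server silently ignores the request.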

@belforte

An initial version of crab setfiles is also ready. It only takes a single LFN to act upon. Next I will extend it to accept a list, a file containing a list, or all files in a dataset. The latter is a bit fuzzy, since it looks like the old DBS code by Yuyi allows passing a block https://github.com/dmwm/DBS/blob/14df8bbe8ee8f874fe423399b18afef911fe78c7/Client/utils/DataOpsScripts/DBS3SetFileStatus.py#L151 while the new server only takes LFN(s) or a dataset https://github.com/dmwm/DBSClient/blob/df14cab23662f8d8b5f072857d937b4fadd2885b/src/python/dbs/apis/dbsClient.py#L1975

Examples from my branch https://github.com/belforte/CRABClient/tree/add-change-DBS-dataset-file-status-fix-5204

belforte@lxplus805/TC3> dasgoclient --query 'file file=/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root instance=prod/phys03 |grep file.is_file_valid'
1  
belforte@lxplus805/TC3> crab setfiles --lfn /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root --status INVALID
looking up LFN /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root in DBS prod/phys03
File status in DBS is VALID
Will set it to INVALID
Dataset status changed successfully
LFN status in DBS now is INVALID
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> dasgoclient --query 'file file=/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root instance=prod/phys03 |grep file.is_file_valid'
0  
belforte@lxplus805/TC3> crab setfiles --lfn /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root --status VALID  
looking up LFN /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root in DBS prod/phys03
File status in DBS is INVALID
Will set it to VALID
Dataset status changed successfully
LFN status in DBS now is VALID
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> dasgoclient --query 'file file=/store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root instance=prod/phys03 |grep file.is_file_valid'
1  
belforte@lxplus805/TC3> crab setfiles --lfn /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root --status VALID
looking up LFN /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_1.root in DBS prod/phys03
File status in DBS is VALID
Will set it to VALID
Dataset status changed successfully
LFN status in DBS now is VALID
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> 


belforte commented Oct 24, 2023

I got parameter manipulation under control to replicate the --files option from https://github.com/dmwm/DBS/blob/master/Client/utils/DataOpsScripts/DBS3SetFileStatus.py
But exploration with passing a list of files to the DBS REST server ran into odd issues. I have asked the DBS experts (Valentin) in dmwm/dbs2go#102
Valentin confirmed a bug on the server side. But he is not maintaining the dbs2go code anymore. Let's see.
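
For illustration, the argument handling mentioned above could look roughly like the sketch below. It assumes that a `--files` style value is either the path of a text file with one LFN per line or a comma-separated list of LFNs, and that LFNs start with `/store/`. The function name `parse_files_option` is hypothetical. The sketch covers only the parsing, not sending the list to the DBS server.

```python
import os


def parse_files_option(value):
    """Return a list of LFNs from a --files style argument (hypothetical helper).

    Assumption: `value` is either the path of an existing text file with one
    LFN per line, or a comma-separated list of LFNs.
    """
    if os.path.isfile(value):
        with open(value, "r", encoding="utf-8") as handle:
            candidates = handle.read().splitlines()
    else:
        candidates = value.split(",")
    # Drop surrounding whitespace and empty entries.
    lfns = [item.strip() for item in candidates if item.strip()]
    # Assumption: a valid LFN starts with /store/.
    bad = [lfn for lfn in lfns if not lfn.startswith("/store/")]
    if bad:
        raise ValueError("not an LFN: %s" % ", ".join(bad))
    return lfns
```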

@belforte

should change command names to: setdatasetstatus and setfilestatus

@belforte belforte added Done and removed In Progress labels Oct 25, 2023
@belforte

From my side code changes are completed, pending review of #5241 and full validation for "nothing broke"

Now crab setfilestatus prints a summary of the status before/after the change is applied.
When indicating a full dataset:

belforte@lxplus805/TC3> crab setfilestatus --status VALID --dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER
Dataset file count total/valid/invalid = 40/1/39
File(s) status changed successfully
Dataset file count total/valid/invalid = 40/40/0
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> 

or when indicating a single file (I use the same code/format):

belforte@lxplus805/TC3> crab setfilestatus --file /store/user/rucio/belforte/testRucioPub/GenericTTbar/Stefano-TestRucioP-230817/230817_203741/0000/kk_15.root --status INVALID
LFN to be changed belongs to dataset /GenericTTbar/belforte-Stefano-TestRucioP-230817-94ba0e06145abd65ccb1d21786dc7e1d/USER
Dataset file count total/valid/invalid = 40/40/0
File(s) status changed successfully
Dataset file count total/valid/invalid = 40/39/1
Log file is /afs/cern.ch/work/b/belforte/CRAB3/TC3/crab.log
belforte@lxplus805/TC3> 
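
A minimal sketch of how such a summary line could be computed, assuming each file record is a dictionary with an `is_file_valid` key equal to 1 or 0 (the field that the dasgoclient queries earlier in this thread grep for). The function name `summarize_file_status` and the record layout are assumptions, not the code in the PR.

```python
def summarize_file_status(files):
    """Build the total/valid/invalid summary line from a list of file records.

    Assumption: each record is a dict with an 'is_file_valid' key (1 or 0).
    """
    total = len(files)
    valid = sum(1 for record in files if record["is_file_valid"] == 1)
    invalid = total - valid
    return "Dataset file count total/valid/invalid = %d/%d/%d" % (total, valid, invalid)
```

Calling it once before and once after the change gives the two summary lines shown in the transcripts.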

belforte added a commit that referenced this issue Oct 26, 2023
* add setdataset.py for #5204

* refactor and add setfiles

* add Content-type arg to HTTPRequests

* setdataset to use contentType

* rename commands to setdatasetstatus setfilestatus

* add autocomplete

* list of LFNs not supported yet

* some pylint and pep8

* add logging for setfilestatus

belforte commented Oct 26, 2023

actions from Wa's review in #5241 (review)

  • Move HTTPRequests to utils and move DBSREST to a new file at the same level as CrabRestInterface.py
    • ClientUtils is already too long. I am creating a new file RestInterfaces.py to contain HTTPRequests, CRABRest and DBSRest
  • look at the feasibility of renaming the uri argument in HTTPRequests to api
    • this requires some work. There is too much history in the HTTPRequests class. The best way is to create a new class like CRABRest, with its own get/put/post... methods which take api as an argument, and subclass it as DBSRestReader and DBSRestWriter. Not sure it is worth it. And of course a good cleanup of HTTPRequests, fixing the inconsistent use of hostname and the horrible way it uses self as a dictionary, is left for another time - Moved to rationalize use of HTTPRequests across Server and Client CRABServer#7986
  • getDbsREST should find the current client version, not ask for it as an argument
  • fix usage of Content-type in HTTPRequests
    • maybe I do not need the Content-type=json, now that all my problems were clarified as a dbs2go bug?
    • Yes. Everything works w/o it. Looking again at the DBS2GO documentation and the curl documentation (thanks Wa!), I conclude that DBS2GO simply assumes that data is JSON (which explains some of the errors I was having) and that indicating it explicitly in the curl is sort of "being nice and very clear". So all in all I will keep this, as:
      • if user of HTTPRequests is not happy with curl's default, they can pass a contentType
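
To make the class layout discussed in the review notes concrete, here is a rough sketch of a base class whose verbs take an `api` name, with reader and writer subclasses and an optional `contentType`. Everything in it is an assumption for illustration: the `RestClient` name, the injected `send` callable (used here so that nothing touches the network), the base paths and the `datasets` api name. It is not the HTTPRequests or CRABRest code.

```python
class RestClient:
    """Hypothetical base class: verbs take an `api` name, not a full uri."""

    basePath = ""  # assumption: each subclass sets its own path prefix

    def __init__(self, hostname, send, contentType=None):
        self.hostname = hostname
        self.send = send  # injected callable(verb, url, headers, data)
        self.contentType = contentType

    def _request(self, verb, api, data=None):
        headers = {}
        # Only set Content-type when the caller asked for one,
        # otherwise the transport default applies.
        if self.contentType is not None:
            headers["Content-type"] = self.contentType
        url = "https://%s%s/%s" % (self.hostname, self.basePath, api)
        return self.send(verb, url, headers, data)

    def get(self, api, data=None):
        return self._request("GET", api, data)

    def put(self, api, data=None):
        return self._request("PUT", api, data)


class DBSRestReader(RestClient):
    basePath = "/dbs/prod/phys03/DBSReader"  # assumed path, for illustration

    def put(self, api, data=None):
        raise NotImplementedError("the reader client is read-only")


class DBSRestWriter(RestClient):
    basePath = "/dbs/prod/phys03/DBSWriter"  # assumed path, for illustration
```

Keeping `contentType` optional matches the conclusion above: callers who are happy with the default pass nothing, and callers who want to be explicit pass a value such as `application/json`.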

belforte added a commit that referenced this issue Oct 26, 2023
* add setdataset.py for #5204

* refactor and add setfiles

* add Content-type arg to HTTPRequests

* setdataset to use contentType

* rename commands to setdatasetstatus setfilestatus

* add autocomplete

* list of LFNs not supported yet

* some pylint and pep8

* add logging for setfilestatus

* more HTTPRequest,CRABRest,getDBSRest to new RestInterfaces.py

* do not pass version to REST clients, it is set in HTTPRequests

* removed one import __version__ too many !

* fix use of version and UserAgent

* fix use of version and UserAgent

* simply make userAgent=CRABClient/__version__ the default

* cleanup use of Content-type

* cleanup use of Content-type

* add comment
belforte added a commit to belforte/CRABClient that referenced this issue Oct 27, 2023