Combines block records in single dict that includes all associated wfs #11245
Conversation
@germanfgv I am not sure I fully understand the problem you reported. However, looking into the line that you implemented, it makes me think that, regardless of whether we have your changes in place or not, unless completedWorkflows is always fully built before deletableWorkflows, then yes, in this case it would only evaluate to True once all the completedWorkflows are also deletableWorkflows.
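For clarity, a minimal sketch of the check being debated, assuming completedWorkflows and deletableWorkflows are collections of workflow names (the exact WMCore line is not quoted here, so names and values are illustrative only):

# Hypothetical illustration, not the actual WMCore code: the condition only
# evaluates to True once every completed workflow is also deletable.
completedWorkflows = {'wf_A', 'wf_B', 'wf_C'}
deletableWorkflows = {'wf_A', 'wf_B'}

print(completedWorkflows.issubset(deletableWorkflows))  # False: wf_C is not deletable yet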
Maybe this concern is addressed with my second commit. Before, we were disregarding incomplete workflows in the query due to the HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed) clause. Because of this, you are right about the behavior you describe. By removing the completion check from the query in 0bf4285, we can now properly avoid deleting blocks that belong to incomplete workflows.
Hi @germanfgv, thanks for the suggested change.
I still have some concerns about this extra aggregation that is needed in the Python code. It is a sign that we are ignoring the uniqueness of the records as they come from the GROUP BY SQL statement, which to me seems dangerous.
In parallel to the already existing changes, which as far as I understood are already working on one of the T0 machines in production, I'd like to test a few of those SQL queries and DAO outcomes myself on one of the machines affected by this bug, if you do not mind.
@@ -49,7 +49,6 @@ class GetCompletedBlocks(DBFormatter):
              dbsbuffer_dataset_subscription.site,
              dbsbuffer_workflow.name,
              dbsbuffer_block.create_time
-         HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed)
Why are you removing this protection here? IIRC this is meant to ensure that all the workflows associated with a block are in status completed.
That was useful in GetDeletableBlocks.py because that query was not grouping by workflow name. Here, if you exclude records with workflows that are not complete, you will still have other records with other workflows associated with the block, so the block will still end up in the results, even with uncompleted associated workflows.
Even worse, since we are excluding the uncompleted workflows from the query, later we do not take those uncompleted workflows into account when we compare against the deletableWfsDict list.
because that query was not grouping by workflow name
That is correct.
I have double-checked now that introducing dbsbuffer_workflow.name in the SELECT and GROUP BY conditions splits the grouping further down, creating multiple groups per block - one for every associated workflow. It then counts the completed workflows, but records enter the output every time a workflow completes... so if we have 1 block with 3 associated PromptReco requests, out of which only two are completed, we will have that block returned 2 times in the output. Once the third PromptReco is complete, the same block would enter the output result 3 times.
This makes the rest of the logic quite hard to follow indeed.
And since this is a quite subtle detail, difficult to understand and remember, let me put down some concrete examples.
Here is the output from the current SQL query with the counts per group included in the output (filtered for a single block):
SELECT count(*),
       dbsbuffer_block.blockname,
       dbsbuffer_location.pnn,
       dbsbuffer_dataset.path,
       dbsbuffer_dataset_subscription.site,
       dbsbuffer_workflow.name,
       dbsbuffer_block.create_time
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
  AND dbsbuffer_dataset_subscription.subscribed = 1
  AND dbsbuffer_block.status = 'Closed'
  AND dbsbuffer_block.deleted = 0
  AND dbsbuffer_block.blockname = '/ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00'
GROUP BY dbsbuffer_block.blockname,
         dbsbuffer_location.pnn,
         dbsbuffer_dataset.path,
         dbsbuffer_dataset_subscription.site,
         dbsbuffer_workflow.name,
         dbsbuffer_block.create_time
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed)
ORDER BY dbsbuffer_workflow.name;
COUNT(*) BLOCKNAME PNN PATH SITE NAME CREATE_TIME
---------- ------------------------------------------------------------------------------------------------------------------------ --------------- ---------------------------------------- ------------------ ---------------------------------------- -----------
82 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357080_ZeroBias 1660249466
82 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357080_ZeroBias 1660249466
Now let's lift the HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed) constraint:
SELECT count(*),
       dbsbuffer_block.blockname,
       dbsbuffer_location.pnn,
       dbsbuffer_dataset.path,
       dbsbuffer_dataset_subscription.site,
       dbsbuffer_workflow.name,
       dbsbuffer_block.create_time
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
  AND dbsbuffer_dataset_subscription.subscribed = 1
  AND dbsbuffer_block.status = 'Closed'
  AND dbsbuffer_block.deleted = 0
  AND dbsbuffer_block.blockname = '/ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00'
GROUP BY dbsbuffer_block.blockname,
         dbsbuffer_location.pnn,
         dbsbuffer_dataset.path,
         dbsbuffer_dataset_subscription.site,
         dbsbuffer_workflow.name,
         dbsbuffer_block.create_time
ORDER BY dbsbuffer_workflow.name;
COUNT(*) BLOCKNAME PNN PATH SITE NAME CREATE_TIME
---------- ------------------------------------------------------------------------------------------------------------------------ --------------- ---------------------------------------- ------------------ ---------------------------------------- -----------
82 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357080_ZeroBias 1660249466
82 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357080_ZeroBias 1660249466
77 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357081_ZeroBias 1660249466
77 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357081_ZeroBias 1660249466
2 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357096_ZeroBias 1660249466
2 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357096_ZeroBias 1660249466
8 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357098_ZeroBias 1660249466
8 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357098_ZeroBias 1660249466
7 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357100_ZeroBias 1660249466
7 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357100_ZeroBias 1660249466
13 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357101_ZeroBias 1660249466
13 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357101_ZeroBias 1660249466
9 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357102_ZeroBias 1660249466
9 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357102_ZeroBias 1660249466
1 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357106_ZeroBias 1660249466
1 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357106_ZeroBias 1660249466
2 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk PromptReco_Run357112_ZeroBias 1660249466
2 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS PromptReco_Run357112_ZeroBias 1660249466
18 rows selected.
It is obvious that this block would keep entering the output of the original query every time a PromptReco completes, until the original query eventually returns exactly the 18 records shown above.
And indeed, the expected behavior is achieved once the field dbsbuffer_workflow.name, which further splits the groups, is removed. Here is the query without the extra splitting and also without the constraint that all the associated workflows be completed:
SELECT count(*),
       dbsbuffer_block.blockname,
       dbsbuffer_location.pnn,
       dbsbuffer_dataset.path,
       dbsbuffer_dataset_subscription.site,
       dbsbuffer_block.create_time
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
  AND dbsbuffer_dataset_subscription.subscribed = 1
  AND dbsbuffer_block.status = 'Closed'
  AND dbsbuffer_block.deleted = 0
  AND dbsbuffer_block.blockname = '/ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00'
GROUP BY dbsbuffer_block.blockname,
         dbsbuffer_location.pnn,
         dbsbuffer_dataset.path,
         dbsbuffer_dataset_subscription.site,
         dbsbuffer_block.create_time;
COUNT(*) BLOCKNAME PNN PATH SITE CREATE_TIME
---------- ------------------------------------------------------------------------------------------------------------------------ --------------- ---------------------------------------- ------------------ -----------
201 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_Disk 1660249466
201 /ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00 T0_CH_CERN_Disk /ZeroBias/Run2022C-PromptReco-v1/AOD T1_US_FNAL_MSS 1660249466
And here is the result if we apply the constraint for workflow completion.
SELECT count(*),
       dbsbuffer_block.blockname,
       dbsbuffer_location.pnn,
       dbsbuffer_dataset.path,
       dbsbuffer_dataset_subscription.site,
       dbsbuffer_block.create_time
FROM dbsbuffer_dataset_subscription
INNER JOIN dbsbuffer_dataset ON
    dbsbuffer_dataset.id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_block ON
    dbsbuffer_block.dataset_id = dbsbuffer_dataset_subscription.dataset_id
INNER JOIN dbsbuffer_file ON
    dbsbuffer_file.block_id = dbsbuffer_block.id
INNER JOIN dbsbuffer_workflow ON
    dbsbuffer_workflow.id = dbsbuffer_file.workflow
INNER JOIN dbsbuffer_location ON
    dbsbuffer_location.id = dbsbuffer_block.location
WHERE dbsbuffer_dataset_subscription.delete_blocks = 1
  AND dbsbuffer_dataset_subscription.subscribed = 1
  AND dbsbuffer_block.status = 'Closed'
  AND dbsbuffer_block.deleted = 0
  AND dbsbuffer_block.blockname = '/ZeroBias/Run2022C-PromptReco-v1/AOD#d54f99e8-9d28-4567-921f-edce2a207f00'
GROUP BY dbsbuffer_block.blockname,
         dbsbuffer_location.pnn,
         dbsbuffer_dataset.path,
         dbsbuffer_dataset_subscription.site,
         dbsbuffer_block.create_time
HAVING COUNT(*) = SUM(dbsbuffer_workflow.completed);
no rows selected
It returns no rows as expected, because there are still workflows associated with this block which are not completed.
I am afraid the same faulty logic applies to the extra granularity added by the multiple sites, just as it does with the multiple associated workflows:
dictResults = {}
for record in listResults:
    # Populates results dict and adds all workflows of the same block to a single record
    blockname = record['blockname']
Please use camelCase for the Python variables.
for record in listResults:
    # Populates results dict and adds all workflows of the same block to a single record
    blockname = record['blockname']
    if blockname in dictResults:
In general, the DAO must return the list of results as it comes from the database, and no extra data reformatting should be needed. It seems to me that if this extra aggregation by block name is needed here in the Python code, then something is not quite working as we expect in the GROUP BY clause of the SQL query.
We can do this aggregation outside of the DAO. Also, we may be able to modify the query so it ignores blocks with workflows that are not completed, but that would probably be redundant with the deletableWorkflows query.
Thanks @germanfgv! We discussed the problem at length and tested multiple times with you, and we now know in detail what is going on. I'd suggest we continue as it is. Also, since this DAO is used in a single place in the WMCore code, my previous remark that we should avoid the aggregation in the DAO and leave it to the component would only make the code at the component less readable. So let's proceed with what we already have. Just add the comments, as we discussed during our meeting, mentioning the background conditions we rely on for that logic to work, so that we remember those small but extremely important details. Thanks!
blockname = record['blockname']
if blockname in dictResults:
    dictResults[blockname]['workflowNames'].add(record['name'])
    dictResults[blockname]['sites'].add(record['site'])
IIUC this line here would just add the site to the set of sites of an already existing block name, skipping the values of the rest of the columns in the record. I am not sure this is safe. Are we not losing information here?
The other columns are block name, location, dataset, and creation time. All of those are the same for each block, regardless of the subscription in the record.
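For the record, a minimal sketch of the full aggregation under discussion, with that assumption spelled out in comments (the sample records and the else branch are illustrative, not the exact WMCore code):

# Assumed shape of the DAO results: one record per (block, workflow, site),
# where pnn, path and create_time are identical for records of the same block.
listResults = [
    {'blockname': '/A/B/AOD#1', 'pnn': 'T0_CH_CERN_Disk', 'path': '/A/B/AOD',
     'site': 'T1_US_FNAL_Disk', 'name': 'PromptReco_Run357080_ZeroBias',
     'create_time': 1660249466},
    {'blockname': '/A/B/AOD#1', 'pnn': 'T0_CH_CERN_Disk', 'path': '/A/B/AOD',
     'site': 'T1_US_FNAL_MSS', 'name': 'PromptReco_Run357081_ZeroBias',
     'create_time': 1660249466},
]

dictResults = {}
for record in listResults:
    blockName = record['blockname']
    if blockName in dictResults:
        # Only workflow name and site vary between records of the same block;
        # pnn, path and create_time are identical, so dropping them is safe.
        dictResults[blockName]['workflowNames'].add(record['name'])
        dictResults[blockName]['sites'].add(record['site'])
    else:
        # First record seen for this block defines the shared columns
        dictResults[blockName] = {'location': record['pnn'],
                                  'dataset': record['path'],
                                  'createTime': record['create_time'],
                                  'workflowNames': {record['name']},
                                  'sites': {record['site']}}

print(dictResults['/A/B/AOD#1']['workflowNames'])  # both workflows collapsed into one set
print(dictResults['/A/B/AOD#1']['sites'])          # both sites collapsed into one set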
As above, please capture this little detail in a comment in the code.
# Populates results dict and adds all workflows of the same block to a single record
blockname = record['blockname']
if blockname in dictResults:
    dictResults[blockname]['workflowNames'].add(record['name'])
Same comment as below.
Thanks @germanfgv, please squash those commits into a single one so we can merge it.
@germanfgv @todor-ivanov thanks for following this up to the end. In the future, please keep in mind that:
Hi @amaltaro, this must have been my fault. We agreed with @germanfgv that for this PR he should skip the squash step, because no unittest-related commits were needed; instead, I was supposed to push the button.
Oh, looking at the master commit history, it indeed did the job and there is only 1 commit (I had looked only at the PR itself).
This actually is a good lesson not to touch this button anymore (I will put some red tape on my screen right where it is :) ).
Fixes #11244
Status
Ready
Description
Turns the list of blocks into a dictionary to avoid duplicate records for a single block. Turns 'workflowname' into a set of all the workflows that generated the block.
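A minimal before/after illustration of that reshaping (hypothetical values; field names follow the snippets reviewed above):

# Before: one record per (block, workflow, site) combination.
listResults = [
    {'blockname': '/A/B/AOD#1', 'name': 'wf_1', 'site': 'T1_US_FNAL_Disk'},
    {'blockname': '/A/B/AOD#1', 'name': 'wf_2', 'site': 'T1_US_FNAL_MSS'},
]

# After: a single entry per block, with workflows and sites collected in sets.
dictResults = {
    '/A/B/AOD#1': {'workflowNames': {'wf_1', 'wf_2'},
                   'sites': {'T1_US_FNAL_Disk', 'T1_US_FNAL_MSS'}},
}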
Is it backward compatible (if not, which system it affects?)
YES
Related PRs
None
External dependencies / deployment changes
None