Integrate all the partial pileup features together and run final pre-production tests #11736

Closed
amaltaro opened this issue Sep 21, 2023 · 51 comments · Fixed by #11948, #11971, #11972 or #11975


@amaltaro
Contributor

amaltaro commented Sep 21, 2023

Impact of the new feature
WMCore in general

Is your feature request related to a problem? Please describe.
At this stage, we are supposed to have the whole partial pileup functionality in place, with dev environment tests already done.

Describe the solution you'd like
Carefully run integration level tests with all the partial pileup sub-features in place, ensuring that:

  • MSPileup behavior is correct
  • global workqueue has the correct behavior
  • global workqueue cherrypy thread can successfully update pileup location
  • WMAgent WorkQueueManager behavior is correct
  • WMAgent WorkflowUpdater behavior is correct

Describe alternatives you've considered
None

Additional context
This is part of the meta issue: #11537

**** UPDATE ****
The following pull requests have been provided to resolve issues identified during this integration and validation phase:
#11948
#11971
#11972
#11975

@amaltaro
Contributor Author

amaltaro commented Mar 5, 2024

Now we should have all the required development already merged in and available in a given tag.

For central services, the partial pileup feature is fully implemented starting in 2.3.1 (deployed in testbed and production), while for WMAgent it is available starting in 2.3.2rc2 (deployed today in vocms0193).

@vkuznet
Contributor

vkuznet commented Mar 5, 2024

Alan, before I take care of this issue, please provide additional context to supplement the description of the ticket. My understanding is that testing will require the following:

  • create a pileup document which can be used for tracing the entire pipeline; for that I need to know how to do it, i.e.
    • how to inject a new doc and, if so, which parameters to use
    • whether we need to manually change an existing document and, if so, whether we should change containerFraction to a smaller and then a larger value
  • I understand how to check the MSPileup flow, but I am not sure what to check in GWQ and WMA (last four items in your description list). Please describe what should be checked in GWQ and WMA during these tests.

@amaltaro
Contributor Author

There is extra work to be done here, so reopening this ticket.

@vkuznet
Contributor

vkuznet commented Mar 26, 2024

I successfully tested (using the 2.3.2rc5 release candidate) the transition of the following dataset, /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM, using the new MSPileupTasks logic. Here are the transition records:

  "transition": [
    {
      "containerFraction": 1.0,
      "customDID": "/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM",
      "updateTime": 1709576366,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore Service Account"
    },
    {
      "containerFraction": 0.5,
      "customDID": "/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V1",
      "updateTime": 1711458638,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov"
    },
    {
      "containerFraction": 0.8,
      "customDID": "/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2",
      "updateTime": 1711462240,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov"
    }
  ]

The testing was done using the REST API by changing the containerFraction from 1.0 -> 0.5 -> 0.8.
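The fraction history recorded in the transition list can be read back with a few lines of Python. This is only a minimal sketch: the helper name `fraction_history` is hypothetical, the records are an abbreviated copy of the ones above (DN omitted), and it assumes that `updateTime` is a Unix timestamp in seconds.

```python
from datetime import datetime, timezone


def fraction_history(transitions):
    """Return (UTC time, containerFraction, customDID) tuples ordered by updateTime."""
    ordered = sorted(transitions, key=lambda rec: rec["updateTime"])
    return [
        (
            datetime.fromtimestamp(rec["updateTime"], tz=timezone.utc).isoformat(),
            rec["containerFraction"],
            rec["customDID"],
        )
        for rec in ordered
    ]


# Abbreviated copy of the transition records shown above (DN field omitted)
BASE = "/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM"
transitions = [
    {"containerFraction": 1.0, "customDID": BASE, "updateTime": 1709576366},
    {"containerFraction": 0.5, "customDID": BASE + "-V1", "updateTime": 1711458638},
    {"containerFraction": 0.8, "customDID": BASE + "-V2", "updateTime": 1711462240},
]

for when, fraction, did in fraction_history(transitions):
    print(when, fraction, did)
```

Each change of fraction shows up as one more record with a new custom DID suffix (-V1, -V2), which is what the test above relied on.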

I'll now repeat the tests using the 2.3.2rc6 release candidate and another dataset, /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM.

@vkuznet
Contributor

vkuznet commented Mar 27, 2024

Update1:

  • I changed the fraction of the /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX dataset to 0.5
  • the rucio command lists 6 DIDs for it in the default cms scope and 3 DIDs for its -V1 container (group.wmcore scope), so it was correctly scaled to 1/2 for the V1 container
  • I changed the fraction to 1.0 and checked with rucio: it now lists 6 DIDs in the cms scope and 6 DIDs in the -V2 container (group.wmcore scope), i.e. it correctly scaled back to all DIDs for the custom container.

I hope these tests are sufficient to declare MSPileup/MSPileupTasks working. @amaltaro please provide your feedback on this observation.
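The DID counts reported above can be sanity-checked against the requested fraction with a small helper. This is a hedged sketch only: `fraction_matches` is a hypothetical name, and it assumes the fraction is meant to be approximated by DID count, whereas MSPileupTasks may in practice select blocks by a different criterion (for example by size).

```python
def fraction_matches(total_dids, custom_dids, fraction):
    """True if custom_dids / total_dids is within one DID of the requested fraction."""
    if total_dids <= 0:
        raise ValueError("total_dids must be positive")
    # Tolerance of one DID, since a fraction cannot always be met exactly
    return abs(custom_dids / total_dids - fraction) <= 1.0 / total_dids


# Numbers from the rucio checks above
print(fraction_matches(6, 3, 0.5))  # -V1 container at fraction 0.5
print(fraction_matches(6, 6, 1.0))  # -V2 container at fraction 1.0
```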

@amaltaro
Contributor Author

Yes, I think this is good enough for MSPileup. We still have to see how global and local workqueue behave with these changes though. For that, I would suggest assigning a workflow with pileup to a team name that does not exist, such that workqueue elements will be forever in Available status (that way we can test both the workqueue creation and location update).

We also need to hear back from @todor-ivanov on the WMAgent validation (of partial pileup).

@todor-ivanov
Contributor

hi @amaltaro @vkuznet

For that, I would suggest assigning a workflow with pileup to a team name that does not exist, such that workqueue elements will be forever in Available status (that way we can test both the workqueue creation and location update).

Here is your workflow with nonexistent team name: https://cmsweb-testbed.cern.ch/reqmgr2/fetch?rid=tivanov_SC_MultiPU_Feb2024_Val_PartPU_v3_240330_073439_5439

@amaltaro amaltaro moved this from Done to In Progress in WMCore quarterly developments Apr 8, 2024
@todor-ivanov
Contributor

todor-ivanov commented Apr 15, 2024

hi @amaltaro
I can confirm that the GWQ successfully updates a reduced pileup fraction in the DataLocationMapper:

  • PU: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM,
  • Fraction: 1.0
2024-04-15 08:36:09,176:INFO:DataLocationMapper:Fetching locations from MSPileup for 1
2024-04-15 08:36:09,290:INFO:DataLocationMapper:locationsFromPileup - name: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 08:36:09,291:INFO:DataLocationMapper:Fetching locations from MSPileup for 4
2024-04-15 08:36:09,372:INFO:DataLocationMapper:locationsFromPileup - name: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 08:36:09,432:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 08:36:09,482:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, currentRSEs: ['T2_CH_CERN'], containerFraction: 1.0
2024-04-15 08:36:09,554:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM, currentRSEs: ['T2_CH_CERN'], containerFraction: 0.8
  • PU: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM,
  • Fraction: 0.5
2024-04-15 13:49:33,195:INFO:DataLocationMapper:Fetching locations from MSPileup for 1
2024-04-15 13:49:33,287:INFO:DataLocationMapper:locationsFromPileup - name: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 13:49:33,288:INFO:DataLocationMapper:Fetching locations from MSPileup for 4
2024-04-15 13:49:33,474:INFO:DataLocationMapper:locationsFromPileup - name: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 13:49:33,575:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM, currentRSEs: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], containerFraction: 1.0
2024-04-15 13:49:33,674:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, currentRSEs: ['T2_CH_CERN'], containerFraction: 0.5
2024-04-15 13:49:33,773:INFO:DataLocationMapper:locationsFromPileup - name: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM, currentRSEs: ['T2_CH_CERN'], containerFraction: 0.8
  • MSPileup transition:
$ scurl https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup?pileupName=/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM
{"result": [
 {
  "pileupName": "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM",
  "pileupType": "classic",
  "insertTime": 1680873642,
  "lastUpdateTime": 1713188213,
  "expectedRSEs": [
    "T2_CH_CERN"
  ],
  "currentRSEs": [
    "T2_CH_CERN"
  ],
  "fullReplicas": 1,
  "campaigns": [
    "Apr2023_Val"
  ],
  "containerFraction": 0.5,
  "replicationGrouping": "ALL",
  "activatedOn": 1713188213,
  "deactivatedOn": 1680873642,
  "active": true,
  "pileupSize": 19882212302,
  "ruleIds": [
    "5a9170d282364b15afe8b4ed1e856c41"
  ],
  "customName": "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V1",
  "transition": [
    {
      "containerFraction": 1.0,
      "customDID": "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM",
      "updateTime": 1709576366,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore Service Account"
    },
    {
      "containerFraction": 0.5,
      "customDID": "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V1",
      "updateTime": 1713188213,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1"
    }
  ]
}]}
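Comparing two polling cycles by eye is error-prone, so the `locationsFromPileup` lines can also be parsed and diffed. The sketch below assumes the log format stays exactly as shown in the excerpts above; the function names and the regular expression are illustrative and are not part of WMCore.

```python
import ast
import re

# Matches the "locationsFromPileup" lines printed by DataLocationMapper above
LINE_RE = re.compile(
    r"locationsFromPileup - name: (?P<name>\S+), "
    r"currentRSEs: (?P<rses>\[.*?\]), containerFraction: (?P<fraction>[\d.]+)"
)


def parse_fractions(log_lines):
    """Map pileup name -> (currentRSEs, containerFraction) for one polling cycle."""
    result = {}
    for line in log_lines:
        match = LINE_RE.search(line)
        if match:
            rses = ast.literal_eval(match["rses"])
            result[match["name"]] = (rses, float(match["fraction"]))
    return result


def changed_fractions(before, after):
    """Return {pileup name: (old fraction, new fraction)} for fractions that changed."""
    return {
        name: (before[name][1], after[name][1])
        for name in before.keys() & after.keys()
        if before[name][1] != after[name][1]
    }


PU = "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM"
PREFIX = ":INFO:DataLocationMapper:locationsFromPileup - name: "
morning = ["2024-04-15 08:36:09,482" + PREFIX + PU + ", currentRSEs: ['T2_CH_CERN'], containerFraction: 1.0"]
afternoon = ["2024-04-15 13:49:33,674" + PREFIX + PU + ", currentRSEs: ['T2_CH_CERN'], containerFraction: 0.5"]

print(changed_fractions(parse_fractions(morning), parse_fractions(afternoon)))
```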

@todor-ivanov
Contributor

The workflow used for testing the local workqueue (LWQ) was: https://cmsweb-testbed.cern.ch/reqmgr2/fetch?rid=tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883

The LWQ logs for injecting it to WMBS:

2024-04-15 15:55:04,761:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917/GenSimFull/unmerged-logArchive
2024-04-15 15:55:04,797:139973019985664:INFO:WMBSHelper:"tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917" Injecting production 1:1:1 - 1:7:9000 (run:lumi:event) into wmbs
2024-04-15 15:55:05,425:139973019985664:INFO:WMBSHelper:Transaction committed: True, for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917
2024-04-15 15:55:05,426:139973019985664:INFO:WorkQueue:Created top level subscription 1401 for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917 with 1 files
2024-04-15 15:55:05,459:139973019985664:INFO:WorkQueue:LQE 24f8ea94463e88331e93a3e8e370d87b set to 'Running' for request tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917
2024-04-15 15:55:05,527:139973019985664:INFO:WorkQueue:Running WMBS preparation for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883 with ParentQueueId a0d2d3570a8443a11a0ce7c0f47d9ee0,
  with common location ['T3_CH_CERNBOX', 'T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-15 15:55:05,792:139973019985664:INFO:Fileset:Fileset created: tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-GenSimFull-cc60ed1982a1ae1d1c5f7831c8ace7af
2024-04-15 15:55:13,567:139973019985664:INFO:SandboxCreator:Created sandbox tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 with size 6409778
2024-04-15 15:55:13,573:139973019985664:INFO:Workflow:Workflow id 1405 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,579:139973019985664:INFO:WMBSHelper:Top level subscription 1405 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,583:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/GenSimFull/unmerged-FEVTDEBUGoutputGEN-SIM
2024-04-15 15:55:13,588:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/GenSimFull/unmerged-FEVTDEBUGHLToutputGEN-SIM-DIGI-RAW
2024-04-15 15:55:13,592:139973019985664:INFO:Workflow:Workflow id 1406 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,597:139973019985664:INFO:WMBSHelper:Child subscription 1406 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,601:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/GenSimFull/Digi_2024MergeFEVTDEBUGHLToutput/merged-MergedGEN-SIM-DIGI-RAW
2024-04-15 15:55:13,604:139973019985664:INFO:Workflow:Workflow id 1407 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,607:139973019985664:INFO:WMBSHelper:Child subscription 1407 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,614:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/GenSimFull/Digi_2024MergeFEVTDEBUGHLToutput/merged-logArchive
2024-04-15 15:55:13,631:139973019985664:INFO:Workflow:Workflow id 1408 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,634:139973019985664:INFO:WMBSHelper:Child subscription 1408 created for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,639:139973019985664:INFO:Fileset:Fileset created: /tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/GenSimFull/unmerged-logArchive
2024-04-15 15:55:13,645:139973019985664:INFO:WMBSHelper:"tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883" Injecting production 1:1:1 - 1:7:9000 (run:lumi:event) into wmbs
2024-04-15 15:55:13,690:139973019985664:INFO:WMBSHelper:Transaction committed: True, for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,691:139973019985664:INFO:WorkQueue:Created top level subscription 1405 for tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883 with 1 files
2024-04-15 15:55:13,720:139973019985664:INFO:WorkQueue:LQE a0d2d3570a8443a11a0ce7c0f47d9ee0 set to 'Running' for request tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 15:55:13,809:139973019985664:INFO:WorkQueue:Injected 2 out of 2 units into WMBS

@todor-ivanov
Contributor

Agent's WorkflowUpdater component logs:

2024-04-15 16:12:00,781:139783570618112:INFO:WorkflowUpdaterPoller:Workflow: tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883 requires pileup dataset(s): dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])
2024-04-15 16:12:00,786:139783570618112:INFO:WorkflowUpdaterPoller:Workflow: tivanov_SC_MultiPU_Feb2024_Val_PartPU_v4_240415_132857_8917 requires pileup dataset(s): dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])
2024-04-15 16:12:00,787:139783570618112:INFO:WorkflowUpdaterPoller:There are 2 pileup workflows out of 2 active workflows.
2024-04-15 16:12:00,847:139783570618112:INFO:WorkflowUpdaterPoller:A total of 17 pileup documents have been retrieved.
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM, custom name: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, custom name: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V1, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM-V2, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Winter23GS-126X_mcRun3_2023_forPU65_v1-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/Phase2Fall22GS-HCalDetIDFix_125X_mcRun4_realistic_v2-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_1_2_patch3-110X_mcRun4_realistic_v3_2026D49noPU_BSzpz35-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-15 16:12:00,848:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Summer22GS-124X_mcRun3_2022_realistic_v10-v1/GEN-SIM, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Winter22_122X_mcRun3_2021_realistic_v9-v1/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIISpring22GS-123X_mcRun4_realistic_v11-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-PUAutumn18_102X_upgrade2018_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-MCv2_correctPU_94X_mc2017_realistic_v9-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUMoriond17_94X_mc2017_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL17_106X_mc2017_realistic_v6-v3/PREMIX, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-15 16:12:00,849:139783570618112:INFO:Timers:Rucio block resolution took 0.0 seconds to complete
2024-04-15 16:12:00,849:139783570618112:INFO:WorkflowUpdaterPoller:Processing workflow {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883', 'spec': '/data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/WMWorkload.pkl', 'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}, sandbox: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 16:12:01,006:139783570618112:INFO:WorkflowUpdaterPoller:Found pileup name /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX under path: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/GenSimFull/cmsRun2/pileupconf.json
2024-04-15 16:12:01,006:139783570618112:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-15 16:12:01,006:139783570618112:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun2/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 with a fresh pileup content
2024-04-15 16:12:01,016:139783570618112:INFO:WorkflowUpdaterPoller:Found pileup name /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM under path: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/GenSimFull/cmsRun1/pileupconf.json
2024-04-15 16:12:01,018:139783570618112:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-15 16:12:01,018:139783570618112:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun1/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 with a fresh pileup content
2024-04-15 16:12:01,018:139783570618112:INFO:WorkflowUpdaterPoller:Write pileup configuration file /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2
2024-04-15 16:12:02,152:139783570618112:INFO:WorkflowUpdaterPoller:Updating pileup file at {jname} for workflow tarball: {tfile}
2024-04-15 16:12:02,157:139783570618112:INFO:WorkflowUpdaterPoller:Updating pileup file at {jname} for workflow tarball: {tfile}
2024-04-15 16:12:02,950:139783570618112:INFO:WorkflowUpdaterPoller:Done updating spec: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/WMWorkload.pkl


@todor-ivanov
Contributor

Upon changing the Pileup fraction of /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM again, raising it to 0.7, I restarted both

  • Centrally: GWQ
  • WMAgent: WorkflowUpdater component

and checked whether the workflow data had been updated at the agent. Here are the logs from WorkflowUpdater:

2024-04-15 19:17:33,617:139643003078400:INFO:WorkflowUpdaterPoller:Processing workflow {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883', 'spec': '/data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/WMWorkload.pkl', 'pileup': dict_values([{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX', '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}])}, sandbox: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883
2024-04-15 19:17:33,765:139643003078400:INFO:WorkflowUpdaterPoller:Found pileup name /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX under path: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/GenSimFull/cmsRun2/pileupconf.json
2024-04-15 19:17:33,766:139643003078400:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-15 19:17:33,766:139643003078400:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun2/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 with a fresh pileup content
2024-04-15 19:17:33,775:139643003078400:INFO:WorkflowUpdaterPoller:Found pileup name /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM under path: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/GenSimFull/cmsRun1/pileupconf.json
2024-04-15 19:17:33,776:139643003078400:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-15 19:17:33,776:139643003078400:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun1/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 with a fresh pileup content
2024-04-15 19:17:33,776:139643003078400:INFO:WorkflowUpdaterPoller:Write pileup configuration file /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2
2024-04-15 19:17:34,096:139643003078400:INFO:WorkflowUpdaterPoller:Updating pileup file at {jname} for workflow tarball: {tfile}
2024-04-15 19:17:34,096:139643003078400:INFO:WorkflowUpdaterPoller:Updating pileup file at {jname} for workflow tarball: {tfile}
2024-04-15 19:17:34,839:139643003078400:INFO:WorkflowUpdaterPoller:Done updating spec: /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/WMWorkload.pkl

So, as one can see from the log messages, the sandbox is updated: first "Mark WMSandbox/GenSimFull/cmsRun2/pileupconf.json to be updated in tarball ..." and later "Write pileup configuration file /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2".

With that I can say we are good to go.
@amaltaro Please feel free to deploy and enable the new functionality in production!

@todor-ivanov
Contributor

@vkuznet @amaltaro , just a minor question here before we enable this in production tomorrow.

After I checked all the logs and confirmed that the logic is triggered upon a pileup fraction update, as explained in the documentation, I decided to also double-check the updated pileupconf.json inside the tarball.
Here is what I found. The pileupconf.json contents outside the sandbox:

cmst1@vocms0193:validatePartialPU $ cat /data/srv/wmagent/v2.3.2rc4/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/WMSandbox/GenSimFull/cmsRun1/pileupconf.json |json_reformat 
{
    "mc": {
        "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM#b9304f4c-5efd-49e5-bb85-c1f15eb4a1ad": {
            "FileList": [
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/1533e2c7-a094-4801-bfe2-738be7b61666.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/a9ddec9d-9b82-4b14-8f0b-332b444e0142.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/0cddbe68-a3fd-4dfa-9c36-6ad463f5798a.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/ae114559-3d74-44d1-a830-65d62ceb22ce.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/aa1ed5c7-f1b7-431a-9eab-29633fc5dcce.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/e316bd35-b3ff-49d1-91e1-7c542ae0bb27.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/4b810820-95b7-452a-9b15-b00072fca766.root",
                "/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/2679854b-65b7-4f1d-b20f-804878aef1ab.root"
            ],
            "NumberOfEvents": 90000,
            "PhEDExNodeNames": [

            ]
        }
    }
}

But what I can see in the sandbox is this:

cp -rav /data/srv/wmagent/current/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2  /data/srv/wmagent/validatePartialPU/sandbox/

cd /data/srv/wmagent/validatePartialPU/sandbox/
tar -jxvf tivanov_SC_MultiPU_Feb2024_Val_PartPU_v5_240415_132937_6883-Sandbox.tar.bz2 

cmst1@vocms0193:sandbox $  cat ./WMSandbox/GenSimFull/cmsRun1/pileupconf.json |json_reformat 
{
    "mc": {

    }
}
cmst1@vocms0193:sandbox $ cat ./WMSandbox/GenSimFull/cmsRun2/pileupconf.json |json_reformat 
{
    "mc": {

    }
}

But the workflow completed successfully. Could that be attributed to the fact that I edited the pileup fraction while the jobs were already running in condor?
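The same check can be done without copying and unpacking the sandbox, by reading pileupconf.json directly from the tarball. This is a minimal sketch using Python's standard `tarfile` module; the helper names are hypothetical, and the member path is assumed to be stored as `WMSandbox/GenSimFull/cmsRun1/pileupconf.json` (it has to match the name used inside the archive, which may carry a leading `./`).

```python
import json
import tarfile


def pileup_conf_from_sandbox(tarball_path, member):
    """Load one pileupconf.json straight from a (bz2-compressed) sandbox tarball."""
    # "r:*" lets tarfile detect the compression, so .tar.bz2 works as-is
    with tarfile.open(tarball_path, "r:*") as tar:
        fobj = tar.extractfile(member)  # raises KeyError if the member is missing
        if fobj is None:
            raise ValueError(f"{member} is not a regular file in {tarball_path}")
        return json.load(fobj)


def empty_pileup_types(conf):
    """Return the pileup types (e.g. 'mc') that carry no block entries."""
    return [putype for putype, blocks in conf.items() if not blocks]


# Usage (paths are placeholders, adjust to the agent under test):
#   conf = pileup_conf_from_sandbox("<workflow>-Sandbox.tar.bz2",
#                                   "WMSandbox/GenSimFull/cmsRun1/pileupconf.json")
#   if empty_pileup_types(conf):
#       print("pileup configuration inside the sandbox is empty:", conf)
```

For the sandbox content shown above, `empty_pileup_types` would return `['mc']`, which flags the problem directly.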

@vkuznet
Contributor

vkuznet commented Apr 16, 2024

@todor-ivanov, there are several conditions for modifying the content of the pileupconf.json file. Since you didn't provide further details (node, logs, etc.), it is hard to say whether the observed behavior is correct or not. To be on the safe side, I provided unit tests that check the content of pileupconf.json, and all of them passed.

I'll leave the final call to @amaltaro; maybe we can rerun the test with a different workflow. In that case I suggest providing details of the node, logs, etc. so that we can look at them.

@todor-ivanov
Contributor

todor-ivanov commented Apr 24, 2024

And I was not wrong... Even before my second test was fully initiated, I did see the empty pileupconf.json file:

cmst1@vocms0193:current $ cat /data/srv/wmagent/validatePartialPU/2.3.2rc8/after/WorkQueuManager/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083/WMSandboxUntar/WMSandbox/GenSimFull/cmsRun1/pileupconf.json 
{"mc": {}}

And I think I know where the error comes from. I am now running this block of code in standalone mode at vocms0193:

for pileupDoc in msPileupList:
    # check if active pileup workflow is found in MSPileup one
    pileupName = pileupDoc['pileupName']
    if pileupName == jsonPUName:
        # then we need to check whether there are any changes or not
        jsonBlockLoc = blockLocations(puJsonContent)
        # construct new data-structure:
        # - dict of blocks and rses mapping, it will be used by checkChanges
        msPUBlockLoc = {}
        for blk in pileupDoc["blocks"]:
            msPUBlockLoc[blk] = pileupDoc["rses"]
        # are the block locations different between the JSON and MSPileup?
        # jsonBlockLoc and msPUBlockLoc have identical data-structures
        # {block1: [rses], block2: [rses], ...}
        if checkChanges(jsonBlockLoc, msPUBlockLoc):
            self.logger.info("Found differences between JSON and MSPileup content.")
            puNewJsonContent = updateBlockInfo(puJsonContent, msPUBlockLoc, self.dbsUrl, self.logger)
            # we should update a tarball only once for each pileup name,
            # therefore we add new entry to jdict with our pilupe conf file
            if puNewJsonContent:
                # we update json file if we get new pileup content
                jdict[puConfJson] = puNewJsonContent
                self.logger.info("Mark %s to be updated in tarball %s with a fresh pileup content",
                                 puConfJson, tarFile)
            else:
                self.logger.warning("updateBlockInfo did not return any results for %s, will skip update of pileup json content", pileupName)
        else:
            msg = "### There are no differences between JSON and MSPileup content "
            msg += f"for pileup name {pileupName}. Not updating anything!"
            self.logger.info(msg)

And here is the sequence we enter:

>>> pprint(pileupDoc) 
{'activatedOn': 1713947015,
 'active': True,
 'campaigns': ['RunIISummer20UL18DIGIPremix'],
 'containerFraction': 1.0,
 'currentRSEs': [],
 'customName': '',
 'deactivatedOn': 1681995421,
 'expectedRSEs': ['T1_US_FNAL_Disk', 'T2_CH_CERN'],
 'fullReplicas': 1,
 'insertTime': 1681925466,
 'lastUpdateTime': 1713947015,
 'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX',
 'pileupSize': 0,
 'pileupType': 'premix',
 'replicationGrouping': 'ALL',
 'ruleIds': [],
 'transition': [{'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore '
                       'Service Account',
                 'containerFraction': 1.0,
                 'customDID': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX',
                 'updateTime': 1709576366}]}
>>> for pileupDoc in msPileupList: 
...     pileupName = pileupDoc['pileupName']
...     if pileupName == jsonPUName:
...             print(pileupName)
...             break
... 
/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM
>>> pprint(pileupDoc)
{'activatedOn': 1713947014,
 'active': True,
 'campaigns': ['Apr2023_Val'],
 'containerFraction': 0.5,
 'currentRSEs': ['T2_CH_CERN'],
 'customName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
 'deactivatedOn': 1680873642,
 'expectedRSEs': ['T2_CH_CERN'],
 'fullReplicas': 1,
 'insertTime': 1680873642,
 'lastUpdateTime': 1713947014,
 'pileupName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 'pileupSize': 19882212302,
 'pileupType': 'classic',
 'replicationGrouping': 'ALL',
 'ruleIds': ['9c561d6f0e5b459483d89702dec17ade'],
 'transition': [{'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore '
                       'Service Account',
                 'containerFraction': 1.0,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
                 'updateTime': 1709576366},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V1',
                 'updateTime': 1713188213},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V2',
                 'updateTime': 1713190262},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.7,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V3',
                 'updateTime': 1713191815},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
                 'updateTime': 1713795758}]}
>>> jsonBlockLoc = blockLocations(puJsonContent)
>>> msPUBlockLoc = {}
>>> for blk in pileupDoc["blocks"]:
...     msPUBlockLoc[blk] = pileupDoc["rses"]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'blocks'
>>> pprint(msPUBlockLoc)
{}
>>> puNewJsonContent = updateBlockInfo(puJsonContent, msPUBlockLoc, config.WorkflowUpdater.dbsUrl, logger)
>>> pprint(puNewJsonContent)
{'mc': {}}
>>> 
>>> if puNewJsonContent:
...     print(errMsg)
... 
We Will always have a zero-content Json, but nevertheless we will always update the pileupconfig.json file with: '{mc: {}}'!!!

The list of pileup documents, in turn, is obtained in the following way:

>>> from WMCore.Services.MSPileup.MSPileupUtils import getPileupDocs
>>> msPileupList = getPileupDocs(config.WorkflowUpdater.msPileupUrl, queryDict={}, method='GET')
>>> pprint(msPileupList)
[{'activatedOn': 1713947014,
  'active': True,
  'campaigns': ['Apr2023_Val'],
  'containerFraction': 0.8,
  'currentRSEs': ['T1_US_FNAL_Disk', 'T2_CH_CERN'],
  'customName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2',
  'deactivatedOn': 1680873642,
  'expectedRSEs': ['T1_US_FNAL_Disk', 'T2_CH_CERN'],
  'fullReplicas': 1,
  'insertTime': 1680873642,
  'lastUpdateTime': 1713947014,
  'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM',
  'pileupSize': 1233099715874,
  'pileupType': 'classic',
  'replicationGrouping': 'ALL',
  'ruleIds': ['0e68e51c47f2462dae0e086641f9d162',
              'f795b21cbb9b4da8b482afa9a0626f8d'],
  'transition': [{'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore '
                        'Service Account',
                  'containerFraction': 1.0,
                  'customDID': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM',
                  'updateTime': 1709576366},
                 {'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y '
                        'Kuznetsov',
                  'containerFraction': 0.5,
                  'customDID': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V1',
                  'updateTime': 1711458638},
                 {'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y '
                        'Kuznetsov',
                  'containerFraction': 0.8,
                  'customDID': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2',
                  'updateTime': 1711462240}]},
 {'activatedOn': 1713947014,
  'active': True,
  'campaigns': ['RunIISummer20UL16DIGIPremix',
                'RunIISummer20UL16DIGIPremixAPV'],
  'containerFraction': 1.0,
  'currentRSEs': ['T1_US_FNAL_Disk', 'T2_CH_CERN'],
  'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2',
  'deactivatedOn': 1681995420,
  'expectedRSEs': ['T1_US_FNAL_Disk', 'T2_CH_CERN'],
  'fullReplicas': 1,
  'insertTime': 1680873642,
  'lastUpdateTime': 1713947014,
  'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  'pileupSize': 5129627746347,
  'pileupType': 'premix',
  'replicationGrouping': 'ALL',
  'ruleIds': ['245b4ad457e04296b5310063fd36f84d',
              '409b6b3c7a50426586351b0007a65ef9'],
  'transition': [{'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore '
                        'Service Account',
                  'containerFraction': 1.0,
                  'customDID': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
                  'updateTime': 1709576366},
                 {'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y '
                        'Kuznetsov',
                  'containerFraction': 0.5,
                  'customDID': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V1',
                  'updateTime': 1711481138},
                 {'DN': '/DC=ch/DC=cern/OU=Organic '
                        'Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y '
                        'Kuznetsov',
                  'containerFraction': 1.0,
                  'customDID': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2',
                  'updateTime': 1711542358}]},
...
}

So, long story short: the blocks field is missing from all of the pileup records as they come from MSPileup (which we fetch and keep in msPileupList). Once we get to the point of building the msPUBlockLoc dictionary, we hit a KeyError, and hence we always end up with msPUBlockLoc = {}. This hits us later, when we try to fetch the updated block info, because updateBlockInfo then always returns an empty result. And this leads us to the second part of the misbehavior: no matter what we pass to updateBlockInfo, it returns the minimal dictionary {'mc': {}}, even if it gets no update from upstream services. Since a dictionary with one key is truthy, this breaks the symmetry of the if puNewJsonContent condition 😉 and forces it to always take the branch that updates the tar file with a zero-content document.
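
The truthiness problem described above is easy to reproduce in isolation. The sketch below uses a hypothetical helper, `hasPileupBlocks`, which is not part of WMCore; it only illustrates why `if puNewJsonContent` passes for `{'mc': {}}` and what a stricter check could look like.

```python
def hasPileupBlocks(puJsonContent):
    """Return True only if at least one pileup type carries block information."""
    return any(bool(blocks) for blocks in puJsonContent.values())


emptyUpdate = {'mc': {}}

# a dictionary with one key is truthy, even though its only value is empty
print(bool(emptyUpdate))             # True
print(hasPileupBlocks(emptyUpdate))  # False
```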

@vkuznet
Contributor

vkuznet commented Apr 24, 2024

Todor, first and foremost, thanks for looking into this and for the comprehensive output. I'm trying to follow your logic but still fail to. If we look at the code from https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py (I'll only use line numbers):

But the function getPileupDocs https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L416 must construct a new structure with blocks in each element of the list. Therefore, I don't know if your explanation is correct.

If we get {'mc': {}}, then it seems it should come from the updateBlockInfo function, which defines an empty dict at line https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L153, and that dict may never be populated within the two for loops. Therefore, it seems to me we need to add a check at line https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L173 like this:

if newDict:
   returnDict[puType] = newDict

to prevent creating {'mc': {}}. What do you think?
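
To illustrate the effect of this guard, here is a simplified stand-in for the loop structure. It is not the real updateBlockInfo: the function name `filterBlocks`, its signature, and the rule of keeping only the blocks known to MSPileup are assumptions made for the example, and no DBS lookup is performed.

```python
def filterBlocks(puJsonContent, msPUBlockLoc, guard=True):
    """
    Keep only the blocks present in msPUBlockLoc and refresh their locations.
    With guard=True an empty per-type dictionary is not added to the result.
    """
    returnDict = {}
    for puType, blockDict in puJsonContent.items():
        newDict = {}
        for block, record in blockDict.items():
            if block in msPUBlockLoc:
                # copy the record so the input document is left untouched
                newRecord = dict(record)
                newRecord['PhEDExNodeNames'] = msPUBlockLoc[block]
                newDict[block] = newRecord
        if newDict or not guard:
            returnDict[puType] = newDict
    return returnDict


content = {'mc': {'/a/b/c#1': {'FileList': [], 'NumberOfEvents': 10, 'PhEDExNodeNames': []}}}
print(filterBlocks(content, {}, guard=False))  # {'mc': {}}
print(filterBlocks(content, {}, guard=True))   # {}
```

With the guard in place, an empty block mapping yields an empty (falsy) dictionary, so a caller testing `if puNewJsonContent` would take the "skip update" branch.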

@vkuznet
Contributor

vkuznet commented Apr 24, 2024

My suggested fix is here: #11974. I would appreciate it if both of you, @amaltaro and @todor-ivanov, would have a look at it, in particular at Todor's comment #11736 (comment) and my reply #11736 (comment).

@todor-ivanov
Contributor

Hi @vkuznet
This :

if newDict:
    returnDict[puType] = newDict

was my initial idea for fixing the second part of the issue as well. But then I realized that this way we would change the structure of the returned object. We pass to updateBlockInfo an object of the form:

{'mc': {'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM#b9304f4c-5efd-49e5-bb85-c1f15eb4a1ad': {'FileList': ['/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/1533e2c7-a094-4801-bfe2-738be7b61666.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/a9ddec9d-9b82-4b14-8f0b-332b444e0142.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/0cddbe68-a3fd-4dfa-9c36-6ad463f5798a.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/ae114559-3d74-44d1-a830-65d62ceb22ce.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/aa1ed5c7-f1b7-431a-9eab-29633fc5dcce.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/e316bd35-b3ff-49d1-91e1-7c542ae0bb27.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/4b810820-95b7-452a-9b15-b00072fca766.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/2679854b-65b7-4f1d-b20f-804878aef1ab.root'],
                                                                                                                                         'NumberOfEvents': 90000,
                                                                                                                                         'PhEDExNodeNames': []}}}

And we would return an empty {}, thus destroying the structure of the initial document. To me, returning a dictionary of the form {'mc': {}} makes a lot of sense. But if we do not care about preserving the initial structure of the document, this would be a perfect fix for the zero-content document.

On the other hand, I am interested to know why we try to access a field in the pileup document which does not exist:

{'activatedOn': 1713947014,
 'active': True,
 'campaigns': ['Apr2023_Val'],
 'containerFraction': 0.5,
 'currentRSEs': ['T2_CH_CERN'],
 'customName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
 'deactivatedOn': 1680873642,
 'expectedRSEs': ['T2_CH_CERN'],
 'fullReplicas': 1,
 'insertTime': 1680873642,
 'lastUpdateTime': 1713947014,
 'pileupName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 'pileupSize': 19882212302,
 'pileupType': 'classic',
 'replicationGrouping': 'ALL',
 'ruleIds': ['9c561d6f0e5b459483d89702dec17ade'],
 'transition': [{'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore '
                       'Service Account',
                 'containerFraction': 1.0,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
                 'updateTime': 1709576366},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V1',
                 'updateTime': 1713188213},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V2',
                 'updateTime': 1713190262},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.7,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V3',
                 'updateTime': 1713191815},
                {'DN': '/DC=ch/DC=cern/OU=Organic '
                       'Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1',
                 'containerFraction': 0.5,
                 'customDID': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
                 'updateTime': 1713795758}]}
>>> jsonBlockLoc = blockLocations(puJsonContent)
>>> msPUBlockLoc = {}
>>> for blk in pileupDoc["blocks"]:
...     msPUBlockLoc[blk] = pileupDoc["rses"]
... 
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'blocks'
>>> pprint(msPUBlockLoc)
{}

And I checked: in the current implementation this key is missing for all pileups known to the service.

@vkuznet
Contributor

vkuznet commented Apr 24, 2024

Todor, the MSPileup document does not have blocks. I already pointed out that in code we don't use this document per-se but rather create a different structure from it which contains the blocks attribute. Please have a look at how getPileupDocs function is implemented and what it returns, see https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L416-L445. It gets MSPileup document, but then it creates a new data-structure with blocks.
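
The difference between the two document shapes can be made explicit with a small check before building the block-to-RSE mapping. This is a sketch only: `buildBlockLocations` is a hypothetical helper, and it assumes, as in the loop quoted earlier in this thread, that an enriched document carries `blocks` and `rses` keys while a raw MSPileup record does not.

```python
def buildBlockLocations(pileupDoc):
    """Build {block: [rses]} from a document already enriched with 'blocks' and 'rses'."""
    missing = [key for key in ("blocks", "rses") if key not in pileupDoc]
    if missing:
        name = pileupDoc.get("pileupName", "<unknown>")
        raise ValueError(f"pileup document {name} lacks {missing}; is it a raw MSPileup record?")
    return {blk: list(pileupDoc["rses"]) for blk in pileupDoc["blocks"]}


# made-up documents, for illustration only
rawDoc = {"pileupName": "/a/b/PREMIX", "currentRSEs": ["T2_CH_CERN"]}
enrichedDoc = dict(rawDoc, blocks=["/a/b/PREMIX#1"], rses=["T2_CH_CERN"])
print(buildBlockLocations(enrichedDoc))  # {'/a/b/PREMIX#1': ['T2_CH_CERN']}
```

Failing loudly on a raw record, instead of silently continuing with an empty mapping, would make this kind of mix-up visible right away in a standalone session.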

@todor-ivanov
Contributor

Thanks @vkuznet and @amaltaro, actually you and Alan both pointed to one and the same place:

But the function getPileupDocs https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L416 must construct a new structure with blocks in each element of the list. Therefore, I don't know if your explanation is correct.

So the blocks in the PU documents are created here:

and later populated here:

with CodeTimer("Rucio block resolution", logger=logging):
    self.findRucioBlocks(uniqueActivePU, msPileupList)

This was the part I was missing: I was using the upper-level getPileupDocs() function instead of the method defined within the WorkflowUpdaterPoller class. This explains why the blocks field was missing in my tests. In this case your fix should work perfectly, Valya. I'll patch and test later tonight.

@todor-ivanov
Contributor

vocms0193 is now patched and the proposed fix at #11974 is under test.

Here is the test workflow: SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943

@todor-ivanov
Contributor

Hi @vkuznet @amaltaro, here are the results:
This time I chose to edit the bigger PU, which consists of more than 1 block: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX

I changed its fraction to 0.7 with the following command:

cmst1@vocms0193:current $ scurl -X PUT -H "Content-type: application/json" -d '{ "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX", "containerFraction": 0.7 }'   https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup
{"result": [
]}
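
For scripted tests, the same update could be prepared from Python. The sketch below only builds the request object and does not send it; the helper name `buildFractionUpdate` is hypothetical, and the certificate-based authentication that scurl takes care of is deliberately left out.

```python
import json
from urllib.request import Request


def buildFractionUpdate(baseUrl, pileupName, containerFraction):
    """Prepare (but do not send) a PUT request updating containerFraction."""
    payload = {"pileupName": pileupName, "containerFraction": containerFraction}
    return Request(baseUrl,
                   data=json.dumps(payload).encode("utf-8"),
                   headers={"Content-type": "application/json"},
                   method="PUT")


req = buildFractionUpdate(
    "https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup",
    "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
    0.7)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

Sending the request would additionally require an SSL context loaded with the client certificate and key, which depends on the local setup.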

And here is the resultant record at MSPileup:

cmst1@vocms0193:current $ scurl https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup?pileupName=/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX
{"result": [
 {
  "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
  "pileupType": "premix",
  "insertTime": 1680873642,
  "lastUpdateTime": 1714042130,
  "expectedRSEs": [
    "T1_US_FNAL_Disk",
    "T2_CH_CERN"
  ],
  "currentRSEs": [
    "T1_US_FNAL_Disk",
    "T2_CH_CERN"
  ],
  "fullReplicas": 1,
  "campaigns": [
    "RunIISummer20UL16DIGIPremix",
    "RunIISummer20UL16DIGIPremixAPV"
  ],
  "containerFraction": 0.7,
  "replicationGrouping": "ALL",
  "activatedOn": 1714042130,
  "deactivatedOn": 1681995420,
  "active": true,
  "pileupSize": 5129627746347,
  "ruleIds": [
    "245b4ad457e04296b5310063fd36f84d",
    "409b6b3c7a50426586351b0007a65ef9"
  ],
  "customName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2",
  "transition": [
    {
      "containerFraction": 1.0,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
      "updateTime": 1709576366,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=msunmer/CN=852819/CN=Robot: WmCore Service Account"
    },
    {
      "containerFraction": 0.5,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V1",
      "updateTime": 1711481138,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov"
    },
    {
      "containerFraction": 1.0,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2",
      "updateTime": 1711542358,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=valya/CN=443502/CN=Valentin Y Kuznetsov"
    },
    {
      "containerFraction": 1.0,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3",
      "updateTime": 1714042130,
      "DN": "/DC=ch/DC=cern/OU=Organic Units/OU=Users/CN=cmst1/CN=718748/CN=Robot: cms t1"
    }
  ]
}]}

Then I restarted the WorkflowUpdater component, which noticed the change and was triggered:

2024-04-25 12:49:25,456:140047524083456:INFO:WorkflowUpdaterPoller:Running Workflow updater injector poller algorithm...
2024-04-25 12:49:25,489:140047524083456:INFO:WorkflowUpdaterPoller:Workflow: tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943 requires pileup dataset(s): dict_values([{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX', '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}])
2024-04-25 12:49:25,489:140047524083456:INFO:WorkflowUpdaterPoller:There are 1 pileup workflows out of 1 active workflows.
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:A total of 17 pileup documents have been retrieved.
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM, custom name: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, custom name: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM-V2, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Winter23GS-126X_mcRun3_2023_forPU65_v1-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/Phase2Fall22GS-HCalDetIDFix_125X_mcRun4_realistic_v2-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,559:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_1_2_patch3-110X_mcRun4_realistic_v3_2026D49noPU_BSzpz35-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Summer22GS-124X_mcRun3_2022_realistic_v10-v1/GEN-SIM, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Winter22_122X_mcRun3_2021_realistic_v9-v1/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIISpring22GS-123X_mcRun4_realistic_v11-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-PUAutumn18_102X_upgrade2018_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-MCv2_correctPU_94X_mc2017_realistic_v9-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUMoriond17_94X_mc2017_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL17_106X_mc2017_realistic_v6-v3/PREMIX, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-25 12:49:25,560:140047524083456:INFO:Timers:Rucio block resolution took 0.0 seconds to complete
2024-04-25 12:49:25,560:140047524083456:INFO:WorkflowUpdaterPoller:Processing workflow {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943', 'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl', 'pileup': dict_values([{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX', '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}])}, sandbox: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943
2024-04-25 12:49:25,696:140047524083456:INFO:WorkflowUpdaterPoller:Found pileup name /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX under path: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/GenSimFull/cmsRun2/pileupconf.json
2024-04-25 12:49:25,697:140047524083456:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-25 12:49:25,697:140047524083456:WARNING:WorkflowUpdaterPoller:updateBlockInfo did not return any results for /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, will skip update of pileup json content
2024-04-25 12:49:25,705:140047524083456:INFO:WorkflowUpdaterPoller:Found pileup name /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM under path: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/GenSimFull/cmsRun1/pileupconf.json
2024-04-25 12:49:25,706:140047524083456:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-25 12:49:25,706:140047524083456:WARNING:WorkflowUpdaterPoller:updateBlockInfo did not return any results for /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, will skip update of pileup json content
2024-04-25 12:49:25,706:140047524083456:INFO:WorkflowUpdaterPoller:Done updating spec: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl

But with the current change it did not update the sandbox tarball, because yet again it did not find any block information. Here is the log message from the blocked step:

2024-04-25 12:49:25,696:140047524083456:INFO:WorkflowUpdaterPoller:Found pileup name /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX under path: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/GenSimFull/cmsRun2/pileupconf.json
2024-04-25 12:49:25,697:140047524083456:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-25 12:49:25,697:140047524083456:WARNING:WorkflowUpdaterPoller:updateBlockInfo did not return any results for /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, will skip update of pileup json content

I am now comparing the two tarballs (before and after the edit). I can see that we did indeed protect ourselves from ending up with empty pileupconf.json files inside the sandbox, but I am not sure why updateBlockInfo does not find any block-related information. I am running everything interactively, in standalone mode, yet again.
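One way to make the before/after comparison repeatable is to read every pileupconf.json out of both tarballs and diff the block names. This is only a minimal sketch using the Python standard library: the helper names `readPileupConfs` and `diffBlocks` are hypothetical, and it assumes the JSON layout visible in the printouts of this thread (pileup type at the top level, block names below it).

```python
import json
import tarfile


def readPileupConfs(tarPath):
    """Return {member name: parsed JSON} for every pileupconf.json in a tarball."""
    confs = {}
    # "r:*" lets tarfile detect the compression transparently
    with tarfile.open(tarPath, "r:*") as tar:
        for member in tar.getmembers():
            if member.isfile() and member.name.endswith("pileupconf.json"):
                with tar.extractfile(member) as fobj:
                    confs[member.name] = json.load(fobj)
    return confs


def diffBlocks(before, after):
    """For each JSON file, report the block names that were added or removed."""
    report = {}
    for name in sorted(set(before) | set(after)):
        # Collect block names across all pileup types (e.g. 'mc')
        oldBlocks = {blk for puType in before.get(name, {}).values() for blk in puType}
        newBlocks = {blk for puType in after.get(name, {}).values() for blk in puType}
        report[name] = {"added": sorted(newBlocks - oldBlocks),
                        "removed": sorted(oldBlocks - newBlocks)}
    return report
```

Calling `diffBlocks(readPileupConfs(beforeTar), readPileupConfs(afterTar))` with the two tarball paths would then show at a glance whether any block list changed.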

@todor-ivanov
Contributor

Hi @vkuznet @amaltaro,
I am observing some strange behavior. When I run the logic in adjustJSONSpec line by line in a standalone instance of the component, everything works correctly:

In [74]: wfup.dbsUrl
Out[74]: 'https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader'

In [75]: checkChanges(jsonBlockLoc, msPUBlockLoc)
Out[75]: True

In [76]: puNewJsonContent = updateBlockInfo(puJsonContent, msPUBlockLoc, wfup.dbsUrl, wfup.logger)

In [77]: puNewJsonContent
Out[77]: 
{'mc': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04': {'FileList': [
'/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/270020/B38CAA37-89E0-374D-B25E-64A8A4642502.root',    
'/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/270020/73504EE7-61BC-964C-901F-6CFEA76892D6.root',
...
   'NumberOfEvents': 444800,
   'PhEDExNodeNames': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
  '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b': {'FileList': ['/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/250000/9928EDDE-B931-2240-8D23-BE8FC8A5AAB0.root'],
   'NumberOfEvents': 1600,
   'PhEDExNodeNames': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}}}


In [82]: msPUBlockLoc
Out[82]: 
{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb': ['T1_US_FNAL_Disk',
  'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b': ['T1_US_FNAL_Disk',
  'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#004d5f64-f280-4bad-b88b-4b6e52ea040d': ['T1_US_FNAL_Disk',
  'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04': ['T1_US_FNAL_Disk',
  'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#01c4b6b1-4fb9-4400-8627-7d044e069ff3': ['T1_US_FNAL_Disk',
  'T2_CH_CERN']}

But if I change the WorkflowUpdaterPoller:updateBlockInfo function just a little to log the parameters it is called with, I can clearly see that msPUBlockLoc is an empty dictionary during the call:

2024-04-26 11:01:38,406:140072198670080:INFO:WorkflowUpdaterPoller:updateBlockInfo called with: 
dbsUrl: https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader, 
msPUBlockLoc: {}, 
jdoc: {'mc': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb': {'FileList': ['/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/130026/12EEB66B-2B7C-3146-8927-AEB3273CC21D.root',
                                                                                                                                                    '/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/130009/522C82F9-C527-6A4A-981C-B6E815EE132C.root',
...

Same for the single block PU as well:

2024-04-26 11:01:38,856:140072198670080:INFO:WorkflowUpdaterPoller:updateBlockInfo called with: 
dbsUrl: https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader, 
msPUBlockLoc: {}, 
jdoc: {'mc': {'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM#b9304f4c-5efd-49e5-bb85-c1f15eb4a1ad': {'FileList': ['/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/1533e2c7-a094-4801-bfe2-738be7b61666.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/a9ddec9d-9b82-4b14-8f0b-332b444e0142.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/0cddbe68-a3fd-4dfa-9c36-6ad463f5798a.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/ae114559-3d74-44d1-a830-65d62ceb22ce.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/aa1ed5c7-f1b7-431a-9eab-29633fc5dcce.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/e316bd35-b3ff-49d1-91e1-7c542ae0bb27.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/4b810820-95b7-452a-9b15-b00072fca766.root',
                                                                                                                                                      '/store/relval/CMSSW_11_2_0_pre8/RelValMinBias_14TeV/GEN-SIM/112X_mcRun3_2024_realistic_v10_forTrk-v1/00000/2679854b-65b7-4f1d-b20f-804878aef1ab.root'],
                                                                                                                                         'NumberOfEvents': 90000,
                                                                                                                                         'PhEDExNodeNames': []}}} 

@todor-ivanov
Contributor

And indeed, here is the shorter version of these messages printed from adjustJSONSpec:

2024-04-26 11:43:59,908:140690461001472:INFO:WorkflowUpdaterPoller:pileupDoc: {'blocks': [],
 'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3',
 'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}
2024-04-26 11:43:59,908:140690461001472:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-26 11:43:59,908:140690461001472:INFO:WorkflowUpdaterPoller:updateBlockInfo called with: 
dbsUrl: https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader, 
msPUBlockLoc: {} 

msPUBlockLoc is empty because, for some reason, the block list in the pileupDoc is empty.
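To illustrate why an empty block list propagates all the way into the call, here is a minimal sketch. It assumes (this is not checked against the component code) that the block-to-location map is built by pairing each entry of `blocks` with the document's `rses`, which is consistent with the shape of msPUBlockLoc printed earlier. The helper name `buildBlockLoc` is hypothetical.

```python
def buildBlockLoc(pileupDoc):
    """Map every block of a pileup document to a copy of the document's RSE list."""
    return {block: list(pileupDoc["rses"]) for block in pileupDoc["blocks"]}


pileupDoc = {
    "blocks": [],
    "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
    "rses": ["T1_US_FNAL_Disk", "T2_CH_CERN"],
}

msPUBlockLoc = buildBlockLoc(pileupDoc)
print(msPUBlockLoc)  # {} : with no blocks there is nothing to map, whatever the RSEs are
```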

@todor-ivanov
Contributor

Moving ahead painfully slowly.
Here is the result after the agent tries to resolve all block-level information from Rucio:

2024-04-26 12:20:29,644:140043158759168:INFO:Timers:Rucio block resolution took 0.0 seconds to complete
2024-04-26 12:20:29,646:140043158759168:INFO:WorkflowUpdaterPoller:adjustJSONSpec called with msPileupList: [{'blocks': [],
  'customName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2',
  'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM',
  'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
 {'blocks': [],
  'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
 {'blocks': [],
  'customName': '',
  'pileupName': '/RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM',
  'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
 {'blocks': [],
  'customName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
  'pileupName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
  'rses': ['T2_CH_CERN']},
 {'blocks': [],
  'customName': '/RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM-V2',
  'pileupName': '/RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM',
  'rses': ['T2_CH_CERN']},
 {'blocks': [],
  'customName': '',
  'pileupName': '/MinBias_TuneCP5_13p6TeV-pythia8/Run3Winter23GS-126X_mcRun3_2023_forPU65_v1-v1/GEN-SIM',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/Phase2Fall22GS-HCalDetIDFix_125X_mcRun4_realistic_v2-v1/GEN-SIM',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/RelValMinBias_14TeV/CMSSW_11_1_2_patch3-110X_mcRun4_realistic_v3_2026D49noPU_BSzpz35-v1/GEN-SIM',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/MinBias_TuneCP5_13p6TeV-pythia8/Run3Summer22GS-124X_mcRun3_2022_realistic_v10-v1/GEN-SIM',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/Run3Summer21PrePremix-Winter22_122X_mcRun3_2021_realistic_v9-v1/PREMIX',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIISpring22GS-123X_mcRun4_realistic_v11-v1/GEN-SIM',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer17PrePremix-PUAutumn18_102X_upgrade2018_realistic_v15-v1/GEN-SIM-DIGI-RAW',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer17PrePremix-MCv2_correctPU_94X_mc2017_realistic_v9-v1/GEN-SIM-DIGI-RAW',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUMoriond17_94X_mc2017_realistic_v15-v1/GEN-SIM-DIGI-RAW',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL17_106X_mc2017_realistic_v6-v3/PREMIX',
  'rses': []},
 {'blocks': [],
  'customName': '',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX',
  'rses': []}]

As can clearly be seen, it takes 0.0 seconds and the blocks for all of them are empty, yet no error whatsoever has been thrown. So there is no indication of what has happened in the background.
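A cheap way to make this kind of silent no-op visible would be to flag pileup documents that have RSEs assigned but ended up with no blocks. The sketch below is only illustrative: `findUnresolved` is a hypothetical helper, and an entry it reports is not necessarily an error, since a pileup that no active workflow needs may legitimately have no resolved blocks.

```python
def findUnresolved(msPileupList):
    """Return the names of pileups that have RSEs assigned but no resolved blocks."""
    return [doc["pileupName"] for doc in msPileupList
            if doc.get("rses") and not doc.get("blocks")]


# Two documents in the same shape as the printout above
msPileupList = [
    {"blocks": [],
     "pileupName": "/RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM",
     "rses": ["T2_CH_CERN"]},
    {"blocks": [],
     "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL17_106X_mc2017_realistic_v6-v3/PREMIX",
     "rses": []},
]

for name in findUnresolved(msPileupList):
    print(f"WARNING: no blocks resolved for {name}")
```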

@todor-ivanov
Contributor

I think I know what the problem is.
We are hitting a data structure mismatch here:

# no active workflow requires this pileup

Upon adding the following printout at this very line:

self.logger.info(f"Skipping pileupItem: {pformat(pileupItem)}")

Here is what I see in the logs:

2024-04-26 12:36:26,761:140667067643648:INFO:WorkflowUpdaterPoller:Skipping pileupItem: {'blocks': [],
 'customName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2',
 'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM',
 'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}
2024-04-26 12:36:26,761:140667067643648:INFO:WorkflowUpdaterPoller:Skipping pileupItem: {'blocks': [],
 'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3',
 'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}
2024-04-26 12:36:26,761:140667067643648:INFO:WorkflowUpdaterPoller:Skipping pileupItem: {'blocks': [],
 'customName': '',
 'pileupName': '/RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM',
 'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}
2024-04-26 12:36:26,761:140667067643648:INFO:WorkflowUpdaterPoller:Skipping pileupItem: {'blocks': [],
 'customName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4',
 'pileupName': '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 'rses': ['T2_CH_CERN']}
...

while the whole uniqueActivePU list, as produced in the main algorithm here:

uniqueActivePU = flattenList([item['pileup'] for item in puWflows])
has the following form:

2024-04-26 12:36:26,717:140667067643648:INFO:WorkflowUpdaterPoller:algorithm produced 
uniqueActivePU: [{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}]
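The effect of this mismatch can be reproduced in isolation. The sketch below assumes that the skip decision is a plain membership test of the pileup name against uniqueActivePU (inferred from the "no active workflow requires this pileup" comment, not copied from the component); `selectActive` is a hypothetical helper.

```python
def selectActive(msPileupList, uniqueActivePU):
    """Split pileup documents into (kept, skipped) by name membership."""
    kept, skipped = [], []
    for pileupItem in msPileupList:
        if pileupItem["pileupName"] in uniqueActivePU:
            kept.append(pileupItem)
        else:
            # Assumed equivalent of "no active workflow requires this pileup"
            skipped.append(pileupItem)
    return kept, skipped


premix = "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX"
gensim = "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM"
msPileupList = [
    {"pileupName": premix, "blocks": [], "rses": ["T1_US_FNAL_Disk", "T2_CH_CERN"]},
    {"pileupName": gensim, "blocks": [], "rses": ["T2_CH_CERN"]},
]

# A list of sets, as in the log above: a string never equals a set
listOfSets = [{premix, gensim}, {premix, gensim}]
kept, skipped = selectActive(msPileupList, listOfSets)
print(len(kept), len(skipped))  # 0 2

# A flat list of names: both pileups are recognised as active
flatNames = [premix, gensim]
kept, skipped = selectActive(msPileupList, flatNames)
print(len(kept), len(skipped))  # 2 0
```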

@todor-ivanov
Contributor

ok, getting closer here:

Here is how uniqueActivePU is generated out of puWflows. This is what the parent object looks like at the beginning:

puWflows: [{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943',
  'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}]),
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602',
  'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}]),
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072',
  'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}]),
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083',
  'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}]),
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085',
  'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}]),
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085/WMSandbox/WMWorkload.pkl'}]

Applying line:

uniqueActivePU = flattenList([item['pileup'] for item in puWflows])

Here is what the child object looks like:

2024-04-26 12:57:30,283:139639548790528:INFO:WorkflowUpdaterPoller:algorithm produced 
uniqueActivePU: [{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}]

So clearly the result is neither unique nor a flat list. Cross-checking everything manually (up to now it was all just live printouts from the code running at the agent), we can see that a string value can never match inside a list whose elements are sets of strings:

In [91]: uniqueActivePU = flattenList([item['pileup'] for item in puWflows])

In [92]: '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX' in uniqueActivePU
Out[92]: False

In [93]: {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX', '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}
    ...: in uniqueActivePU
Out[93]: True

Only a whole set object from inside the list matches. So we are actually never updating any block-level information from Rucio at all. I believe all of this is due to the heavily nested structure coming from WMWorkloadHelper:

def listPileupDatasets(self, initialTask=None):
    """
    _listPileUpDataset_
    Returns a dictionary with all the required pile-up datasets
    in this workload and their associated dbs url as the key
    """
    pileupDatasets = {}
    if initialTask:
        taskIterator = initialTask.childTaskIterator()
    else:
        taskIterator = self.taskIterator()
    for task in taskIterator:
        for stepName in task.listAllStepNames():
            stepHelper = task.getStepHelper(stepName)
            if stepHelper.stepType() == "CMSSW":
                pileupSection = stepHelper.getPileup()
                if pileupSection is None:
                    continue
                dbsUrl = stepHelper.data.dbsUrl
                if dbsUrl not in pileupDatasets:
                    pileupDatasets[dbsUrl] = set()
                for pileupType in pileupSection.listSections_():
                    datasets = getattr(getattr(stepHelper.data.pileup, pileupType), "dataset")
                    pileupDatasets[dbsUrl].update(datasets)
        pileupDatasets.update(self.listPileupDatasets(task))
    return pileupDatasets

which is called here from inside the WorkflowUpdaterPoller:
pileupSpecs = workloadHelper.listPileupDatasets()

@todor-ivanov
Contributor

todor-ivanov commented Apr 26, 2024

And indeed, the values from workloadHelper.listPileupDatasets() are returned as sets rather than as single-valued (string) records or lists:

pileupDatasets[dbsUrl] = set()

and checking it manually confirms it:

In [96]: wflow
Out[96]: 
{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943',
 'spec': '/data/WMAgent.venv3/validatePartialPU/2.3.2rc8/before/WorkQueuManager/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl',
 'pileup': dict_values([{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX', '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}])}

In [97]: workloadHelper = WMWorkloadHelper()

In [98]: workloadHelper.load(wflow['spec'])

In [99]: workloadHelper.listPileupDatasets()
Out[99]: 
{'https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}}

In [100]: type(workloadHelper.listPileupDatasets()['https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader'])
Out[100]: set

So we are flattening this list of sets incorrectly, and we are not getting a pure set, which should be the union of all the nested sets inside uniqueActivePU.
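The whole chain can be reproduced without the agent. In the sketch below, `flattenOneLevel` is a local stand-in that removes exactly one nesting level; that is an assumption about how flattenList behaves, chosen because it reproduces the outputs shown in this thread. The workflow names `wf1` and `wf2` are placeholders, and the union at the end is just one possible way to obtain the expected structure.

```python
def flattenOneLevel(listOfIterables):
    """Stand-in that removes exactly one nesting level (assumed behaviour)."""
    return [item for inner in listOfIterables for item in inner]


premix = "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX"
gensim = "/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM"
dbsUrl = "https://cmsweb-testbed.cern.ch/dbs/int/global/DBSReader"

# Same shape as the listPileupDatasets() output above: {dbsUrl: set of dataset names}
pileupSpecs = {dbsUrl: {premix, gensim}}

# Storing the view keeps an extra level: dict_values -> set -> str
puWflows = [{"name": "wf1", "pileup": pileupSpecs.values()},
            {"name": "wf2", "pileup": pileupSpecs.values()}]

nested = flattenOneLevel([item["pileup"] for item in puWflows])
print(premix in nested)  # False: the elements are sets, not strings

# Taking the union of every set inside every view gives unique, flat names
uniqueActivePU = set()
for item in puWflows:
    for datasets in item["pileup"]:
        uniqueActivePU |= datasets
print(premix in uniqueActivePU)  # True
print(len(uniqueActivePU))       # 2
```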

@amaltaro
Contributor Author

@todor-ivanov I think you are looking at the wrong place. This has all been resolved and the place where we have a unique list of pileup dataset names is in this line:
https://github.com/dmwm/WMCore/blob/master/src/python/WMComponent/WorkflowUpdater/WorkflowUpdaterPoller.py#L296

You are looking at premature lines, which indeed have a different data structure than the one expected by self.findRucioBlocks. But as explained above, it matters not.

@todor-ivanov
Contributor

Well I beg to disagree..... it matters and let me show you why.

It matters, because we are putting the values in the wflowSpec['pileup'] field not as a pure set, but rather as a view here:

wflowSpec['pileup'] = pileupSpecs.values()

Let me paste what I get out of this very line in two cases: first when the field is populated with the views, and then when the field is populated directly with the sets returned.

puWflows:
 
[{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943', 
'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl', 
'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}, 
{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602', 
'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602/WMSandbox/WMWorkload.pkl', 
'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}, 
{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072', 
'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072/WMSandbox/WMWorkload.pkl', 
'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}, 
{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083', 
'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083/WMSandbox/WMWorkload.pkl', 
'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}, 
{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085', 
'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085/WMSandbox/WMWorkload.pkl', 
'pileup': dict_values([{'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}])}]

In [90]: flattenList([item['pileup'] for item in puWflows])
Out[90]: 
[{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}]

And now, if I manually replace all of these views so that the fields are populated with plain sets:

In [109]: pprint(puWflows)
[{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085/WMSandbox/WMWorkload.pkl'}]

In [110]: uniqueActivePU2 = flattenList([item['pileup'] for item in puWflows])

In [112]: uniqueActivePU2
Out[112]: 
['/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM']

find the difference ;) :

In [111]: uniqueActivePU
Out[111]: 
[{'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
 {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}]

In [112]: uniqueActivePU2
Out[112]: 
['/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM']

I believe the second one is the expected one in the rest of the code. I am preparing a patch.

@todor-ivanov
Contributor

@amaltaro @vkuznet here is the solution to our problems.

All we need to do is to properly iterate through the whole view as returned by ``. If we do so, all our data structures fall into place. Here are the printouts after applying my patch:

2024-04-26 16:24:34,131:140193479931648:INFO:WorkflowUpdaterPoller:There are 5 pileup workflows out of 5 active workflows.
2024-04-26 16:24:34,131:140193479931648:INFO:WorkflowUpdaterPoller:algorithm found 
puWflows: [{'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v10_240425_082716_7943/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v11_240426_080312_7602/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v7_240419_220203_5072/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v8_240422_110607_8083/WMSandbox/WMWorkload.pkl'},
 {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085',
  'pileup': {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
             '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'},
  'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v9_240424_072739_9085/WMSandbox/WMWorkload.pkl'}]
2024-04-26 16:24:34,131:140193479931648:INFO:WorkflowUpdaterPoller:algorithm produced 
uniqueActivePU: ['/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX']

(BTW this uniqueActivePU could be made a set itself)
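A minimal sketch of that suggestion, assuming each workflow record carries its pileup names as a set under the 'pileup' key, as in the puWflows printout above. The function name uniqueActivePileups and the shortened dataset names are hypothetical and not taken from WMCore:

```python
def uniqueActivePileups(puWflows):
    """Collect the distinct pileup names required by all active workflows."""
    uniquePU = set()
    for wflow in puWflows:
        # each record holds a set of pileup dataset names; a set union
        # drops the duplicates that a plain list would accumulate
        uniquePU.update(wflow.get("pileup", set()))
    return uniquePU


# hypothetical, shortened records mirroring the structure of puWflows
puWflows = [
    {"name": "wflow_A", "pileup": {"/PU1/Era-v1/PREMIX", "/PU2/Era-v1/GEN-SIM"}},
    {"name": "wflow_B", "pileup": {"/PU1/Era-v1/PREMIX", "/PU2/Era-v1/GEN-SIM"}},
]
print(sorted(uniqueActivePileups(puWflows)))
```

With a set, five workflows sharing the same two pileups yield two entries rather than ten.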

And later in the process:

2024-04-26 16:24:34,269:140193479931648:INFO:WorkflowUpdaterPoller:adjustJSONSpec called with msPileupList: [{'blocks': [],
  'customName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2',
  'pileupName': '/MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM',
  'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
 {'blocks': ['/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb',
             '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b',
             '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#004d5f64-f280-4bad-b88b-4b6e52ea040d',
             '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04',
             '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#01c4b6b1-4fb9-4400-8627-7d044e069ff3'],
  'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3',
  'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
  'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']},
...
}

Then the pileupDoc:

2024-04-26 16:24:34,387:140193479931648:INFO:WorkflowUpdaterPoller:pileupDoc: {'blocks': ['/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb',
            '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b',
            '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#004d5f64-f280-4bad-b88b-4b6e52ea040d',
            '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04',
            '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#01c4b6b1-4fb9-4400-8627-7d044e069ff3'],
 'customName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3',
 'pileupName': '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 'rses': ['T1_US_FNAL_Disk', 'T2_CH_CERN']}

And finally the msPUBlockLoc, which was previously empty and was the root cause of the problems:

msPUBlockLoc: {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb': ['T1_US_FNAL_Disk',
                                                                                                                                'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b': ['T1_US_FNAL_Disk',
                                                                                                                                'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#004d5f64-f280-4bad-b88b-4b6e52ea040d': ['T1_US_FNAL_Disk',
                                                                                                                                'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04': ['T1_US_FNAL_Disk',
                                                                                                                                'T2_CH_CERN'],
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#01c4b6b1-4fb9-4400-8627-7d044e069ff3': ['T1_US_FNAL_Disk',
                                                                                                                                'T2_CH_CERN']} 
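For illustration only, the relation between the msPileupList entries and msPUBlockLoc shown above can be sketched as follows. This assumes every block simply inherits the 'rses' list of the pileup document it belongs to, which is what the printouts suggest. The function name buildBlockLocations and the shortened block names are hypothetical, not the actual WorkflowUpdater code:

```python
def buildBlockLocations(msPileupList):
    """Map every block name to the RSE list of its pileup document."""
    blockLoc = {}
    for doc in msPileupList:
        for block in doc.get("blocks", []):
            # copy the list so later edits to one block do not leak into others
            blockLoc[block] = list(doc.get("rses", []))
    return blockLoc


# hypothetical, shortened documents with the same keys as in the log above
msPileupList = [
    {"pileupName": "/PU1/Era-v1/GEN-SIM", "blocks": [], "rses": ["T2_CH_CERN"]},
    {"pileupName": "/PU2/Era-v1/PREMIX",
     "blocks": ["/PU2/Era-v1/PREMIX#aaa", "/PU2/Era-v1/PREMIX#bbb"],
     "rses": ["T1_US_FNAL_Disk", "T2_CH_CERN"]},
]
print(buildBlockLocations(msPileupList))
```

A document with an empty 'blocks' list contributes nothing to the map, so an empty map is a sign that no blocks were resolved for any pileup.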

@todor-ivanov
Contributor

Indeed, here is what this uniqueActivePU list should look like:

uniqueActivePU: {'/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM'}

instead of:

['/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM',
 '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX',
 '/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM']

@todor-ivanov
Contributor

AAAnnnddd ... attention... attention all.... here is the golden fix: ;)

Tadaaaa:

#11975

@github-project-automation github-project-automation bot moved this from In Progress to Done in WMCore quarterly developments Apr 27, 2024
@amaltaro amaltaro reopened this Apr 27, 2024
@amaltaro amaltaro moved this from Done to In Progress in WMCore quarterly developments Apr 27, 2024
@todor-ivanov
Contributor

After patching vocms0193, sending yet another test workflow, and changing the PU fraction, here is what we get:

cmst1@vocms0193:current $ scurl -X PUT -H "Content-type: application/json" -d '{ "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX", "containerFraction": 0.5 }'   https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup
{"result": [
]}

cmst1@vocms0193:current $ scurl https://cmsweb-testbed.cern.ch/ms-pileup/data/pileup?pileupName=/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX
{"result": [
 {
  "pileupName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
  "pileupType": "premix",
  "insertTime": 1680873642,
  "lastUpdateTime": 1714376817,
  "expectedRSEs": [
    "T1_US_FNAL_Disk",
    "T2_CH_CERN"
  ],
  "currentRSEs": [
    "T1_US_FNAL_Disk",
    "T2_CH_CERN"
  ],
  "fullReplicas": 1,
  "campaigns": [
    "RunIISummer20UL16DIGIPremix",
    "RunIISummer20UL16DIGIPremixAPV"
  ],
  "containerFraction": 0.5,
  "replicationGrouping": "ALL",
  "activatedOn": 1714376817,
  "deactivatedOn": 1681995420,
  "active": true,
  "pileupSize": 5103968941386,
  "ruleIds": [
    "a40f9f8905c64adfbdae25a486f07fbe",
    "a71b79f97e4b4bcc912928aa4b3905b4"
  ],
  "customName": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3",
  "transition": [
    {
      "containerFraction": 1.0,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX",
      "updateTime": 1709576366,
      "DN": "/DC=/DC=/OU=..."
    },
    {
      "containerFraction": 0.5,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V1",
      "updateTime": 1711481138,
      "DN": "/DC=/DC=/OU=..."
    },
    {
      "containerFraction": 1.0,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V2",
      "updateTime": 1711542358,
      "DN": "/DC=/DC=/OU=..."
    },
    {
      "containerFraction": 0.7,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3",
      "updateTime": 1714044261,
      "DN": "/DC=DC=/OU=..."
    },
    {
      "containerFraction": 0.7,
      "customDID": "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V4",
      "updateTime": 1714376817,
      "DN": "/DC=/DC=/OU=..."
    }
  ]
}]}
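As a reading aid for the 'transition' list in the document above, here is a small sketch that orders the records by 'updateTime' and lists each fraction change. The key names come from the MSPileup output shown here; the function name fractionHistory and the shortened DIDs are hypothetical:

```python
def fractionHistory(pileupDoc):
    """Return (updateTime, containerFraction, customDID) tuples, oldest first."""
    records = sorted(pileupDoc.get("transition", []),
                     key=lambda rec: rec["updateTime"])
    return [(rec["updateTime"], rec["containerFraction"], rec["customDID"])
            for rec in records]


# hypothetical, shortened document using the same keys as the MSPileup output
pileupDoc = {
    "pileupName": "/PU/Era-v1/PREMIX",
    "transition": [
        {"containerFraction": 0.5, "customDID": "/PU/Era-v1/PREMIX-V1",
         "updateTime": 1711481138},
        {"containerFraction": 1.0, "customDID": "/PU/Era-v1/PREMIX",
         "updateTime": 1709576366},
    ],
}
for updateTime, fraction, did in fractionHistory(pileupDoc):
    print(updateTime, fraction, did)
```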

  • WorkflowUpdater logs:
2024-04-29 10:05:39,228:140480570136320:INFO:LogDB:<LogDB(url=https://cmsweb-testbed.cern.ch/couchdb/wmstats_logdb, identifier=vocms0193.cern.ch, agent=1)>
2024-04-29 10:05:39,229:140480570136320:INFO:BaseWorkerThread:Worker thread <WMComponent.WorkflowUpdater.WorkflowUpdaterPoller.WorkflowUpdaterPoller object at 0x7fc43ca77be0> started
2024-04-29 10:05:39,281:140480570136320:INFO:WorkflowUpdaterPoller:Running Workflow updater injector poller algorithm...
2024-04-29 10:05:39,298:140480570136320:INFO:WorkflowUpdaterPoller:Workflow: tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971 requires pileup dataset(s): {'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}
2024-04-29 10:05:39,299:140480570136320:INFO:WorkflowUpdaterPoller:There are 1 pileup workflows out of 1 active workflows.
2024-04-29 10:05:39,345:140480570136320:INFO:WorkflowUpdaterPoller:A total of 17 pileup documents have been retrieved.
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM, custom name: /MinBias_TuneCP5_14TeV-pythia8/PhaseIITDRSpring19GS-106X_upgrade2023_realistic_v2_ext1-v1/GEN-SIM-V2, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX, custom name: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3, expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_10_6_1-106X_mcRun3_2021_realistic_v1_rsb-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: ['T1_US_FNAL_Disk', 'T2_CH_CERN']
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM, custom name: /RelValMinBias_14TeV/CMSSW_12_0_0_pre4-120X_mcRun3_2021_realistic_v2-v1/GEN-SIM-V2, expected at: ['T2_CH_CERN'], but currently available at: ['T2_CH_CERN']
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Winter23GS-126X_mcRun3_2023_forPU65_v1-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/Phase2Fall22GS-HCalDetIDFix_125X_mcRun4_realistic_v2-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /RelValMinBias_14TeV/CMSSW_11_1_2_patch3-110X_mcRun4_realistic_v3_2026D49noPU_BSzpz35-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_13p6TeV-pythia8/Run3Summer22GS-124X_mcRun3_2022_realistic_v10-v1/GEN-SIM, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-29 10:05:39,346:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Winter22_122X_mcRun3_2021_realistic_v9-v1/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/Run3Summer21PrePremix-Summer22_124X_mcRun3_2022_realistic_v11-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /MinBias_TuneCP5_14TeV-pythia8/PhaseIISpring22GS-123X_mcRun4_realistic_v11-v1/GEN-SIM, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-PUAutumn18_102X_upgrade2018_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer17PrePremix-MCv2_correctPU_94X_mc2017_realistic_v9-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T1_US_FNAL_Disk'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIIFall17FSPrePremix-PUMoriond17_94X_mc2017_realistic_v15-v1/GEN-SIM-DIGI-RAW, custom name: , expected at: ['T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL17_106X_mc2017_realistic_v6-v3/PREMIX, custom name: , expected at: ['T2_CH_CERN', 'T1_US_FNAL_Disk'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Pileup: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL18_106X_upgrade2018_realistic_v11_L1v1-v2/PREMIX, custom name: , expected at: ['T1_US_FNAL_Disk', 'T2_CH_CERN'], but currently available at: []
2024-04-29 10:05:39,347:140480570136320:INFO:WorkflowUpdaterPoller:Fetching blocks for custom pileup container: /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX-V3
2024-04-29 10:05:39,573:140480570136320:INFO:WorkflowUpdaterPoller:Fetching blocks for custom pileup container: /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM-V4
2024-04-29 10:05:39,591:140480570136320:INFO:Timers:Rucio block resolution took 0.243 seconds to complete
2024-04-29 10:05:39,591:140480570136320:INFO:WorkflowUpdaterPoller:Processing workflow {'name': 'tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971', 'spec': '/data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/WMSandbox/WMWorkload.pkl', 'pileup': {'/RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM', '/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX'}}, sandbox: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971
2024-04-29 10:05:39,725:140480570136320:INFO:WorkflowUpdaterPoller:Found pileup name /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX under path: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/WMSandbox/GenSimFull/cmsRun2/pileupconf.json
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Block /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04 has locations ['T1_US_FNAL_Disk', 'T2_CH_CERN'], updating to ['T1_US_FNAL_Disk', 'T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Block /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#01c4b6b1-4fb9-4400-8627-7d044e069ff3 has locations ['T1_US_FNAL_Disk', 'T2_CH_CERN'], updating to ['T1_US_FNAL_Disk', 'T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Block /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb has locations ['T1_US_FNAL_Disk', 'T2_CH_CERN'], updating to ['T1_US_FNAL_Disk', 'T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Block /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#004d5f64-f280-4bad-b88b-4b6e52ea040d has locations ['T1_US_FNAL_Disk', 'T2_CH_CERN'], updating to ['T1_US_FNAL_Disk', 'T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Block /Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b has locations ['T1_US_FNAL_Disk', 'T2_CH_CERN'], updating to ['T1_US_FNAL_Disk', 'T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,726:140480570136320:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun2/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2 with a fresh pileup content
2024-04-29 10:05:39,733:140480570136320:INFO:WorkflowUpdaterPoller:Found pileup name /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM under path: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/WMSandbox/GenSimFull/cmsRun1/pileupconf.json
2024-04-29 10:05:39,734:140480570136320:INFO:WorkflowUpdaterPoller:Found differences between JSON and MSPileup content.
2024-04-29 10:05:39,734:140480570136320:INFO:WorkflowUpdaterPoller:Block /RelValMinBias_14TeV/CMSSW_11_2_0_pre8-112X_mcRun3_2024_realistic_v10_forTrk-v1/GEN-SIM#b9304f4c-5efd-49e5-bb85-c1f15eb4a1ad has locations [], updating to ['T2_CH_CERN'] from MSPileup
2024-04-29 10:05:39,735:140480570136320:INFO:WorkflowUpdaterPoller:Mark WMSandbox/GenSimFull/cmsRun1/pileupconf.json to be updated in tarball /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2 with a fresh pileup content
2024-04-29 10:05:39,735:140480570136320:INFO:WorkflowUpdaterPoller:Write pileup configuration file /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2
2024-04-29 10:05:40,707:140480570136320:INFO:WorkflowUpdaterPoller:Updating pileup file at WMSandbox/GenSimFull/cmsRun2/pileupconf.json for workflow tarball: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2
2024-04-29 10:05:40,713:140480570136320:INFO:WorkflowUpdaterPoller:Updating pileup file at WMSandbox/GenSimFull/cmsRun1/pileupconf.json for workflow tarball: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2
2024-04-29 10:05:41,520:140480570136320:INFO:WorkflowUpdaterPoller:Done updating spec: /data/srv/wmagent/v2.3.2rc8/install/wmagentpy3/WorkQueueManager/cache/tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971/WMSandbox/WMWorkload.pkl

2024-04-29 10:05:41,521:140480570136320:INFO:Timers:Adjust JSON spec took 1.93 seconds to complete
2024-04-29 10:05:41,521:140480570136320:INFO:BaseWorkerThread:WorkflowUpdaterPoller took 2.240 secs to execute
  • pileupconf.json contents inside the tarball after changing the PU fraction:

cmst1@vocms0193:sandbox $ tar -xjvf tivanov_SC_MultiPU_Feb2024_Val_PartPU_v13_240429_062706_5971-Sandbox.tar.bz2
cmst1@vocms0193:sandbox $ cat WMSandbox/GenSimFull/cmsRun2/pileupconf.json  |jq |less 

{
  "mc": {
    "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00f89134-4cb9-4f91-bd6e-b71143c6ad04": {
      "FileList": [
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/270020/B38CAA37-89E0-374D-B25E-64A8A4642502.root",
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/270011/D693F097-C6F4-9245-8446-2EDF4FF7BCE5.root",
...
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/130000/512E6862-F732-3147-8F59-2BA034CF369F.root"
      ],
      "NumberOfEvents": 800000,
      "PhEDExNodeNames": [
        "T1_US_FNAL_Disk",
        "T2_CH_CERN"
      ]
    },
    "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00287791-198c-4000-aafa-a796878d51bb": {
      "FileList": [
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/130009/522C82F9-C527-6A4A-981C-B6E815EE132C.root",
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/130026/12EEB66B-2B7C-3146-8927-AEB3273CC21D.root",
...
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/260024/3B488A1B-D3C2-134F-A8C7-102AC2327D5C.root"
      ],
      "NumberOfEvents": 444800,
      "PhEDExNodeNames": [
        "T1_US_FNAL_Disk",
        "T2_CH_CERN"
      ]
    },
    "/Neutrino_E-10_gun/RunIISummer20ULPrePremix-UL16_106X_mcRun2_asymptotic_v13-v1/PREMIX#00339486-0bc7-4961-9e5f-a18020be075b": {
      "FileList": [
        "/store/mc/RunIISummer20ULPrePremix/Neutrino_E-10_gun/PREMIX/UL16_106X_mcRun2_asymptotic_v13-v1/250000/9928EDDE-B931-2240-8D23-BE8FC8A5AAB0.root"
      ],
      "NumberOfEvents": 1600,
      "PhEDExNodeNames": [
        "T1_US_FNAL_Disk",
        "T2_CH_CERN"
      ]
    }
  }
}
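The same check can be done without unpacking the sandbox by hand. Below is a minimal sketch using only the Python standard library. It assumes the JSON layout shown above (pileup type, then block name, then 'FileList', 'NumberOfEvents' and 'PhEDExNodeNames'); the function name summarizePileupConf is hypothetical:

```python
import json
import tarfile


def summarizePileupConf(tarPath, member):
    """Summarize one pileupconf.json stored inside a bz2-compressed sandbox.

    Returns {pileupType: {blockName: (numFiles, numEvents, locations)}}.
    """
    with tarfile.open(tarPath, "r:bz2") as tar:
        fileObj = tar.extractfile(member)  # raises KeyError if member is missing
        if fileObj is None:
            raise ValueError(f"{member} is not a regular file in {tarPath}")
        conf = json.load(fileObj)
    summary = {}
    for puType, blocks in conf.items():
        summary[puType] = {
            block: (len(info["FileList"]),
                    info["NumberOfEvents"],
                    info["PhEDExNodeNames"])
            for block, info in blocks.items()
        }
    return summary
```

Calling it with the sandbox tarball path and WMSandbox/GenSimFull/cmsRun2/pileupconf.json as the member would list, per block, the number of files, the number of events, and the locations written by WorkflowUpdater.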

@amaltaro @vkuznet I think we are good to go. Alan, feel free to deploy in production at your convenience. Enabling and using the functionality is a different story, though. I remember noticing a message from P&R in this regard.

@amaltaro
Contributor Author

Great! Thank you for the validation and confirmation, Todor. I am now cutting a final 2.3.2 release to be deployed in central services, perhaps today, and making a 2.3.3 final tag for WMAgent as well.

If there is nothing pending for this ticket, shall we close this out?

@todor-ivanov
Contributor

Honor me with the pleasure to pull that plug, please!
