Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GPU TPC: Decoding: Add option to apply timebin cut to CTF cluster decoding on GPUs #13753

Merged
merged 2 commits into from
Dec 1, 2024

Conversation

cima22
Copy link
Contributor

@cima22 cima22 commented Nov 29, 2024

The configKeyValue GPU_proc.tpcApplyCFCutsAtDecoding=1; enables the filtering of TPC clusters at decoding time based on timebin. Compatible with both GPUs and CPU track model decoding. Still need to check performance, let's see what the CI says.

Copy link
Contributor

REQUEST FOR PRODUCTION RELEASES:
To request your PR to be included in production software, please add the corresponding labels called "async-" to your PR. Add the labels directly (if you have the permissions) or add a comment of the form (note that labels are separated by a ",")

+async-label <label1>, <label2>, !<label3> ...

This will add <label1> and <label2> and removes <label3>.

The following labels are available
async-2023-pbpb-apass4
async-2023-pp-apass4
async-2024-pp-apass1
async-2022-pp-apass7
async-2024-pp-cpass0
async-2024-PbPb-cpass0
async-2024-PbPb-apass1
async-2024-ppRef-apass1

@cima22
Copy link
Contributor Author

cima22 commented Nov 29, 2024

@davidrohr timebin cut showed to be 30% slower (both CPU and GPU versions). Is it acceptable?

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 29, 2024 via email

@shahor02
Copy link
Collaborator

Is this for applying the cut in 10% of TFs or this is a cost per affected TF?

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 29, 2024 via email

@shahor02
Copy link
Collaborator

@cima22 @davidrohr something is wrong (used dev + these 2 commits). 1st of all, running with GPU_proc.tpcApplyCFCutsAtDecoding=1 I got an error tpcApplyCFCutsAtDecoding currently requires tpcUseOldCPUDecoding. Commenting out this check in the GPU/GPUTracking/Global/GPUChainTracking.cxx I get no difference between the clusters from old standard processing and the one with GPU_proc.tpcApplyCFCutsAtDecoding=1 .
In opposite, if I use GPU_proc.tpcUseOldCPUDecoding=1;GPU_proc.tpcApplyCFCutsAtDecoding=1 I see probably random cluster cuts at any TB and in all TFs.

The plot shows the end of 1st 300 TFs of the run 559545, top pad: the TFs which should not have been touched (i.e. (tfCounter%10 != 1), bottom one: those with (tfCounter%10 == 1) where the clusters with TB>=14192 should have been removed, according to

root [0] auto& cc = o2::ccdb::BasicCCDBManager::instance();
root [1] auto xx = cc.get<o2::tpc::AltroSyncSignal>("TPC/Config/AltroSyncSignal");
root [2] xx->getTB2Cut(10)
(int) -1
root [3] xx->getTB2Cut(11)
(int) 14192

h0_.. is for standard processing (GPU_proc.tpcApplyCFCutsAtDecoding=0) , hc_.. for GPU_proc.tpcUseOldCPUDecoding=1;GPU_proc.tpcApplyCFCutsAtDecoding=1 and hg_.. for GPU_proc.tpcApplyCFCutsAtDecoding=1:

tstcut

This tarball contains the results of

grep gpu-recon STD/xx300.log | grep -E 'Processing timeslice|Event has' > std_cl.txt
grep gpu-recon CPU/rec.log | grep -E 'Processing timeslice|Event has' > oldcpu_cl.txt
grep gpu-recon GPU/rec.log | grep -E 'Processing timeslice|Event has' > gpu_cl.txt

you can see that the std_cl.txt and gpu_cl.txt are identical while oldcpu_cl.txt loses cluters at any TF:
tstgrp.tar.gz

@shahor02
Copy link
Collaborator

PS: for completeness, tried also with GPU_proc.tpcUseOldCPUDecoding=1; only, all TFs contain the same number of clusters as std_cl.txt (and gpu_cl.txt)

@cima22
Copy link
Contributor Author

cima22 commented Nov 30, 2024

@shahor02 Thank you very much for the detailed report.
I removed the tpcApplyCFCutsAtDecoding check for the new decoding as requested here:

I was hoping for 15 - 20% but it is ok. Could you for now remove the GPU_proc.applyCFCuts... check and rely on on the param.cutTPCTimeBin > 0 check? That will enable it by default if the CCDB object returns != -1. Later we need a different setting. Kind Regards David Rohr Sent from my mobile. (Excuse the typos!)

So it should only be triggered if the CCDB object returns != -1 for the timeBin parameter. Could you please try to run this test just with the new decoding using CPUs only, (tpcApplyCFCutsAtDecoding=0 and tpcUseOldCPUDecoding=0)?
If this works, it means that the parameter is not correctly passed to the GPU memory.

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 30, 2024 via email

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 30, 2024 via email

@shahor02
Copy link
Collaborator

@cima22
I've setup a script for fast check,
wfTPC.sh.gz
if you run

EXTRAOPT="GPU_proc.tpcApplyCFCutsAtDecoding=0;GPU_proc.tpcUseOldCPUDecoding=0" ./wfTPC.sh | tee xx.log

it will reconstruct 4 TFs (2 of which should be cut) and produce surviving clusters TBins for 2 categories. With the options you requested, I get
clcutres

@davidrohr the extraction of the option looks fine, though one could fetch to object from CCDB only once. I've hacked the code adding logging of the selected option (below) and get from the log file:

grep fetchCalibsCCDBTPC xx.log
[3150153:gpu-reconstruction]: [16:44:05][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 1
[3150153:gpu-reconstruction]: [16:44:07][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 11
[3150153:gpu-reconstruction]: [16:44:09][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 20
[3150153:gpu-reconstruction]: [16:44:11][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 21

diff --git a/GPU/Workflow/src/GPUWorkflowSpec.cxx b/GPU/Workflow/src/GPUWorkflowSpec.cxx
index 06942eab47..26f0fe689d 100644
--- a/GPU/Workflow/src/GPUWorkflowSpec.cxx
+++ b/GPU/Workflow/src/GPUWorkflowSpec.cxx
@@ -1054,6 +1054,7 @@ void GPURecoWorkflowSpec::doCalibUpdates(o2::framework::ProcessingContext& pc, c
     newCalibValues.tpcTimeBinCut = mConfig->configGRP.tpcCutTimeBin = mTPCCutAtTimeBin;
     needCalibUpdate = true;
   }
+  LOGP(info, "fetchCalibsCCDBTPC-> needCalibUpdate={} mTPCCutAtTimeBin={} overrideTPCTimeBinCur={} for TFCounter {}", needCalibUpdate, mTPCCutAtTimeBin, mConfParam->overrideTPCTimeBinCur, pc.services().get<o2::framework::TimingInfo>().tfCounter);
   if (needCalibUpdate) {
     LOG(info) << "Updating GPUReconstruction calibration objects";
     mGPUReco->UpdateCalibration(newCalibObjects, newCalibValues);

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 30, 2024 via email

@shahor02
Copy link
Collaborator

No, it does not reach it:

[3162434:gpu-reconstruction]: [17:07:02][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 1
[3162434:gpu-reconstruction]: [17:07:02][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3162434:gpu-reconstruction]: [17:07:04][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 11
[3162434:gpu-reconstruction]: [17:07:04][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3162434:gpu-reconstruction]: [17:07:06][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 20
[3162434:gpu-reconstruction]: [17:07:06][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3162434:gpu-reconstruction]: [17:07:08][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 21
[3162434:gpu-reconstruction]: [17:07:08][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1

with

diff --git a/GPU/GPUTracking/Global/GPUChainTrackingCompression.cxx b/GPU/GPUTracking/Global/GPUChainTrackingCompression.cxx
index 01e4d011d0..7ad1e55463 100644
--- a/GPU/GPUTracking/Global/GPUChainTrackingCompression.cxx
+++ b/GPU/GPUTracking/Global/GPUChainTrackingCompression.cxx
@@ -206,6 +206,7 @@ int32_t GPUChainTracking::RunTPCCompression()
 int32_t GPUChainTracking::RunTPCDecompression()
 {
 #ifdef GPUCA_HAVE_O2HEADERS
+  LOGP(info, "in RunTPCDecompression: param().tpcCutTimeBin={}", param().tpcCutTimeBin);
   if (GetProcessingSettings().tpcUseOldCPUDecoding) {
     const auto& threadContext = GetThreadContext();
     TPCClusterDecompressor decomp;
diff --git a/GPU/Workflow/src/GPUWorkflowSpec.cxx b/GPU/Workflow/src/GPUWorkflowSpec.cxx
index 06942eab47..26f0fe689d 100644
--- a/GPU/Workflow/src/GPUWorkflowSpec.cxx
+++ b/GPU/Workflow/src/GPUWorkflowSpec.cxx
@@ -1054,6 +1054,7 @@ void GPURecoWorkflowSpec::doCalibUpdates(o2::framework::ProcessingContext& pc, c
     newCalibValues.tpcTimeBinCut = mConfig->configGRP.tpcCutTimeBin = mTPCCutAtTimeBin;
     needCalibUpdate = true;
   }
+  LOGP(info, "fetchCalibsCCDBTPC-> needCalibUpdate={} mTPCCutAtTimeBin={} overrideTPCTimeBinCur={} for TFCounter {}", needCalibUpdate, mTPCCutAtTimeBin, mConfParam->overrideTPCTimeBinCur, pc.services().get<o2::framework::TimingInfo>().tfCounter);
   if (needCalibUpdate) {
     LOG(info) << "Updating GPUReconstruction calibration objects";
     mGPUReco->UpdateCalibration(newCalibObjects, newCalibValues);

@shahor02
Copy link
Collaborator

What I don't understand: the log is the same with

EXTRAOPT="GPU_proc.tpcApplyCFCutsAtDecoding=1;GPU_proc.tpcUseOldCPUDecoding=1" ./wfTPC.sh | tee xx.log

grep -E  'fetchCalibsCCDBTPC|RunTPCDecompression' xx.log 
[3164416:gpu-reconstruction]: [17:09:58][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 1
[3164416:gpu-reconstruction]: [17:09:58][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3164416:gpu-reconstruction]: [17:10:00][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 11
[3164416:gpu-reconstruction]: [17:10:00][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3164416:gpu-reconstruction]: [17:10:02][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=-1 overrideTPCTimeBinCur=0 for TFCounter 20
[3164416:gpu-reconstruction]: [17:10:02][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1
[3164416:gpu-reconstruction]: [17:10:05][INFO] fetchCalibsCCDBTPC-> needCalibUpdate=true mTPCCutAtTimeBin=14192 overrideTPCTimeBinCur=0 for TFCounter 21
[3164416:gpu-reconstruction]: [17:10:05][INFO] in RunTPCDecompression: param().tpcCutTimeBin=-1

but the statistics on both histos is different (with cut still not applied correctly):
clcutres11

@davidrohr
Copy link
Collaborator

davidrohr commented Nov 30, 2024 via email

@shahor02 shahor02 merged commit dc760aa into AliceO2Group:dev Dec 1, 2024
12 checks passed
@shahor02
Copy link
Collaborator

shahor02 commented Dec 1, 2024

With #13757 works fine:
clcutresFin

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

3 participants