
Conversation

@fweig
Contributor

@fweig fweig commented Nov 17, 2020

This PR adds an optional pad filter to the TPC cluster finder, plus two flags for filtering pads based on the total number of charges and on the number of consecutive charges.

@davidrohr This currently checks every pad in every fragment. That's probably not necessary?

@davidrohr
Collaborator

> @davidrohr This currently checks every pad in every fragment. That's probably not necessary?

Hm, good question. Did you check what the performance overhead is? In principle one could consider checking only the first fragment of a time frame, and then applying the result to the full time frame. That would probably be quite stable; I guess the behavior does not change that fast. Perhaps you can also make it configurable to check all fragments or only the first one.

For the parameters, I think we have to normalize them. Ideally they should be independent of the fragment length. I'd rather define them as maxTimeBinAboveThresholdIn1000Bins and then scale with the fragment size, or something along those lines.

@davidrohr
Collaborator

@fweig : Thx, now it compiles and should run correctly. Do you have an idea when you can address my other comment?

@fweig
Contributor Author

fweig commented Nov 23, 2020

@davidrohr I'm on it. Hope to push the update today.

@fweig
Contributor Author

fweig commented Nov 23, 2020

Pushed another update:

  • Renamed flags to maxTimeBinAboveThresholdIn1000Bin and maxConsecTimeBinAboveThreshold
  • The former is now scaled with the fragment size
  • Added flag noisyPadsQuickCheck to check only the first fragment for noisy pads

For the performance overhead I measure about 30000-40000 us per fragment across all sectors on an RTX 2080 Ti.

@fweig fweig force-pushed the pad-baseline branch 2 times, most recently from c7ab6bf to 24762bb on November 23, 2020 21:02
davidrohr previously approved these changes Dec 2, 2020
Collaborator

@davidrohr davidrohr left a comment


This all looks OK. I have to say I am a bit puzzled by the time overhead; I had expected it to be less. Perhaps at some point one could double-check whether this can be improved, but for now it is OK when I enable the quick-check option.
There is zero overhead when the filter is disabled, so that looks good.
Could you rebase, apply defaults of 200/500, and enable the quick check (or just apply this patch):

diff --git a/GPU/GPUTracking/Base/GPUSettingsList.h b/GPU/GPUTracking/Base/GPUSettingsList.h
index 4af33abefac..6b385d6d97c 100644
--- a/GPU/GPUTracking/Base/GPUSettingsList.h
+++ b/GPU/GPUTracking/Base/GPUSettingsList.h
@@ -50,9 +50,9 @@ AddOptionRTC(tpcTubeMaxSize2, float, 2.5f * 2.5f, "", 0, "Square of max tube siz
 AddOptionRTC(trdMinTrackPt, float, .5f, "", 0, "Min Pt for tracks to be propagated through the TRD")
 AddOptionRTC(trdMaxChi2, float, 15.f, "", 0, "Max chi2 for TRD tracklets to be matched to a track")
 AddOptionRTC(trdPenaltyChi2, float, 12.f, "", 0, "Chi2 penalty for no available TRD tracklet (effective chi2 cut value)")
-AddOptionRTC(noisyPadsQuickCheck, unsigned char, 0, "", 0, "Only check first fragment for noisy pads instead of all fragments (when test is enabled).")
-AddOptionRTC(maxTimeBinAboveThresholdIn1000Bin, unsigned short, 0, "", 0, "Except pad from cluster finding if total number of charges in a fragment is above this baseline (disable = 0)")
-AddOptionRTC(maxConsecTimeBinAboveThreshold, unsigned short, 0, "", 0, "Except pad from cluster finding if number of consecutive charges in a fragment is above this baseline (disable = 0)")
+AddOptionRTC(noisyPadsQuickCheck, unsigned char, 1, "", 0, "Only check first fragment for noisy pads instead of all fragments (when test is enabled).")
+AddOptionRTC(maxTimeBinAboveThresholdIn1000Bin, unsigned short, 500, "", 0, "Except pad from cluster finding if total number of charges in a fragment is above this baseline (disable = 0)")
+AddOptionRTC(maxConsecTimeBinAboveThreshold, unsigned short, 200, "", 0, "Except pad from cluster finding if number of consecutive charges in a fragment is above this baseline (disable = 0)")
 AddOptionRTC(tpcCFqmaxCutoff, unsigned char, 3, "", 0, "Cluster Finder rejects cluster with qmax below this threshold")
 AddOptionRTC(tpcCFqtotCutoff, unsigned char, 0, "", 0, "Cluster Finder rejects cluster with qtot below this threshold")
 AddOptionRTC(tpcCFinnerThreshold, unsigned char, 0, "", 0, "Cluster Finder extends cluster if inner charge above this threshold")
diff --git a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
index b7129626ea3..e812f5e00fb 100644
--- a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
+++ b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
@@ -26,7 +26,7 @@ GPUdii() void GPUTPCCFPeakFinder::Thread<0>(int nBlocks, int nThreads, int iBloc
 {
   Array2D<PackedCharge> chargeMap(reinterpret_cast<PackedCharge*>(clusterer.mPchargeMap));
   Array2D<uchar> isPeakMap(clusterer.mPpeakMap);
-  findPeaksImpl(get_num_groups(0), get_local_size(0), get_group_id(0), get_local_id(0), smem, chargeMap, clusterer.mPpadHasLostBaseline, clusterer.mPpositions, clusterer.mPmemory->counters.nPositions, clusterer.Param().rec, *clusterer.GetConstantMem()->calibObjects.tpcCalibration, clusterer.mPisPeak, isPeakMap);
+  findPeaksImpl(get_num_groups(0), get_local_size(0), get_group_id(0), get_local_id(0), smem, chargeMap, clusterer.mPpadHasLostBaseline, clusterer.mPpositions, clusterer.mPmemory->counters.nPositions, clusterer.Param().rec, *clusterer.GetConstantMem()->calibObjects.tpcPadGain, clusterer.mPisPeak, isPeakMap);
 }
 
 GPUdii() bool GPUTPCCFPeakFinder::isPeak(
@@ -95,7 +95,7 @@ GPUd() void GPUTPCCFPeakFinder::findPeaksImpl(int nBlocks, int nThreads, int iBl
                                               const ChargePos* positions,
                                               SizeT digitnum,
                                               const GPUSettingsRec& calib,
-                                              const TPCCFCalibration& gainCorrection, // Only used for globalPad() function
+                                              const TPCPadGainCalib& gainCorrection, // Only used for globalPad() function
                                               uchar* isPeakPredicate,
                                               Array2D<uchar>& peakMap)
 {
diff --git a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
index 8025735ea6c..e6107af1373 100644
--- a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
+++ b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
@@ -52,7 +52,7 @@ class GPUTPCCFPeakFinder : public GPUKernelTemplate
   GPUd() static void Thread(int nBlocks, int nThreads, int iBlock, int iThread, GPUSharedMemory& smem, processorType& clusterer, Args... args);
 
  private:
-  static GPUd() void findPeaksImpl(int, int, int, int, GPUSharedMemory&, const Array2D<PackedCharge>&, const uchar*, const ChargePos*, tpccf::SizeT, const GPUSettingsRec&, const TPCCFCalibration&, uchar*, Array2D<uchar>&);
+  static GPUd() void findPeaksImpl(int, int, int, int, GPUSharedMemory&, const Array2D<PackedCharge>&, const uchar*, const ChargePos*, tpccf::SizeT, const GPUSettingsRec&, const TPCPadGainCalib&, uchar*, Array2D<uchar>&);
 
   static GPUd() bool isPeak(GPUSharedMemory&, tpccf::Charge, const ChargePos&, ushort, const Array2D<PackedCharge>&, const GPUSettingsRec&, ChargePos*, PackedCharge*);
 };

@davidrohr davidrohr force-pushed the pad-baseline branch 2 times, most recently from 2547783 to 802ed66 on December 2, 2020 15:59

@davidrohr davidrohr left a comment


ok, apparently I can force-push the rebased branch since it is editable by maintainers :)

@davidrohr davidrohr merged commit 60b41a1 into AliceO2Group:dev Dec 3, 2020
@fweig
Contributor Author

fweig commented Dec 4, 2020

Ok, thanks.

Performance is indeed weird. The kernel currently runs at something like 1/3 of the memory bandwidth. I'll see what I can do.

@fweig fweig deleted the pad-baseline branch December 7, 2020 09:38