-
Notifications
You must be signed in to change notification settings - Fork 486
TPCClusterFinder: Add optional filter for noisy pads. #4850
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hm, good question. Did you check what is the performance overhead? In principle one could consider checking only the first segment of a timeframe, and then apply the result to the full time frame. That would probably be quite stable, I guess the behavior does not change that fast. Perhaps you can also make it configurable to check all fragments or only the first one. For the parameters, I think we have to normalize them. Ideally they should be independent from the segment length. I'd rather define them as maxTimeBinAboveThresholdIn1000Bins and then scale with segment size, or something along that. |
|
@fweig : Thx, now it compiles and should run correctly. Do you have an idea when you can address my other comment? |
|
@davidrohr I'm on it. Hope to push the update today. |
|
Pushed another update:
For performance overhead I get about 30000 - 40000us per fragment across all sectors on the RTX 2080 ti. |
c7ab6bf to
24762bb
Compare
davidrohr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks all OK, I have to say I am a bit puzzled by the time overhead. I had expected it to be less. Perhaps at some time one could double-check if this can be improved, but for not it is OK when I enable the quickcheck option.
There is zero overhead when the filter is disabled, so that looks good.
Could you rebase and apply defaults of 200/500 and enable the quick check (or just apply this patch):
diff --git a/GPU/GPUTracking/Base/GPUSettingsList.h b/GPU/GPUTracking/Base/GPUSettingsList.h
index 4af33abefac..6b385d6d97c 100644
--- a/GPU/GPUTracking/Base/GPUSettingsList.h
+++ b/GPU/GPUTracking/Base/GPUSettingsList.h
@@ -50,9 +50,9 @@ AddOptionRTC(tpcTubeMaxSize2, float, 2.5f * 2.5f, "", 0, "Square of max tube siz
AddOptionRTC(trdMinTrackPt, float, .5f, "", 0, "Min Pt for tracks to be propagated through the TRD")
AddOptionRTC(trdMaxChi2, float, 15.f, "", 0, "Max chi2 for TRD tracklets to be matched to a track")
AddOptionRTC(trdPenaltyChi2, float, 12.f, "", 0, "Chi2 penalty for no available TRD tracklet (effective chi2 cut value)")
-AddOptionRTC(noisyPadsQuickCheck, unsigned char, 0, "", 0, "Only check first fragment for noisy pads instead of all fragments (when test is enabled).")
-AddOptionRTC(maxTimeBinAboveThresholdIn1000Bin, unsigned short, 0, "", 0, "Except pad from cluster finding if total number of charges in a fragment is above this baseline (disable = 0)")
-AddOptionRTC(maxConsecTimeBinAboveThreshold, unsigned short, 0, "", 0, "Except pad from cluster finding if number of consecutive charges in a fragment is above this baseline (disable = 0)")
+AddOptionRTC(noisyPadsQuickCheck, unsigned char, 1, "", 0, "Only check first fragment for noisy pads instead of all fragments (when test is enabled).")
+AddOptionRTC(maxTimeBinAboveThresholdIn1000Bin, unsigned short, 500, "", 0, "Except pad from cluster finding if total number of charges in a fragment is above this baseline (disable = 0)")
+AddOptionRTC(maxConsecTimeBinAboveThreshold, unsigned short, 200, "", 0, "Except pad from cluster finding if number of consecutive charges in a fragment is above this baseline (disable = 0)")
AddOptionRTC(tpcCFqmaxCutoff, unsigned char, 3, "", 0, "Cluster Finder rejects cluster with qmax below this threshold")
AddOptionRTC(tpcCFqtotCutoff, unsigned char, 0, "", 0, "Cluster Finder rejects cluster with qtot below this threshold")
AddOptionRTC(tpcCFinnerThreshold, unsigned char, 0, "", 0, "Cluster Finder extends cluster if inner charge above this threshold")
diff --git a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
index b7129626ea3..e812f5e00fb 100644
--- a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
+++ b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.cxx
@@ -26,7 +26,7 @@ GPUdii() void GPUTPCCFPeakFinder::Thread<0>(int nBlocks, int nThreads, int iBloc
{
Array2D<PackedCharge> chargeMap(reinterpret_cast<PackedCharge*>(clusterer.mPchargeMap));
Array2D<uchar> isPeakMap(clusterer.mPpeakMap);
- findPeaksImpl(get_num_groups(0), get_local_size(0), get_group_id(0), get_local_id(0), smem, chargeMap, clusterer.mPpadHasLostBaseline, clusterer.mPpositions, clusterer.mPmemory->counters.nPositions, clusterer.Param().rec, *clusterer.GetConstantMem()->calibObjects.tpcCalibration, clusterer.mPisPeak, isPeakMap);
+ findPeaksImpl(get_num_groups(0), get_local_size(0), get_group_id(0), get_local_id(0), smem, chargeMap, clusterer.mPpadHasLostBaseline, clusterer.mPpositions, clusterer.mPmemory->counters.nPositions, clusterer.Param().rec, *clusterer.GetConstantMem()->calibObjects.tpcPadGain, clusterer.mPisPeak, isPeakMap);
}
GPUdii() bool GPUTPCCFPeakFinder::isPeak(
@@ -95,7 +95,7 @@ GPUd() void GPUTPCCFPeakFinder::findPeaksImpl(int nBlocks, int nThreads, int iBl
const ChargePos* positions,
SizeT digitnum,
const GPUSettingsRec& calib,
- const TPCCFCalibration& gainCorrection, // Only used for globalPad() function
+ const TPCPadGainCalib& gainCorrection, // Only used for globalPad() function
uchar* isPeakPredicate,
Array2D<uchar>& peakMap)
{
diff --git a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
index 8025735ea6c..e6107af1373 100644
--- a/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
+++ b/GPU/GPUTracking/TPCClusterFinder/GPUTPCCFPeakFinder.h
@@ -52,7 +52,7 @@ class GPUTPCCFPeakFinder : public GPUKernelTemplate
GPUd() static void Thread(int nBlocks, int nThreads, int iBlock, int iThread, GPUSharedMemory& smem, processorType& clusterer, Args... args);
private:
- static GPUd() void findPeaksImpl(int, int, int, int, GPUSharedMemory&, const Array2D<PackedCharge>&, const uchar*, const ChargePos*, tpccf::SizeT, const GPUSettingsRec&, const TPCCFCalibration&, uchar*, Array2D<uchar>&);
+ static GPUd() void findPeaksImpl(int, int, int, int, GPUSharedMemory&, const Array2D<PackedCharge>&, const uchar*, const ChargePos*, tpccf::SizeT, const GPUSettingsRec&, const TPCPadGainCalib&, uchar*, Array2D<uchar>&);
static GPUd() bool isPeak(GPUSharedMemory&, tpccf::Charge, const ChargePos&, ushort, const Array2D<PackedCharge>&, const GPUSettingsRec&, ChargePos*, PackedCharge*);
};
2547783 to
802ed66
Compare
davidrohr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, apparently I can force-push the rebased branch since it is editable by maintainers :)
|
Ok, thanks. Performance is indeed weird. Kernel currently runs at something like 1/3 of the memory bandwidth. I'll see what i can do. |
This PR adds an optional pad filter to the tpc cluster finder and two flags for filter pads based on number of total charges and number of consecutive charges.
@davidrohr This currently checks every pad in every fragment. Thats probably not necessary?