Dear ROOT experts,
I’m trying to perform the following operations on the ntuples of my analysis:
- Select a subset of branches (i.e. slimming)
- Modify the weights to include the normalization coefficients
- Perform event selection (i.e. skimming)
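To make the three steps concrete, here is a minimal sketch of how they could be chained with RDataFrame. It is not the actual benchmark script: the normalization factor, the cut string and the short branch lists are placeholders I picked for illustration, and only the tree, file and branch names are taken from the real setup.

```python
# Sketch of slimming + weight modification + skimming with RDataFrame.
# Assumptions (not from the real script): the value of `norm`, the cut string
# and the shortened branch lists below are illustrative placeholders.
TREE = "tree_3lCR_PFLOW"
INPUT = "Heavy_d.root"
KINEMATICS = ["leading_pT_lepton", "subleading_pT_lepton", "n_jets", "met_tst"]
WEIGHTS = [
    "weight",
    "weight_EL_EFF_ID_CorrUncertaintyNP0__1down",
    "weight_EL_EFF_ID_CorrUncertaintyNP0__1up",
]


def scaled_weight_expressions(weights, norm):
    """Return {branch: expression} multiplying each weight branch by norm."""
    return {w: f"{w} * {norm!r}" for w in weights}


def slim_and_skim(output, norm=1.0, cut=None):
    """Write the selected branches to `output`, optionally rescaled and filtered."""
    import ROOT  # imported lazily so the helper above also works without ROOT

    df = ROOT.RDataFrame(TREE, INPUT)

    # Weight modification: overwrite each weight branch in place.
    # Redefine is available starting from ROOT 6.26.
    for name, expr in scaled_weight_expressions(WEIGHTS, norm).items():
        df = df.Redefine(name, expr)

    # Skimming: keep only the events passing the (hypothetical) cut.
    if cut:
        df = df.Filter(cut, "event selection")

    # Slimming: Snapshot writes only the listed columns and runs the event loop.
    columns = ROOT.std.vector("string")()
    for column in KINEMATICS + WEIGHTS:
        columns.push_back(column)
    df.Snapshot(TREE, output, columns)
```

With ROOT available, a call such as slim_and_skim("slimmed.root", norm=0.5, cut="n_jets >= 2 && met_tst > 100") would run all three steps in a single event loop; leaving out cut and norm corresponds to the slimming-only case.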
I’ve previously had problems processing multiple small files using RDataFrame, but now I’ve noticed some strange performance on the big files as well and want to ask some questions about it.
The scripts I’m using can be found in this cernbox folder.
CreateInput.py creates a 4.2 GB file with 359 branches, similar to what I have in the real analysis. The file itself is also included in that folder.
RDFBenchmarkDummy.py checks the time to run the analysis code in 3 ways:
- With just slimming.
- With slimming and weight modification.
- With slimming, weight modification and event selection set up to select ~5% of the events.
The time is averaged over 3 tries.
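For clarity on how the numbers are obtained, here is a sketch of the kind of timing helper I mean. The function names and the stand-in workload are made up for this post. Using the sample standard deviation (n - 1 in the denominator) as the quoted spread reproduces the "40.67+-1.70" figure from the first list of timings in the results.

```python
import statistics
import time


def time_runs(func, n_tries=3):
    """Call func() n_tries times and return the wall-clock durations in seconds."""
    durations = []
    for _ in range(n_tries):
        start = time.time()
        func()
        durations.append(time.time() - start)
    return durations


def summarize(durations, label="Snapshot"):
    """Format the mean and the sample standard deviation of the durations."""
    mean = statistics.mean(durations)
    spread = statistics.stdev(durations)  # n - 1 in the denominator
    return f"{label}: {mean:.2f}+-{spread:.2f} s"


def dummy_workload():
    """Stand-in for the function that books the RDataFrame graph and calls Snapshot."""
    sum(range(100_000))


print(summarize(time_runs(dummy_workload)))
```

The event loop has to be triggered inside the timed function (Snapshot does that by default), otherwise only the booking of the graph is measured.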
I’m getting the following results on my machine
(1)
[42.19031023979187, 40.98794865608215, 38.83463215827942]
Snapshot: 40.67+-1.70 s
(2)
[45.55086946487427, 48.206947326660156, 47.1492977142334]
Snapshot: 46.97+-1.34 s
(3)
[9.084217071533203, 8.745355606079102, 8.877978801727295]
Snapshot: 8.90+-0.17 s
I’ve compared the results to the output of root-readspeed, run with the following command, which accesses the same branches that are used in RDFBenchmarkDummy.py:
./root-readspeed --trees tree_3lCR_PFLOW --files Heavy_d.root --branches leading_pT_lepton subleading_pT_lepton n_bjets n_jets event_3CR event_type met_tst met_signif dMetZPhi MetOHT dLepR Z_pT M2Lep mT_ZZ Z_rapidity frac_pT sumpT_vector sumpT_scalar ZpTomT weight weight_EL_EFF_ID_CorrUncertaintyNP0__1down weight_EL_EFF_ID_CorrUncertaintyNP0__1up weight_EL_EFF_ID_CorrUncertaintyNP10__1down weight_EL_EFF_ID_CorrUncertaintyNP10__1up weight_EL_EFF_ID_CorrUncertaintyNP11__1down weight_EL_EFF_ID_CorrUncertaintyNP11__1up weight_EL_EFF_ID_CorrUncertaintyNP12__1down weight_EL_EFF_ID_CorrUncertaintyNP12__1up weight_EL_EFF_ID_CorrUncertaintyNP13__1down weight_EL_EFF_ID_CorrUncertaintyNP13__1up weight_EL_EFF_ID_CorrUncertaintyNP14__1down weight_EL_EFF_ID_CorrUncertaintyNP14__1up weight_EL_EFF_ID_CorrUncertaintyNP15__1down weight_EL_EFF_ID_CorrUncertaintyNP15__1up weight_EL_EFF_ID_CorrUncertaintyNP1__1down weight_EL_EFF_ID_CorrUncertaintyNP1__1up weight_EL_EFF_ID_CorrUncertaintyNP2__1down weight_EL_EFF_ID_CorrUncertaintyNP2__1up weight_EL_EFF_ID_CorrUncertaintyNP3__1down weight_EL_EFF_ID_CorrUncertaintyNP3__1up weight_EL_EFF_ID_CorrUncertaintyNP4__1down weight_EL_EFF_ID_CorrUncertaintyNP4__1up weight_EL_EFF_ID_CorrUncertaintyNP5__1down weight_EL_EFF_ID_CorrUncertaintyNP5__1up weight_EL_EFF_ID_CorrUncertaintyNP6__1down weight_EL_EFF_ID_CorrUncertaintyNP6__1up weight_EL_EFF_ID_CorrUncertaintyNP7__1down weight_EL_EFF_ID_CorrUncertaintyNP7__1up weight_EL_EFF_ID_CorrUncertaintyNP8__1down weight_EL_EFF_ID_CorrUncertaintyNP8__1up weight_EL_EFF_ID_CorrUncertaintyNP9__1down weight_EL_EFF_ID_CorrUncertaintyNP9__1up
I got the following output, once again after 3 tries:
1 thread(s), real time: 6.46+-0.19
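The comparison I have in mind is simply the ratio of each Snapshot time to the root-readspeed time. A small sketch using the rounded means quoted above (the variable and function names are just for illustration):

```python
# Mean wall-clock times in seconds, copied from the results quoted above.
snapshot_times = {"(1)": 40.67, "(2)": 46.97, "(3)": 8.90}
readspeed_time = 6.46


def ratios_to_readspeed(times, reference):
    """Return each time divided by the reference read-only time."""
    return {label: t / reference for label, t in times.items()}


ratios = ratios_to_readspeed(snapshot_times, readspeed_time)
for label, ratio in ratios.items():
    print(f"{label}: {ratio:.2f} x root-readspeed")
```

For the slimming-only case (1) this gives a ratio of about 6.3.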
I’ve got the following questions:
- Why does the simple slimming take 6 times more real time than the root-readspeed estimate? I would expect it to take 2.5-3 times more (one pass to read, one pass to write, plus some time for the graph compilation), but not that much.
- While this reproducer shows a difference of only 1.36 times with respect to root-readspeed, I’m actually getting a difference of 1.7-2 times with the real data files. Since the data is read only once (i.e. there is no benefit from caching) and is stored on a remote filesystem (eos) due to its size, the performance is bottlenecked by the read/write speed, so even a factor of 2 makes the conversion process very long.
(a) What could be the difference with the real files compared to this? The compression is actually better for the real files.
(b) If the slimming is sped up, is it possible to speed up the final version (3) of the code as well?
Thanks in advance,
Aleksandr
ROOT Version: 6.26/10
Platform: Ubuntu 18.04 on local machine with i5-8250U CPU and SSD drive
Compiler: prebuilt binary