Hi @rmadar ,
this is super useful feedback.
TDataFrame in v6.12 was the first prototype; it is now obsolete and, as you saw, also much slower, so I would focus on the numbers you produced for v6.14, in C++ and Python.
A few questions before we dig deeper into the performance measurements:
- RDataFrame parallelizes over TTree clusters. How many clusters does your dataset have? You can check with `tree->Print("clusters")` or with the method described here
- from your code it seems that the explicit event loop is doing less work, is it an apples-to-apples comparison?
- could you produce timings for code compiled with optimizations? (`g++ -O3`)
- what are the timings for RDataFrame without MT?
- how hard would it be to try the same with ROOT master branch?
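To keep the timings comparable across the explicit loop and the RDataFrame runs (with and without MT), something like the sketch below might help. `BestWallTime` is a hypothetical helper, not part of ROOT; it uses only the standard library, and taking the fastest of a few repetitions is just one possible choice to reduce noise from disk caching and other processes.

```cpp
// Build with optimizations, e.g.: g++ -O3 -std=c++11 ...
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <limits>

// Hypothetical helper (an assumption for illustration, not a ROOT API):
// runs `work` `repetitions` times and returns the fastest wall-clock time in seconds.
double BestWallTime(const std::function<void()> &work, int repetitions = 3)
{
   assert(repetitions > 0);
   double best = std::numeric_limits<double>::max();
   for (int i = 0; i < repetitions; ++i) {
      const auto start = std::chrono::steady_clock::now();
      work(); // the code under test: explicit event loop or RDataFrame run
      const std::chrono::duration<double> elapsed = std::chrono::steady_clock::now() - start;
      best = std::min(best, elapsed.count());
   }
   return best;
}
```

Each variant would be wrapped in its own lambda and passed to `BestWallTime`, making sure that every lambda books and fills the same set of histograms, so the measured work is identical.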
Regarding the instability of `Count()`: it looks like a bug, would it be possible to have a standalone reproducer that we can debug?
Many thanks for the super interesting feedback again!
Cheers,
Enrico