Dear RDF fans,

I am building a large dictionary of C++ analysers/algorithms that I can then conveniently load in ROOT and call from pyROOT to define my RDF graph.

Nevertheless, I am pretty sure some parts of the code deserve to be rewritten, for example as a proper class so that the instantiation only happens once.
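To illustrate the "proper class" idea, here is a minimal sketch of a callable object that does its expensive setup once in the constructor and only reads that state on each call. The class name `CalibratedEnergy`, the lookup table, and the calibration formula are all hypothetical placeholders, not taken from the actual analysers.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical analyser: the lookup table stands in for any expensive setup.
class CalibratedEnergy {
public:
   explicit CalibratedEnergy(std::size_t nBins) : fTable(nBins > 0 ? nBins : 1)
   {
      // Expensive initialisation runs once, at construction time.
      for (std::size_t i = 0; i < fTable.size(); ++i)
         fTable[i] = 1.0 + 0.01 * std::sqrt(static_cast<double>(i));
   }

   // Meant to be called once per event; it only reads the pre-built table.
   double operator()(double energy) const
   {
      if (energy < 0.0)
         energy = 0.0;
      auto bin = static_cast<std::size_t>(energy);
      if (bin >= fTable.size())
         bin = fTable.size() - 1; // clamp overflow into the last bin
      return energy * fTable[bin];
   }

private:
   std::vector<double> fTable; // filled once, read-only afterwards
};
```

Since `operator()` is `const` and touches no shared mutable state, one instance can be constructed up front and reused for every event, rather than rebuilding the table inside a free function on each call.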

With this email, and before I start trying too many random things, I would like to know what the recommended good practices are to chase inefficient code blocks in RDF (if such a thing exists, of course).


Hi @clementhelsens ,
assuming that your application is not bound by I/O, we can certainly play a bit with performance optimization.

First of all, make sure that the C++ code is compiled with optimizations and debug symbols. If you compile it with ACLiC, you can explicitly ask for optimizations with +Og. If you compile manually, make sure that the -O2 -g compilation flags are passed.

At that point you can profile your application with perf, e.g. with perf record -F99 ./your_app, and then check which functions take time with perf report.
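Where perf is not available, a coarse alternative is to accumulate wall-clock time by hand around a suspect helper. The following is only a sketch under stated assumptions: `TimingRegistry` and `ScopedTimer` are hypothetical names, not part of ROOT, and the registry is not thread-safe, so the numbers are only meaningful in a single-threaded run.

```cpp
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper: accumulated wall-clock seconds and call counts per label.
// Not thread-safe; use only when the event loop runs on a single thread.
struct TimingRegistry {
   std::map<std::string, double> seconds;
   std::map<std::string, unsigned long> calls;
};

// RAII timer: starts at construction, adds the elapsed time at destruction.
class ScopedTimer {
public:
   ScopedTimer(TimingRegistry &registry, std::string label)
      : fRegistry(registry), fLabel(std::move(label)),
        fStart(std::chrono::steady_clock::now())
   {
   }

   ~ScopedTimer()
   {
      const std::chrono::duration<double> elapsed =
         std::chrono::steady_clock::now() - fStart;
      fRegistry.seconds[fLabel] += elapsed.count();
      fRegistry.calls[fLabel] += 1;
   }

   ScopedTimer(const ScopedTimer &) = delete;
   ScopedTimer &operator=(const ScopedTimer &) = delete;

private:
   TimingRegistry &fRegistry;
   std::string fLabel;
   std::chrono::steady_clock::time_point fStart;
};
```

Placing a `ScopedTimer` at the top of a function body attributes the whole call to its label. The timer itself adds overhead on every call, so this is best suited to comparing a few heavy functions and not to measuring very cheap ones.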

In v6.24 you can also activate verbose RDF logging by adding this line at the beginning of the program:

auto verbosity = ROOT::Experimental::RLogScopedVerbosity(ROOT::Detail::RDF::RDFLogChannel(), ROOT::Experimental::ELogLevel::kInfo);

(the Python equivalent should be the same with . instead of :: as usual).
That will give you precise timings of the RDF event loop and of the time RDF spends just-in-time compiling “stuff” – if the latter is too high, there are tricks to reduce it.

Maybe we can start from these and see how it goes. I can also take a look at your application in case it’s possible to set it up and run it on my system.


Thanks @eguiraud , I will give it a try as soon as possible.
Let me write down the procedure to build a debug version here, for you but also for my own reference.

git clone 
cd FCCAnalyses

source /cvmfs/

export ROOT_INCLUDE_PATH=$PWD/install/include/FCCAnalyses:$ROOT_INCLUDE_PATH
export LD_LIBRARY_PATH=`python -m awkward.config --libdir`:$LD_LIBRARY_PATH

mkdir build install
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=../install -DCMAKE_BUILD_TYPE=RelWithDebInfo
make install
cd ..

then you can run like:

python examples/FCCee/flavour/Bc2TauNu/ p8_ee_Zbb_Bc2TauNu_stage1.root /eos/experiment/fcc/ee/generation/DelphesEvents/spring2021/IDEA/p8_ee_Zbb_ecm91_EvtGen_Bc2TauNuTAUHADNU/events_003907469.root

to run over one file, or like:

python examples/FCCee/flavour/Bc2TauNu/ p8_ee_Zbb_Bc2TauNu_stage1.root "/eos/experiment/fcc/ee/generation/DelphesEvents/spring2021/IDEA/p8_ee_Zbb_ecm91_EvtGen_Bc2TauNuTAUHADNU/events_03*.root"

to run over more files. I granted you read access to /eos/experiment/fcc/ee/

With this key4hep debug setup, we already have ROOT v6.24, so I could try your nice suggestion.


Alright, I’ll put it on my to-do list to run these two workloads and check if I see something weird.

Note that for performance tests you want -DCMAKE_BUILD_TYPE=RelWithDebInfo instead of Debug.

Hi @clementhelsens ,
did you manage to find out what needs optimizing exactly (if anything)?

I am trying to set this up on my workstation but I can’t load the setup from CVMFS. Can you suggest what configuration I need?


Sorry @eguiraud, I did not find time to investigate this further. From any lxplus machine I log in to, I can see this CVMFS repo mounted. I guess your workstation is within EP-SFT? If so, maybe the system administrator could have a look and mount it. Last week the mount got stuck on some machines and IT had to redo it; maybe the same thing is happening for you?

