RDataFrame performance for branch reduction and event filtering (slimming and skimming)

eguiraud · November 24, 2022, 3:50pm

/eos goes through the network-mounted filesystem, root:// goes through the xrootd protocol. The latter is specialized for reading ROOT files and performs less calls and just for the necessary bytes. Spread the word! Future ROOT versions will actually switch to root:// under the hood by default, but not v6.26. For now you can check how much difference it makes using root-readspeed (which will be included in ROOT v6.28, by the way).

About your questions, it will be hard to answer to 2.a without doing a comparison study in similar conditions to those in which you actually run your code, but I’ll see whether I can figure out what’s going on for 1. and 2.b. It could just be that things are working as intended, RDF is known to have a certain overhead compared to root-readspeed (the price of nice features) – but it could also be that there are some easy performance gains possible.

Cheers,
Enrico