RDataFrame performance for branch reduction and event filtering (slimming and skimming)

eguiraud · November 28, 2022, 2:33pm

Mmmh, your measurements look dominated by noise and/or other latencies. root:// should consistently demonstrate the smallest minimum runtimes though.
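
To make "minimum runtime" concrete, here is a minimal sketch of how one could compare configurations while being less sensitive to noise: repeat the measurement several times and look at the minimum (and the spread). The names `min_runtime` and `toy_workload` are made up for illustration; `toy_workload` is just a stand-in for the real event loop.

```python
import time


def min_runtime(fn, repeats=5):
    """Run fn() `repeats` times; return (minimum, all timings) in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings), timings


def toy_workload():
    # Hypothetical stand-in for the actual job (e.g. the skim/slim event loop).
    return sum(i * i for i in range(100_000))


best, timings = min_runtime(toy_workload, repeats=5)
spread = max(timings) - min(timings)
print(f"min = {best:.4f} s, spread over {len(timings)} runs = {spread:.4f} s")
```

If the spread is comparable to the differences between configurations, the comparison is probably not telling you much yet.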

@pcanal might be able to point you to some docs. The available algorithms are kZLIB, kLZMA, kLZ4, kZSTD. My understanding is the following: LZ4 is faster but compresses less well, and LZMA is slower but compresses more. ZLIB and ZSTD are a bit of a middle ground, but ZSTD is newer and typically closer to LZ4 speed while still having good enough compression.
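
As a rough illustration of the speed vs. compression trade-off, here is a small sketch using only the Python standard library, so it covers zlib and LZMA but not LZ4 or ZSTD. It does not use ROOT at all, and the payload is made-up, highly repetitive data, so the actual numbers will not match what you see on real trees; only the way of measuring is the point.

```python
import lzma
import struct
import time
import zlib

# Made-up payload: 50k (int, float) pairs with a lot of repetition.
payload = b"".join(
    struct.pack("<if", i % 100, (i % 50) * 0.5) for i in range(50_000)
)


def measure(name, compress, decompress):
    """Time one compression pass and report the compression ratio."""
    start = time.perf_counter()
    blob = compress(payload)
    elapsed = time.perf_counter() - start
    # Sanity check: the round trip must give back the original bytes.
    assert decompress(blob) == payload
    return {"name": name, "ratio": len(payload) / len(blob), "seconds": elapsed}


results = [
    measure("zlib", lambda data: zlib.compress(data, 6), zlib.decompress),
    measure("lzma", lzma.compress, lzma.decompress),
]

for r in results:
    print(f"{r['name']}: ratio = {r['ratio']:.1f}, time = {r['seconds']:.4f} s")
```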

Just a wild guess, but if the files are very small, the overhead of opening and closing the file might become important w.r.t. the actual event loop time. I’m happy to take a look though: “runtime does not scale linearly with number of events” is something that we should have a good explanation for. The expectation is certainly that it does scale linearly!
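
One way to check the fixed-overhead guess is to model the runtime as t(N) = t0 + k*N and fit t0 and k from a few measurements. A minimal sketch follows; the numbers are synthetic (I assume 0.5 s of fixed overhead and 2 microseconds per event just to have something to fit), and `fit_line` is a hand-written least-squares helper, not anything from ROOT.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Synthetic data: replace with your own (number of events, runtime in s) pairs.
events = [1_000, 10_000, 100_000, 1_000_000]
runtimes = [0.5 + 2e-6 * n for n in events]

overhead, per_event = fit_line(events, runtimes)
print(f"fixed overhead ~ {overhead:.3f} s, per-event cost ~ {per_event:.2e} s")

# Fraction of the total runtime spent in the fixed overhead, per file size.
for n, t in zip(events, runtimes):
    print(f"{n:>9} events: overhead fraction = {overhead / t:.2f}")
```

If the fitted t0 is large compared to k*N for your typical file, the runtime will look flat rather than linear in the number of events, even though the event loop itself scales fine.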

Cheers,
Enrico