Drawing a histogram using RDataFrame without loading everything into RAM

Hi, it seems that ROOT loads the entire dataset into memory when drawing with RDataFrame::Histo1D(), so it cannot handle larger-than-memory datasets.

For example, this uses up all my RAM and triggers an OOM (large.root is ~60 GB and I have 32 GB of RAM):

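ROOT::RDataFrame rdf("large_tree", "large.root");  // open the tree "large_tree" from large.root
auto h1 = rdf.Histo1D("eV");  // lazily books a histogram of column "eV"; the event loop runs on first access to h1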

Using the TFile/TTree interface does not have this problem:

auto f = new TFile("large.root");
auto t = f->Get<TTree>("large_tree");  // f is a pointer, so use -> rather than .

Is there a way to achieve the performance of the old interface while using RDataFrame?

ROOT Version: 6.18/02
Platform: Linux x64
Compiler: GCC

ROOT does not load all data into memory when using RDataFrame (internally, RDF uses TTreeReader, which is based on TTree, so fundamentally the I/O technology is the same). There must be something else going on here.

Can you share a minimal reproducer?

EDIT: alternatively, especially if your ROOT build has debug symbols, you could run your reproducer with valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --tool=massif to profile what is allocating so much memory.

