I think there is a memory leak, or a problem with the way ROOT reads a large TTree.
I have a large ROOT file with a TTree (20 GB) that has only 5 branches. I’m reading this tree with PyROOT, and it gradually fills the RAM: after ~1 day it uses 100% of the memory (36 GB) and stops. Here is an example:
import ROOT

rfile = ROOT.TFile.Open("file.root")
mytree = rfile.inputNN  # get this tree
NtotInFile = mytree.GetEntries()
for jentry in range(NtotInFile):
    mytree.GetEntry(jentry)
    inx1 = mytree.index  # get an array with numbers
    pass
Any ideas about how to manage the memory consumption if I do not want to split this large ROOT file?
best, Sergei
ROOT Version: 6.26/04
Platform: Linux
Compiler: gcc
Thanks for the post. Is there a way in which we can reproduce this issue?
I would also advise moving to a more recent ROOT version, e.g. 6.34: the Python interface is one of the ROOT components that has evolved most radically in the past two years.
I’m not sure where to put this large ROOT file. But I can confirm that I see the same problem using ROOT Version: 6.32.02 from LCG106. The data keep accumulating in RAM.
Also, a bit more info. The events have “<class cppyy.gbl.std.vector”
So it looks like Python cannot clear this vector after reading it. Maybe PyROOT does not know how to release the resources allocated by the std::vector after reading an event.
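One way to test this hypothesis is to check whether the growth is visible to Python's own allocator at all. The sketch below is only an illustration, not something from the thread: the helper names current_rss_mb and attribute_growth are made up here, and the workload is a plain Python stand-in so the snippet runs without ROOT. It compares the process's resident memory with what tracemalloc reports. tracemalloc only sees allocations made through Python's allocators, so memory allocated on the C++ side would show up in the resident size but not in the tracemalloc figure.

```python
import gc
import os
import resource
import tracemalloc


def current_rss_mb():
    """Resident memory of this process in MB (hypothetical helper)."""
    try:
        # Linux: second field of /proc/self/statm is the resident page count.
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE") / 1024**2
    except OSError:
        # Fallback: peak RSS (kilobytes on Linux), good enough to see growth.
        return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def attribute_growth(step, n_steps):
    """Call step(i) n_steps times and report memory growth in MB.

    'rss_mb' is growth of the whole process; 'python_mb' is growth of
    memory allocated through Python's allocators, as seen by tracemalloc.
    """
    gc.collect()
    tracemalloc.start()
    rss_before = current_rss_mb()
    py_before, _ = tracemalloc.get_traced_memory()
    for i in range(n_steps):
        step(i)
    gc.collect()  # drop anything the garbage collector can free
    py_after, _ = tracemalloc.get_traced_memory()
    rss_after = current_rss_mb()
    tracemalloc.stop()
    return {
        "rss_mb": rss_after - rss_before,
        "python_mb": (py_after - py_before) / 1024**2,
    }


# Stand-in workload: objects kept alive on the Python side.
kept = []
report = attribute_growth(lambda i: kept.append([0.0] * 1000), 2000)
print(report)
```

For the loop in the first post, step would be something like a function that calls mytree.GetEntry(i) and then reads mytree.index. If rss_mb keeps growing while python_mb stays close to zero, the memory is being held outside Python's allocator, which would be consistent with the std::vector explanation but does not prove it.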
Below is a zip to reproduce the issue if you are interested. You will find a debug.py and a debug.root file; run python debug.py under ROOT 6.26/04 (other versions would probably show the same issue, according to Sergei).
The issue appears when reading the contents of a vector of doubles.
The relevant line is event.proj[0], marked in the script with the comment "This line causes a memory leak". When it is skipped, the memory used stays at about 130 MB. When it is executed, as in the code, the memory consumption grows by about 50 MB per second.
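To put numbers like these on a firmer footing, the same loop can be timed with and without the suspect access. This is a minimal sketch with assumed names (peak_rss_mb, growth_rate, with_access, without_access are all hypothetical), and the two workloads are pure-Python stand-ins rather than the actual debug.py, which is not reproduced here.

```python
import resource
import time


def peak_rss_mb():
    # ru_maxrss is reported in kilobytes on Linux.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss / 1024.0


def growth_rate(step, n_steps):
    """Return (growth of peak RSS in MB, elapsed seconds) for n_steps calls."""
    start_mb = peak_rss_mb()
    start_t = time.perf_counter()
    for i in range(n_steps):
        step(i)
    return peak_rss_mb() - start_mb, time.perf_counter() - start_t


leaked = []


def with_access(i):
    # Stand-in for the loop body that reads the vector: memory is retained.
    leaked.append(bytearray(10_000))


def without_access(i):
    # Stand-in for the loop body with the suspect line skipped.
    pass


mb_without, s_without = growth_rate(without_access, 5000)
mb_with, s_with = growth_rate(with_access, 5000)
print("without access: %.1f MB in %.3f s" % (mb_without, s_without))
print("with access:    %.1f MB in %.3f s" % (mb_with, s_with))
```

Replacing the two stand-ins with the real loop body, once with event.proj[0] and once without, would give a growth figure per run that can be compared across ROOT versions. Peak RSS is used because it is available from the standard library; it can only go up, so it shows growth but not memory being returned.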