I have a pretty basic macro that chains together trees from different files, loops over the chain, and fills some of the data into another “mini-tree”. I’m surprised at how much memory it’s consuming, and running valgrind I see memory leaks reported from `TTreeReader::Next`. I’m deleting everything I create with `new`. I’ve tested both looping with `TTreeReader::Next()` and with the `GetEntries` method, which similarly shows a memory leak.
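For reference, the loop follows this general pattern (a minimal sketch, not the attached macro; the tree, file, and branch names here are placeholders):

```cpp
#include "TChain.h"
#include "TFile.h"
#include "TTree.h"
#include "TTreeReader.h"
#include "TTreeReaderValue.h"

void skim() {
   // Chain together trees of the same name from several files
   TChain chain("mytree");       // placeholder tree name
   chain.Add("input_*.root");    // placeholder file pattern

   // Output file and mini-tree
   TFile out("minitree.root", "RECREATE");
   TTree mini("mini", "slimmed tree");
   float pt = 0.f;
   mini.Branch("pt", &pt);

   // Loop over all chained entries with TTreeReader
   TTreeReader reader(&chain);
   TTreeReaderValue<float> ptIn(reader, "pt");  // placeholder branch

   while (reader.Next()) {
      pt = *ptIn;
      mini.Fill();
   }
   out.Write();
}
```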
I’ve attached a macro as well as a valgrind log file. Any help here is much appreciated.
I am running valgrind with a root suppression file like so:
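(The exact command was not included; a typical invocation using the suppression file shipped with ROOT, assuming `$ROOTSYS` is set and `mymacro.C` stands in for the attached macro, looks like:)

```
valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp root.exe -l -b -q mymacro.C
```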
Thanks a lot for this post and welcome to the ROOT community!
Before diving into this matter, may I ask if you used the ROOT valgrind suppression file?
If I understand correctly, you are, de facto, merging trees: why not use hadd? If some custom operation is needed and this is a simplified reproducer, could RDataFrame be of help?
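If the goal is merging plus some slimming, an RDataFrame version could look like the following (a minimal sketch; the tree, file, column names, and the cut are placeholders):

```cpp
#include "ROOT/RDataFrame.hxx"

void slim() {
   // Read the same tree from many files, as a TChain would
   ROOT::RDataFrame df("mytree", {"file1.root", "file2.root"});

   // Apply a selection and write only the needed columns to a new file
   df.Filter("pt > 20")  // placeholder cut expression
     .Snapshot("mytree", "slimmed.root", {"pt", "eta"});  // placeholder columns
}
```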
Could you slim down your example to something we can run and make the input available, e.g. on eos?
Also, one consideration: if I am not wrong, Valgrind is reporting 39,040 bytes lost, i.e. ~39 kB, which is small to the point of being negligible. Apologies if that’s the case, but am I missing something?
I’d like to add in some requirements that slim the trees using composite objects. I don’t think I can do this with hadd. I put an example here: CERNBox along with a couple of data files and files showing the commands to run.
This is indeed a very small amount of memory when running on one file. When I add a couple more files, it doesn’t seem to increase (I wouldn’t expect it to, based on how I use GetEntries); however, when I run over a very large number of files I hit the batch system’s (HTCondor) memory limit. Maybe the valgrind log is a red herring. I’m giving it another go on the batch system to reproduce the logs and have a closer look.
The batch system held one of my jobs for the following reason: “Error from slot1_8@b9s07p6605.cern.ch: Job has gone over cgroup memory limit of 2000 megabytes. Peak usage: 186562 megabytes. Consider resubmitting with a higher request_memory.”
Of course, as you point out, the memory leak cannot be the problem here. Can making a very large TChain to read the files cause this sort of memory consumption? I’ve increased the requested memory, but perhaps there is a better solution.
It does not disappear when I run with ROOT 6.30.04.
I think the issue may be that I declared the output tree before opening the file it is written to. What then happens is that when the baskets are flushed, the tree is kept in RAM instead of being written to disk. If I monitor with htop, I see the memory steadily increase when the tree is declared before the file, whereas it stays steady if I declare the tree after opening the file. Does this make sense? I’m rerunning over the full dataset now; hopefully with this fix I won’t run into the memory limits of the batch system machines.
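For the record, the difference boils down to the order of two lines (a minimal sketch with placeholder names): a TTree attaches itself to the directory that is current when it is constructed, so if the output file is not open yet, the tree belongs to no file and its baskets accumulate in memory.

```cpp
#include "TFile.h"
#include "TTree.h"

void makeMini() {
   // Problematic order: the tree is created before any output file is
   // open, so it is not attached to the file and baskets stay in RAM:
   //   TTree mini("mini", "mini-tree");
   //   TFile out("mini.root", "RECREATE");

   // Correct order: open the file first; the tree created afterwards is
   // owned by it and baskets are flushed to disk as the tree grows.
   TFile out("mini.root", "RECREATE");
   TTree mini("mini", "mini-tree");
   // ... declare branches, fill the tree, then:
   out.Write();
}
```

Alternatively, an already-created tree can be re-attached to an open file with `mini.SetDirectory(&out)` before filling.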