A possible memory leak in TTree

Hello,

I’ve noticed that the memory use of the ROOT process increases gradually while I read from a TTree. I am not sure whether this is some kind of caching, but I tried to disable the caches and the result did not change.

Tested on a TTree from this link: noice_traces_merged_gp13_2024_01_night_datafiles_185-190.root - Google Drive

The first GetEntry() increases the resident memory use by ~53.5 MB and the virtual memory by ~182 MB. The subsequent ~6000 calls to GetEntry(j) gradually increase the resident memory by ~68 MB. The virtual memory first drops slightly, then returns to about the same value as after the first GetEntry().

So the memory use of the process increases by ~68 MB while reading ~6600 entries.

This is the code to reproduce:

// Process memory snapshot; ProcInfo_t reports memory in KB
ProcInfo_t procinfo;
const float toMB = 1.f / 1024.f;

gSystem->GetProcInfo(&procinfo);

string data_file = "noice_traces_merged_gp13_2024_01_night_datafiles_185-190.root";
cout << "Memory before looping: " << procinfo.fMemResident * toMB << "   " << procinfo.fMemVirtual * toMB << endl;

auto f = TFile(data_file.c_str(), "read");
f.SetBufferSize(0);
TTree *t = (TTree*)f.Get("tadc");
// Optional: switch off the TTreeCache
//t->SetCacheSize(0);
//t->StopCacheLearningPhase();
const Long64_t nentries = t->GetEntries();
for(Long64_t j=0; j<nentries; ++j)
{
    // Read all branches of entry j, then print resident and virtual memory in MB
    t->GetEntry(j);
    gSystem->GetProcInfo(&procinfo);
    cout << "Memory after get entry " << j << ": " << procinfo.fMemResident * toMB << "   " << procinfo.fMemVirtual * toMB << endl;
}

ROOT Version: Current master
Platform: Fedora 42


Hello @LeWhoo,

let me add @pcanal to the loop.

Cheers,
Monica

Thanks for opening this post! Let’s see what @pcanal says. I just wanted to chime in to mention that this is a spinoff of the following GitHub issue:

Does the memory increase past the current maximum if you run over the tree a second time or more, i.e.:

for(int repeat = 0; repeat < 10; ++repeat) {
  for(int j=0; j<t->GetEntries(); ++j)
  {
      t->GetEntry(j);
      gSystem->GetProcInfo(&procinfo);
      cout << "Memory after get entry " << j << ": " << procinfo.fMemResident * toMB << "   " << procinfo.fMemVirtual * toMB << endl;
  }
}

If it does not, then the memory fluctuation is just due to slight differences in the basket sizes and possibly to some memory ‘hoarding’ (holding on to buffers that are slightly too large) meant to reduce memory churn. Too many memory allocations and deallocations both hurt scaling with the number of threads and can increase the amount of virtual memory needed, due to memory fragmentation.

There is a difference, but very slight:

What about the virtual size?

After the first iteration it stays constant:

This confirms that there is no leak but rather a ‘slow’ ramp-up to the maximum memory usage. To reduce the usage you can reduce the size of the TTreeCache (which also makes the reading a bit slower) or replace the call to TTree::GetEntry with calls to TTree::LoadTree and TBranch::GetEntry for ‘just’ the branches you actually need.
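A minimal sketch of both suggestions as a ROOT macro follows. Two things in it are assumptions and not taken from this thread: the branch name "trace_ch" is a placeholder for whichever branch you actually need, and the 1 MB cache size is an arbitrary value chosen only for illustration.

```cpp
// ROOT macro sketch; run with: root -l -q read_one_branch.C
#include "TBranch.h"
#include "TFile.h"
#include "TTree.h"
#include <iostream>
#include <memory>

void read_one_branch()
{
    std::unique_ptr<TFile> f(TFile::Open(
        "noice_traces_merged_gp13_2024_01_night_datafiles_185-190.root", "READ"));
    if (!f || f->IsZombie()) {
        std::cerr << "Cannot open the input file" << std::endl;
        return;
    }

    auto t = f->Get<TTree>("tadc");
    if (!t) {
        std::cerr << "Tree 'tadc' not found" << std::endl;
        return;
    }

    // Option 1: shrink the TTreeCache (size in bytes; 1 MB is an arbitrary choice).
    t->SetCacheSize(1000000);

    // Option 2: read only the branch that is needed.
    // "trace_ch" is a hypothetical name; replace it with a real branch of the tree.
    TBranch *b = t->GetBranch("trace_ch");
    if (!b) {
        std::cerr << "Branch not found" << std::endl;
        return;
    }

    const Long64_t nentries = t->GetEntries();
    for (Long64_t j = 0; j < nentries; ++j) {
        // LoadTree positions the tree on entry j and returns the local entry number.
        const Long64_t local = t->LoadTree(j);
        if (local < 0) break;
        // Only the baskets of this one branch are read and decompressed.
        b->GetEntry(local);
    }
}
```

With this pattern the baskets of the other branches are never loaded, so the peak memory should scale with the branches that are read, not with the whole tree.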

Thanks. I am fine with this memory use. As it was growing with the number of entries read, I suspected a memory leak and was afraid that it would be much more serious with bigger trees, but your explanation has put my mind at ease.
