TTree::Fill memory leak

noam · November 23, 2010, 6:39pm

Hello rooters,

I have written an analysis framework based almost entirely on root.
With the increasing amount of ATLAS data, I ran into a very strange issue.

When producing my custom tree with ~400 branches, most of type vector*, I observed a very serious memory leak that eats up all the cluster’s memory after a few (~15) hours, after which the process just stops.

The strange point is that while trying to find this leak, I removed the single tree->Fill() call and the memory leak has completely disappeared.

I am desperate since I tried everything I found in RootTalk and elsewhere without any success.

I’m sure to define the TFile* before the TTree* and I also do tree->SetDirectory(file).
I tried to call SetAutoSave, AutoSave, FlushBaskets, SetMaxVirtualSize, SetBasketSize, SetCacheSize, OptimizeBaskets, individually and with combinations…

Tracking the process using “top”, it seems that the tree is being kept both on disk (i.e. in the TFile, where I set tree->SetMaxTreeSize(100Mb) so I have a few files filled nicely) and in virtual memory at the same time for the entire run. Shouldn’t the memory be cleaned from time to time?

After a few hours of running it reaches >1Gb and later on the machine gets stuck, with my process taking ~6Gb.

I debugged my code and came down to the conclusion that the only memory leak is caused when I call tree->Fill().
Also, I almost don’t have “new” statements and the ones I do have are followed by proper “delete” statements.

The code is pretty complex to be attached here…
I’m using ROOT 5.26/00.

Please help !

Cheers,
Noam

pcanal · November 23, 2010, 7:58pm

Hi,

[quote]shouldn’t the memory be cleaned from time to time ?[/quote]
Absolutely. As far as we know, the only memory increase in the case you describe should be around 3 long long per branch per basket (and each basket should hold many entries …).

What is the size of your resulting ROOT file in number of entries and in bytes on file (and if you can do tree->Print(), you should also see the number of baskets)?

[quote]The code is pretty complex to be attached here…[/quote]

If you send me a small version of the resulting ROOT file, I might be able to see if I can reproduce the leak being in TTree::Fill …

Cheers,
Philippe.

PS. “most of type vector*,”: I assume that you made sure that those vectors are cleared between each call to Fill …

noam · November 23, 2010, 9:24pm

Hi Philippe,

For the vectors, I don’t have to clear them since per event I do, e.g. what’s inside the “for” scope:

treeFile = new TFile( "digestTree.root", "RECREATE");
tree = new TTree("digest", "digest");
tree->SetDirectory(treeFile);
tree->SetMaxTreeSize(100000000); // 100Mb per file
tree->OptimizeBaskets();
...
tree->Branch( "mu_staco_px",        &mu_staco_px );
tree->Branch( "mu_staco_py",        &mu_staco_py );
tree->Branch( "mu_staco_pz",        &mu_staco_pz );
...
for(...events...)
{
   mu_staco_px       = m_phys->mu_staco_px;
   mu_staco_py       = m_phys->mu_staco_py;
   mu_staco_pz       = m_phys->mu_staco_pz;
   ...
   tree->Fill();
}
...

where, for example, vector<float>* mu_staco_px is a branch that belongs to my class that handles the tree operations, and vector<float>* m_phys->mu_staco_px is a branch taken from the input tree.
The input tree(actually it is a chain) belongs to a class named “physics” which I produced by using MakeClass on one of the input dataset files.
So I pass one pointer to another and the “clear” call is not needed (neither the preceding “push_back” call).

I know that this writes to file exactly what I want to write and that by removing the tree->Fill() call inside the “for” scope, I completely remove the leak, so I guess that this is safe.

I have uploaded a sample root file with the output tree of a ~3 minutes run:
http://physics.tau.ac.il/noam/digestTree.root
The memory went from ~130Mb to ~730Mb…
The root file is ~50Mb.

I attach also some of the output of tree->Print() in:
http://physics.tau.ac.il/noam/tree_Print.txt

Thanks,
Noam

pcanal · November 23, 2010, 9:49pm

[quote]The input tree(actually it is a chain) belongs to a class named “physics” which I produced by using MakeClass on one of the input dataset files.
So I pass one pointer to another and the “clear” call is not needed (neither the preceding “push_back” call).[/quote]
Are you saying that in the result of MakeClass you have (uncommented) data members that are vectors, and that you pass those along directly? In that case, did you also remove the SetMakeClass call from the result of MakeClass?

Philippe.

noam · November 23, 2010, 10:00pm

Hi Philippe,

In the result of MakeClass I do have uncommented data members which are of type vector* etc.

I do pass them along directly to my own class’s corresponding data members (which I want to eventually write).

In the header file for “physics” produced from MakeClass, the only reference to SetMakeClass (that I found) is:

// Set branch addresses and branch pointers
if (!tree) return;
fChain = tree;
fCurrent = -1;
fChain->SetMakeClass(1);

should I comment out “fChain->SetMakeClass(1);” ?
If so, why ?

Thanks,
Noam

pcanal · November 23, 2010, 10:11pm

[quote]should I comment out “fChain->SetMakeClass(1);” ?
If so, why ?[/quote]
The MakeClass flag tells the TTree that you want to use the data in non-object mode (only ints, floats, doubles and arrays thereof), so you cannot really use any objects (including vector). So please try without it (but then you also have to check that your input file does not contain any object that you want to deconstruct … if that is the case, you will need to use MakeProxy rather than MakeClass).

Philippe.

noam · November 23, 2010, 10:34pm

Hi Philippe,

I tried it without the “fChain->SetMakeClass(1);” call in the “physics” header file and there’s no change.

I didn’t understand what should not work for me if this line is uncommented.
Everything worked “good” (didn’t have problems in using these vectors whatsoever) until I started to use bigger chunks of data and then the leak, which was probably there before, became noticeable.

What can possibly be wrong in “tree->Fill()” if nothing else leaks ?

Thanks,
Noam

brun · November 24, 2010, 7:48am

I do not see any leak when reading your tree. I did the following

root > TFile f("digestTree.root");
root > digest.MakeClass("T");

then I instrumented T.C to print the real and virtual memory of the process in the loop over entries (see T.C in attachment), and I ran the following session

root > .L T.C+
root > T t
root > t.Loop()

Could you run this session and let us know if you observe a leak?

Rene
T.C (1.59 KB)
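
A memory printout like the one described can also be sketched without ROOT. The version below is not the attached T.C. It is a minimal standalone example that assumes a POSIX system and uses getrusage(), which reports only the peak resident size (kilobytes on Linux, bytes on macOS) and not the virtual size. The names peak_rss and log_peak_rss are made up for illustration.

```cpp
#include <cstdio>
#include <sys/resource.h>

// Peak resident set size of the current process as reported by POSIX
// getrusage(). Units are platform dependent (kilobytes on Linux).
// Returns -1 if the call fails.
long peak_rss()
{
   struct rusage usage;
   if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
   return usage.ru_maxrss;
}

// Hypothetical helper: print the peak RSS next to the entry number,
// e.g. once every few thousand entries inside an event loop.
void log_peak_rss(long long entry)
{
   std::fprintf(stderr, "entry=%lld peak_rss=%ld\n", entry, peak_rss());
}
```

Calling log_peak_rss(i) every N entries gives a rough profile: a value that keeps climbing through the whole run points to a leak, while a single jump followed by a plateau does not.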

noam · November 24, 2010, 10:44am

Hi Rene,

I attached the output of the code you sent.

I didn’t observe a major leak, but the file may be too small for that.
However, my problem is not related to reading the input tree but rather to writing the output tree.

As I mentioned, everything works beautifully (i.e. no leak, but no output tree either…) when I simply comment out the “tree->Fill()” call.
With this line, however, I see that the output tree is being written correctly to a file (multiple files), and the only issue is the increase in memory.

Thanks,
Noam
out.txt (2.04 KB)

pcanal · November 24, 2010, 8:59pm

Hi,

I still cannot reproduce the leak you mentioned. I created a MakeClass from your file, left the SetMakeClass call in and did a copy:
[code]
#include "d01.C"

// n < 0 (the default) means: copy all entries
void run(Long64_t n = -1)
{
   TFile *_file0 = TFile::Open("digestTree2.root"); // 8Gb files …
   TTree *t = (TTree*)_file0->Get("digest");
   d01 d(t);
   TFile *f2 = new TFile("copy.root","RECREATE");
   TTree *newtree = t->CloneTree(0);   // same structure, no entries copied yet
   newtree->SetMaxTreeSize(100*1024*1024);
   Long64_t nentries = n>=0 ? n : t->GetEntries();
   for(Long64_t i = 0; i < nentries; ++i)
   {
      if ((i%1000)==0) fprintf(stderr,"i=%lld\n",i);
      Long64_t s = t->GetEntry(i);
      newtree->Fill();
   }
   // Fill() may have switched to a new file once the max tree size was exceeded
   f2 = newtree->GetCurrentFile();
   newtree->Write();
   d.fChain = 0;
   delete f2;
   delete _file0;
}
[/code]
I did this test with both the trunk and the latest patch of the v5.26/00 branches. Both cases have a constant memory profile.

You may still want to try your example again with either the trunk or v5.27/06, as we removed a source of potential memory fragmentation in the metadata of the TTree.

Cheers,
Philippe.

noam · November 24, 2010, 9:05pm

Hi Philippe, Rene,

I think I found a work-around for my memory leak problem.
Triggered by something Philippe said in his 1st reply, I changed my fill scheme from this:

treeFile = new TFile( "digestTree.root", "RECREATE");
tree = new TTree("digest", "digest");
tree->SetDirectory(treeFile);
...
tree->Branch( "mu_staco_px",        &mu_staco_px );
...
for(...events...)
{
   mu_staco_px       = m_phys->mu_staco_px;
   ...
   tree->Fill();
}

to this:

treeFile = new TFile( "digestTree.root", "RECREATE");
tree = new TTree("digest", "digest");
tree->SetDirectory(treeFile);
...
mu_staco_px = new vector<float>; // <==== this is new
tree->Branch( "mu_staco_px",        &mu_staco_px );
...
for(...events...)
{
   int n = (int)m_phys->mu_staco_px->size();
   for(int i=0 ; i<n ; i++)
   {
      mu_staco_px->push_back( m_phys->mu_staco_px->at(i) ); // <==== this is new
      ...
    }
   ...
   tree->Fill();
   mu_staco_px->clear(); // <==== this is new
}

I made 3 memory graphs to prove my point:
The first graph is done using the first fill scheme with the “tree->Fill()” call uncommented:

The second graph is done using the second fill scheme with the “tree->Fill()” call commented out (the result is the same also for the first fill scheme with “tree->Fill()” call commented out…):

The third graph is done using the second fill scheme with the “tree->Fill()” call uncommented:

An obvious leak is seen in the 1st graph so there’s a problem with the 1st fill scheme since there’s no leak if the “tree->Fill()” line is commented out.

In the 3rd graph there’s an overall steady behavior and I can live with the Heaviside-like increase as long as there’ll be no more jumps like this one…

In this example there are ~500k events.

Thanks,
Noam
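
The second fill scheme keeps the pointer registered with tree->Branch() fixed and copies the values into it, instead of rebinding the pointer on every event. A minimal standalone sketch of that copy step follows, assuming plain std::vector stand-ins for the ROOT side. The function name copy_event and its parameters are hypothetical and not from the original code.

```cpp
#include <cassert>
#include <vector>

// Copy one event's values into the output vector without changing the
// output pointer itself. 'in' stands for m_phys->mu_staco_px and 'out'
// for the vector whose address was given to tree->Branch(); both are
// stand-ins here, no ROOT is involved.
void copy_event(const std::vector<float> *in, std::vector<float> *out)
{
   assert(in != nullptr && out != nullptr);
   // assign() replaces the previous contents, so no separate clear() is needed
   out->assign(in->begin(), in->end());
}
```

In the loop above this would take the place of the inner push_back loop. Since assign() overwrites the previous event's values, the stored data should be the same as with push_back followed by clear() after Fill(). Whether it changes the memory profile was not tested in this thread.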

pcanal · November 24, 2010, 9:11pm

[quote] mu_staco_px->clear(); // <==== this is new[/quote]
If this helps, I think the problem is a combination of memory fragmentation (lots of mismatched allocations … this one should be helped by moving to v5.27/06 or higher) and the fact that your data vectors never shrink: whenever your data had an outlier with a high number of values (for an event), the vectors were increased in size.

Cheers,
Philippe.
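
The point about vectors never shrinking can be checked outside ROOT. The following is a minimal standalone sketch using only the C++ standard library: clear() sets the size to zero but keeps the allocated capacity, so a single outlier event leaves the buffer at its peak size for the rest of the job. The helper names capacity_after_clear and capacity_after_swap are made up for illustration. Swapping with an empty temporary is a common idiom for actually releasing the buffer; nothing in this thread says it is needed here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill a vector with n values, clear it, and return the capacity that remains.
std::size_t capacity_after_clear(std::size_t n)
{
   std::vector<float> v;
   for (std::size_t i = 0; i < n; ++i) v.push_back(1.0f);
   v.clear();                       // size becomes 0, the buffer is kept
   assert(v.empty());
   return v.capacity();
}

// Same, but hand the buffer to an empty temporary, which frees it
// when the temporary is destroyed at the end of the statement.
std::size_t capacity_after_swap(std::size_t n)
{
   std::vector<float> v;
   for (std::size_t i = 0; i < n; ++i) v.push_back(1.0f);
   std::vector<float>().swap(v);
   assert(v.empty());
   return v.capacity();
}
```

With n = 100000, the first function reports a capacity of at least 100000 even though the vector is empty, which matches the step-like increase in the third graph: memory rises when a large event arrives and then stays flat.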