Limiting reading buffer for TChain/TTree

Hi,

I have about 20 root files, each containing a TTree. Adding all file sizes together results in approximately 1GB. In my program I want to loop over all trees, so I use a TChain to add all the trees together.

The problem is that my computer only has 512MB of memory. When looping over the TChain I have the impression that root tries to load each file in one go into memory. The result is that my computer swaps like crazy for several minutes before the next file is opened and processed.

Is there any mechanism to tell root NOT to read in the whole tree at once? I tried calling TChain::SetMaxVirtualSize(), but it seems to have no effect.

Thanks for any help, Nils.

P.S. I am using root version 4.03/02 on RHEL3 (linux).

Processing 20 root files should not take more memory than processing one file. My guess is that you have a memory leak somewhere in your program. It could be that you do not delete some of the collections in your entries.

Rene

Hi Rene,

thanks for your reply. Unfortunately it is not a memory leak.

I have the problem even when opening only one file which is 50 MB on disk. It contains a TTree (only doubles and ints) which is named tMU. When I open the file interactively and then type

root [0] tMU <return>

the memory consumption of root goes up from ~30MB to ~180MB.

What I found suspicious is when I do

root [3] tMU.Print()       

I get

...
*Br   58 :set       : set/D                                                 *
*Entries :   401821 : Total  Size=    3231608 bytes  All baskets in memory   *
*Baskets :      100 : Basket Size=      32000 bytes  Compression=   1.00     *

I guess the “All baskets in memory” statement is suspicious. When creating the TTree (on a more powerful computer than mine) no file is open, so the whole tree has to reside in memory until at the end of the job a TFile is opened and the TTree is saved. Is this “all in memory” behaviour saved somehow in the TTree? And is there a way to tell root to not try to read the whole tree to memory first when reading the tree from the file?
Or is this a behaviour which is fixed at write time, i.e. I would have to open a file and create the TTree in the file’s directory before filling the tree?

As a cross check I opened another file (~2GB) which contains a TTree which instead of “All baskets in memory” tells me “File Size = …” and there I have no problem running over the TTree. The memory consumption is very low and there is no swapping.

Cheers, Nils.

Hi,

[quote=“gollub”]Hi Rene,

I guess the “All baskets in memory” statement is suspicious. When creating the TTree (on a more powerful computer than mine) no file is open, so the whole tree has to reside in memory until at the end of the job a TFile is opened and the TTree is saved. Is this “all in memory” behaviour saved somehow in the TTree? And is there a way to tell root to not try to read the whole tree to memory first when reading the tree from the file?
Or is this a behaviour which is fixed at write time, i.e. I would have to open a file and create the TTree in the file’s directory before filling the tree?
[/quote]

I’m having a similar problem. Was this question ever answered? Did you find a workaround?

mike

When tree.Print() reports branches with all baskets in memory, it means that you have created a memory-resident Tree instead of a file-resident Tree.

A memory-resident Tree is created when the current directory is the memory directory (no file open).

A file/disk-resident Tree is created with

TFile *f = new TFile("myfile.root","recreate");
TTree *T = new TTree("T","tt");

So the typical mistake is to do instead

TTree *T = new TTree("T","tt");
TFile *f = new TFile("myfile.root","recreate");

Rene

[quote=“brun”]

A file/disk-resident Tree is created with

TFile *f = new TFile("myfile.root","recreate");
TTree *T = new TTree("T","tt");

So the typical mistake is to do instead

TTree *T = new TTree("T","tt");
TFile *f = new TFile("myfile.root","recreate");

Rene[/quote]

I think that I’m doing the former rather than the latter. There are a few intervening lines, where I suppose gDirectory might get modified. I can certainly check that.

However, is there any workaround, perhaps involving reading and rewriting the file? I can probably do this on a system with enough memory (not my laptop!).

To be completely specific, I am experiencing large memory usage when using TTree::Draw() on a single variable in the tree. I’ve noticed, via top, that root’s memory usage actually spikes by 150M when I do

TTree* t = (TTree*) file->Get("mytree");

and then it increases by another 100M when the draw command is executed.

My example ntuple is at:
www.hep.ucl.ac.uk/~kordosky/pan_n143010.root

Thanks so much for the help!!

Mike,

I looked into your file. Your Tree is indeed a memory-resident Tree.
You should fix the problem when generating the Tree.

Rene