TChain, GetEntries, memory

Hi,

I have code that reads trees from a bunch of files, creates a TChain, and tries to analyze them …
The problem is the following:

I am monitoring the memory usage of the code.
When it reaches the line “chain->GetEntries()” it uses 0.6% of the machine's memory.
Then it counts, and after ~1 minute (the counting time does not surprise me …) the memory usage is ~30% …

Is this normal?

- reading the tree:

  sleep(10);//this gives me time to check the memory usage with top
  TFile* file = NULL;
  TDirectory* TD = NULL;
  TChain* DATA_TREE = NULL;

.
.
.
  if(!(Ifilen == "") && ifilen == ""){
    TString S = tpath;
    if(treen == ""){
      S = S + "DATA_TREE";
    }else{
      S = S + "/" + treen;
    }

    DATA_TREE = new TChain(S.Data());
    ifstream ifile(Ifilen);
    if(ifile){
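      std::string fname; // one file name per whitespace-separated token; no fixed-size buffer to overflow
      while(ifile >> fname){ // stops at end of file, and still picks up a last name without trailing newline
        cout << fname << endl;
        DATA_TREE->Add(fname.c_str());
      }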
    }
  }
  sleep(10);
.
.
.

… and here the memory usage is 0.6% …
and then the counting …

.
.
.
  sleep(10);
  Long64_t events = chain->GetEntries();
  sleep(10);
.
.
.

and here the memory usage is ~30%

ROOT 5.18/00, Linux …

Thanx for the help …
k.

Hi,

In order to return an accurate number, TChain::GetEntries must open every file, load the TTree object from it, and extract the number of entries in that file; hence the one minute. At the end of TChain::GetEntries, the TChain still ‘points to’ the last file and keeps the TTree object of that last file in memory. Until v5.20, the TTree object also held the last basket of every single branch (i.e. by default 32k per branch).

Cheers,
Philippe.

Do I understand correctly?
During the “counting”, the trees from every file are read into (and stored in) memory, and after TChain::GetEntries at least the tree from the last file stays in memory …
OK, but what happens to the trees from the other files (not the last one)?
Do they stay in memory? (I guess so, because during GetEntries the process occupies more and more memory, which won't be freed after GetEntries.)
Is there any way to avoid occupying that much memory after counting?

Thanx for your answer …

k.

[quote]Do they stay in memory?[/quote]No. They do not stay in memory. There is only one TTree object in memory at a time.

[quote] (I guess so, because during GetEntries the process occupies more and more memory, which won't be freed after GetEntries.)
Is there any way to avoid occupying that much memory after counting?
[/quote]However, until v5.20 there was a possible leak of one user object per top-level branch per file (although I think triggering it requires calling GetEntry).

What do you use to measure whether the memory is freed or not?

Cheers,
Philippe.

Thanx for your answer.

I make my code sleep for several seconds before and after the GetEntries call so that I can check its memory usage with “top” …
This is not necessarily the most rigorous way to do it, but I can see that the memory usage is small (I have no numbers here) before calling GetEntries, grows during GetEntries, and reaches ~2GB after GetEntries …

I can try to detach the reading of the chain from our framework and send the code itself …

Thanx, again,
k.

Hi,

Yes, 2 GB sounds excessive, so please send me a way to reproduce this issue.

Cheers,
Philippe.

Note that when calling chain.GetEntries(), only the TTree header of each file is loaded into memory, not the TTree data, unless you have created a memory-resident Tree.

Rene