Problems when handling 25000 trees

Dear ROOTers

In order to better understand the problem, please allow me first to explain the algorithm to be used,
called “quantile normalization”. Assuming a table with n rows and m columns, the algorithm is as follows:
1. for each column, determine the rank of each row
2. sort each column according to its rank
3. for each row, compute the mean of the sorted columns
4. replace each column with the common mean
5. sort each column individually according to its original rank

Attached are the macro “macroQuantileTest.C” and a demo table “TestAB.txt”, which implement the quantile
normalization identically to the implementation in my program, with the only difference that in my program
every column is replaced by a ROOT tree. (The original implementation of the algorithm in C uses a
linearized matrix and thus has severe memory problems when applied to large datasets.)

Until now the code worked very well even for large datasets, but now a user of my program has imported
25000 datasets as ROOT trees into a ROOT file, where each tree has more than 1.2 million entries (the ROOT
file has a size of about 140 GB).

Now the user wanted to apply the “quantile normalization” algorithm to all 25000 trees and experienced
a progressive increase in memory consumption and in processing time: after about 1 hr about 2000 samples
were processed using about 1.4 GB; after 22 hrs only 17000 samples were processed and memory had increased
to about 5 GB, so the user stopped the processing.

The user has compiled root_v5.24.00 for 64-bit on Mac OS X 10.4 and reported that most of the time was
spent in vm_map_enter and vm_map_lookup_entry of the mach_kernel library (although there was no apparent swapping).

The problem may already occur while filling the data trees, which I do as follows:

Int_t Normalize(Int_t numdata, TTree **datatree)
{
   // sort arrays for quantile normalization
   for (Int_t k=0; k<numdata; k++) {
      FillDataArrays(datatree[k], size, arrInten);
      AddArray(size, arrInten, arrMask, datatree[k]->GetName());
   }//for_k

   // quantile normalization
   Calculate(size, arrInten, arrInten, arrMask);

   // get normalized arrays and fill data trees
   for (Int_t k=0; k<numdata; k++) {
      arrInten = GetArray(size, arrInten, arrMask, datatree[k]->GetName());
      datatree[k] = FillDataTree(datatree[k], size, arrInten);
   }//for_k
   return 0;
}//Normalize

Here numdata is 25000 trees and size is 1.2 million entries. Functions AddArray(), Calculate() and GetArray()
are described in the attached macro and the other two functions are in principle as follows:

Int_t FillDataArrays(TTree *datatree, Int_t size, Double_t *inten)
{
   XGCCell *gccell = 0;
   datatree->SetBranchAddress("DataBranch", &gccell);

   for (Int_t i=0; i<size; i++) {
      inten[i] = gccell->GetIntensity();
   }//for_i

   SafeDelete(gccell);
   datatree->ResetBranchAddress(datatree->GetBranch("DataBranch"));
}//FillDataArrays

TTree *FillDataTree(TTree *oldtree, Int_t size, Double_t *inten)
{
   XGCCell *oldcell = 0;
   oldtree->SetBranchAddress("DataBranch", &oldcell);

   XGCCell *newcell = new XGCCell();
   TTree *newtree = new TTree(oldtree->GetName(), oldtree->GetTitle());
   newtree->Branch("DataBranch", "XGCCell", &newcell, 64000, 99);

   for (Int_t i=0; i<size; i++) {
      oldtree->GetEntry(i);
      newcell->SetIntensity(inten[i]);
      newtree->Fill();
   }//for_i
   newtree->Write("", TObject::kOverwrite);

   SafeDelete(newcell);
   newtree->ResetBranchAddress(newtree->GetBranch("DataBranch"));
   SafeDelete(oldcell);
   oldtree->ResetBranchAddress(oldtree->GetBranch("DataBranch"));
   SafeDelete(oldtree);

   return newtree;
}//FillDataTree

Moreover, I assume that the problem may also be in the following code fragment of function Calculate():

// Calculate mean values of all tree entries
   for (Int_t i=0; i<fNMean1; i++) {
      // read entry i from trees and fill array
      for (Int_t k=0; k<fNData; k++) {
         treek[k]->GetEntry(i);
         arr[k] = sortk[k];
      }//for_k

      // calculate mean of array
      fMean1[i] = Mean(fNData, arr);
      y[i]      = fMean1[i];
   }//for_i

Here fNData is 25000 trees and fNMean1 is 1.2 million entries.

Do you know what might be the reason for the progressive time and memory consumption?
Is there a way to avoid the progressive memory increase (since my intention is that users should be able
to use my program on computers with 1-2 GB RAM only)?
What is the behavior of the baskets in this case?

Best regards
Christian
macroQuantileTest.C (8.82 KB)
TestAB.txt (4.02 KB)

Hi Christian,

A few observations :-).

It seems that you require all the input to be in memory at the same time (Int_t Normalize(Int_t numdata, TTree **datatree)), so the memory will increase (by the size of the TTree metadata) with the number of trees. You may want to consider passing around TFile objects or filenames and loading each input TTree only as needed.

Your function FillDataArrays seems to be missing a datatree->GetEntry(i) (or a branch->GetEntry(i)).

The name of the function AddArray suggests that the memory would increase by the complete size of the array each time (10 MB for each column)?

Cheers,
Philippe.

Dear Philippe,

Thank you for your comments.

ad 2, FillDataArrays():
Functions FillDataArrays() and FillDataTree() are only code fragments, so you are right, datatree->GetEntry(i) is missing.

ad 3, AddArray():
Function AddArray(), described in macro “macroQuantileTest.C”, uses only the current array “arrInten”, which is obtained
from datatree[k] in function FillDataArrays(), to sort the array and fill a tree “rnktree” with the sorted array and the
rank. The memory should therefore not increase, since “arrInten” is used as a temporary array only.

ad 1, Normalize():
Only the pointers to all 23000 trees must be in memory. Is this considered to be a problem?
What do you mean with “memory will increase (by the size of the TTree meta data) with the number of trees”?

Best regards
Christian

In QuantileTest, if (fMean1) {delete fMean1; fMean1 = 0;} should read if (fMean1) {delete [] fMean1; fMean1 = 0;}.

[quote]Only the pointers to all 23000 trees must be in memory[/quote]I am not sure what you mean. If the pointers are not null (and I guessed they are not, since I see no initialization of them in your snippets of code), then they point to the TTree objects. Especially after reading through them once, each TTree object will also hold one buffer per branch (32k). If you have 10 branches per TTree, each TTree takes 320k and thus 720 MB in total. This number would increase if you increase the number of trees. However, in your case this is only a ‘base’ cost and does not explain the memory increase over time (the above-mentioned delete mismatch might, however).

Cheers,
Philippe.

Dear Philippe,

Once again my mistake in the demo code; however, fMean1 is only created and deleted once, so it does not increase memory.

At the moment I do not understand your calculation:
For 10 branches per TTree and 32k buffer you get 320k per TTree, but how do you get 720MB per TTree?

Best regards
Christian

Hi,

I am assuming/guessing that all the TTree objects are in memory at the same time (because I can’t spot the code that would prevent it). So 7 GB (I was off by a factor 10) would be the cost of holding all 23000 TTree objects in memory (32k × 23000 × 10).

So actually either you have many fewer than 10 branches per TTree, or your basket size is less than 32k … or the files have been written with ROOT 5.22 or newer … in which case the buffers are not loaded into memory when you first retrieve the TTree from the TFile but instead are allocated when you first read an entry from the TTree (and/or branch).

If I am right in this last supposition, you would be running out of space during the execution of FillDataArrays. In that case, you can reduce the memory footprint by using branch->GetEntry rather than tree->GetEntry and by calling tree->DropBaskets() at the end of the function (which should free the memory held by the baskets).
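For illustration, your FillDataArrays fragment would then become something like this (only a sketch, untested, keeping your XGCCell class and branch name):

```cpp
Int_t FillDataArrays(TTree *datatree, Int_t size, Double_t *inten)
{
   XGCCell *gccell = 0;
   TBranch *branch = datatree->GetBranch("DataBranch");
   branch->SetAddress(&gccell);

   for (Int_t i=0; i<size; i++) {
      branch->GetEntry(i);              // read only this branch, not the whole tree
      inten[i] = gccell->GetIntensity();
   }//for_i

   SafeDelete(gccell);
   datatree->ResetBranchAddress(branch);
   datatree->DropBaskets();             // free the memory held by the baskets
   return 0;
}//FillDataArrays
```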

Cheers,
Philippe.

Dear Philippe,

I see. Since my macro contains mostly the actual code of my program, you can see that the tree contains two branches,
which means that I would need 32k × 2 × 25000 ≈ 1.6 GB of memory. Since I need only one branch, I can call branch->GetEntry()
as you suggested, reducing memory consumption to about 800 MB.

Yes, the files were written with ROOT 5.24; nevertheless, it is not quite clear to me why I would be running out of
space during the execution of FillDataArrays().

First, I thought that baskets would be automatically deleted from memory when memory is full and more baskets are
read into memory. This seems not to be the case?

Second, I thought that calling datatree->ResetBranchAddress() will also free the memory occupied by the baskets.
Does this mean that whenever I call datatree->ResetBranchAddress() I should first call datatree->DropBaskets()
to free memory?

Since both functions, FillDataArrays() and FillDataTree(), are looping over all trees this means I should call
tree->DropBaskets() at the end of both functions. Is this correct?

How should I handle the other case:

// Calculate mean values of all tree entries
   for (Int_t i=0; i<fNMean1; i++) {
      // read entry i from trees and fill array
      for (Int_t k=0; k<fNData; k++) {
         treek[k]->GetEntry(i);
         arr[k] = sortk[k];
      }//for_k

      // calculate mean of array
      fMean1[i] = Mean(fNData, arr);
      y[i]      = fMean1[i];
   }//for_i

Should I loop over tree[k]->DropBaskets()?

Best regards
Christian

[quote]First, I thought that baskets would be automatically deleted from memory when memory is full and more baskets are
read into memory. This seems not to be the case? [/quote]The limit is checked only within a single TTree, and even then we keep one basket per branch.

Cheers,
Philippe

[quote]Second, I thought that calling datatree->ResetBranchAddress() will also free the memory occupied by the baskets.
Does this mean that whenever I call datatree->ResetBranchAddress() I should first call datatree->DropBaskets()
to free memory? [/quote]Well, not necessarily at the same time, but yes, to clean up all memory used by the TTree (short of deleting the TTree) you need both. ResetBranchAddress concerns the user object itself (i.e. the data in a form usable by you), while DropBaskets removes the loaded-in-memory, on-file representation of several entries of the branch.

Cheers,
Philippe.

Dear Philippe,

Thank you very much, I followed your advice.
I will see what the user of my program will report.
I am sure to have more questions soon.

Best regards
Christian

Dear Philippe,

Thank you once again for your help.
The user has just reported that the memory stays now below the 1 GB mark which is great.

Best regards
Christian

Dear Philippe,

It seems that I have another method where there may be a memory problem, this time during writing of 25000 trees.
Here is the relevant code fragment:

DoMultichipExpress(Int_t numdata, TTree **datatree)
{
// Create new trees exprtree
   exprtree = new TTree*[numdata];
   expr     = new XGCExpression*[numdata];

   for (Int_t id=0; id<numunits; id++) {
      DoSomething();
      // fill expression trees
      for (Int_t k=0; k<numdata; k++) {
         expr[k]->SetUnitID(unitID);
         expr[k]->SetLevel(level[k]);
         exprtree[k]->Fill();
      }//for_k
   }//for_id

// Write expression trees to file 
   for (Int_t k=0; k<numdata; k++) {
      exprtree[k]->Write("", TObject::kOverwrite, 0);
   }//for_k

   for (Int_t k=0; k<numdata; k++) {
      SafeDelete(expr[k]);
      exprtree[k]->ResetBranchAddress(exprtree[k]->GetBranch("ExprBranch"));
      SafeDelete(exprtree[k]);
   }//for_k
}//DoMultichipExpress

Here numunits can be 1000000 and numdata is 23000 trees.

Do you think I should use the following code:

DoMultichipExpress(Int_t numdata, TTree **datatree)
{
// Create new trees exprtree
   exprtree = new TTree*[numdata];
   expr     = new XGCExpression*[numdata];

   for (Int_t id=0; id<numunits; id++) {
      DoSomething();
      // fill expression trees
      for (Int_t k=0; k<numdata; k++) {
         expr[k]->SetUnitID(unitID);
         expr[k]->SetLevel(level[k]);
         exprtree[k]->Fill();

//=> add lines:
         if (id%10000 == 0) {
            exprtree[k]->AutoSave("FlushBaskets");
            exprtree[k]->DropBaskets();
         }//if         
      }//for_k
   }//for_id

// Write expression trees to file 
   for (Int_t k=0; k<numdata; k++) {
      exprtree[k]->Write("", TObject::kOverwrite, 0);
   }//for_k

   for (Int_t k=0; k<numdata; k++) {
      SafeDelete(expr[k]);
      exprtree[k]->ResetBranchAddress(exprtree[k]->GetBranch("ExprBranch"));
      SafeDelete(exprtree[k]);
   }//for_k
}//DoMultichipExpress

Should I use AutoSave(“FlushBaskets”) or AutoSave()?
If I use AutoSave(“FlushBaskets”) do I still need to do DropBaskets()?
Do I still need to call exprtree[k]->Write() later?
If not can I nevertheless call exprtree[k]->Write() later?

Best regards
Christian

Dear ROOTers,

Sadly, until now I have not received a reply to my last question, so please allow me to explain my problem in more detail:

This time I need to apply another algorithm, called “median polish”, which takes a matrix of n rows times m columns as
input and returns a vector of size m. As you might already expect, the m columns are stored in m=25000 trees, but this
time the n rows are 50 consecutive tree entries, where each tree contains about 1 million entries.

Thus I need to do the following:

  • get the 1st 50 entries 1-50 from each tree where I need to loop over 25000 trees for each entry.
  • do median-polish resulting in an array of size 25000.
  • save this array as 1st entry in each of 25000 new trees
  • get the 2nd 50 entries 51-100 from each tree where I need to loop over 25000 trees for each entry.
  • do median-polish resulting in an array of size 25000.
  • save this array as 2nd entry in each of 25000 new trees
  • etc until each new tree is filled with about 20000 entries

As you see, to loop over the tree entries I need to load 25000 baskets into RAM resulting in 32k x 25000 = 800 MB RAM.
Furthermore, to save each result as entry in a new tree I need to create another 25000 baskets using also 800 MB RAM.
On a typical computer with 2 GB RAM this means that loading the next 25000 baskets will crash the program, is this correct?

Thus before loading the next baskets I need to remove the current baskets from memory.

As far as I understand this means that in the case of reading tree entries I need to call tree->DropBaskets() after
looping over 32768 entries, is this correct?

In the second case of filling the new trees I need to call tree->AutoSave() for every 32768 entries, is this correct?
Is calling tree->AutoSave() sufficient or do I need to call tree->AutoSave(“FlushBaskets”)?
As far as I understand, I do not need to call tree->DropBaskets() afterwards?

There is also a parameter “fMaxVirtualSize”. Would tree->SetMaxVirtualSize(32768) thus result in the same effect as above?
Would calling tree->SetMaxVirtualSize(32768) be sufficient?

My other problem is that although I can write a macro which is able to do this, I do not see how I could test whether such a
macro really does what I expect. (I do not have the equipment to test this with a couple of thousand trees.)
Do you have any ideas?
Do you have any ideas?

Best regards
Christian

Hi Christian,

Please be patient :-). In July and August, for the obvious reasons :-), it takes us a little longer than usual to process the questions (especially those that require careful analysis).

Cheers,
Philippe.

Dear Philippe,

Thank you for taking care, and apologies for not being patient. I was afraid that you had missed my former reply
or that you thought you had already answered my question before.

In any case, I also had the feeling that my additional question was not satisfactory, and so I tried to rephrase it.
I hope that my explanation of the steps needed for median-polish shows you more clearly what my current problem is.

Please take your time and once again my apologies.
Best regards
Christian

[quote]My other problem is that although I can write a macro which is able to do this, I do not see how I could test whether such a
macro really does what I expect. (I do not have the equipment to test this with a couple of thousand trees.) [/quote]Why not? Couldn’t you generate 25000 simulated input TTrees?
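For example, a small generator macro could look something like this (a sketch only; the file, tree, and branch names are placeholders, and you would of course start with much smaller values of ntrees/nentries):

```cpp
void makeTestTrees(Int_t ntrees = 25000, Int_t nentries = 1200000)
{
   TFile *file = TFile::Open("simulated.root", "RECREATE");
   TRandom3 rnd;
   Double_t inten;

   for (Int_t k=0; k<ntrees; k++) {
      TTree *tree = new TTree(Form("tree%d", k), "simulated data");
      tree->Branch("Intensity", &inten, "Intensity/D");
      for (Int_t i=0; i<nentries; i++) {
         inten = rnd.Gaus(100, 10);   // any test distribution will do
         tree->Fill();
      }//for_i
      tree->Write();
      delete tree;                    // keep only one tree in memory at a time
   }//for_k

   delete file;
}//makeTestTrees
```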

[quote]On a typical computer with 2 GB RAM this means that loading the next 25000 baskets will crash the program, is this correct?
[/quote]Not necessarily; it depends on the amount of swap space and whether the ‘old’ baskets are retained or not.

[quote]There is also a parameter “fMaxVirtualSize”. Would tree->SetMaxVirtualSize(32768) thus result in the same effect as above?
Would calling tree->SetMaxVirtualSize(32768) be sufficient? [/quote]Yes, this should ensure that there is only one 32k basket per branch per tree in memory.

Cheers,
Philippe.

Hi Christian,

On the input side, you should be able to make sure that only one TTree at a time has its baskets in memory (i.e. by using DropBaskets after reading the information) without significant performance loss.

On the output side, even though you also could (by calling AutoSave(“FlushBaskets”) after each and every Fill), this would come at the cost of a significant performance degradation (there would be only one event per basket, significant file fragmentation, and an increase in the size of the metadata in the TTree). Note that during writing, the branches only keep the last basket in memory. Since you essentially need to write all 25000 output TTrees simultaneously, your only solution is to decrease the size of the buffer, for example turning it down from 32k to 4k, hence reducing the memory needs of the output trees to about 100 MB (if there is only one branch per TTree). (You can also turn the basket size down for the input trees, gaining the space there even without calls to DropBaskets.)
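Concretely, in the Branch() call of your FillDataTree this would for instance mean:

```cpp
// 4k instead of the default 32k buffer per output branch:
// 4096 bytes x 25000 trees is about 100 MB held in memory while writing
newtree->Branch("DataBranch", "XGCCell", &newcell, 4096, 99);
```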

Cheers,
Philippe.

Dear Philippe,

Thank you for your comments which are very helpful.

To be sure that I understand your comments correctly please let me summarize:

  • when reading trees calling tree->SetMaxVirtualSize(32768) will make sure that only one 32k basket is kept in memory.

  • when reading trees I can call tree->DropBaskets() w/o significant performance loss.

  • when writing trees the branches only keep the last basket in memory, but:
    is this already the case when calling tree->Fill() or does this only happen when calling tree->Write()?
    (when looping over 1 million entries to fill trees I cannot call tree->Write() within the loop)

  • decreasing the buffer size for the output trees is a very good idea.
    (I need to find a formula to adjust the buffer size depending on the number of trees)

Best regards
Christian

[quote]- when reading trees calling tree->SetMaxVirtualSize(32768) will make sure that only one 32k basket is kept in memory.
[/quote]Not quite; the mechanism steered by SetMaxVirtualSize always leaves one basket per branch (if one is already loaded).

[quote]- when reading trees I can call tree->DropBaskets() w/o significant performance loss. [/quote]That is true as long as you call it at the end of reading each TTree (if you do it after each GetEntry, it will of course cost you a lot of re-reading of the file).

[quote]is this already the case when calling tree->Fill()[/quote]Yes.

[quote]I cannot call tree->Write() within the loop[/quote]Indeed, this would be very bad for performance.

Cheers,
Philippe.

Dear Philippe,

Thank you for this clarification.

Since you confirm that when calling tree->Fill() the branches only keep the last basket in memory,
please allow me to suggest updating the documentation of TTree::Fill() in the class reference, since I was
missing this information.

Best regards
Christian