I’m using ROOT 5.28 with PROOF-Lite (but I have similar problems with PROOF). I’m running a trivial selector with ~300 input files in a single TChain, for a total of ~40 GB. As you can see in the figure, the code is deliberately trivial: in Process I put return kTrue at the very beginning, so the code is really doing nothing. So why does it use all the memory (~6 GB on a 4-core machine)?
I also tried putting the return kTrue in the correct position, at the end of the Process method, but the behaviour is the same.
I had a quick look at your file. Just opening it in a simple ROOT session and calling GetEntries() on the tree ends up using 1.5 GB of memory. However, when the tree is deleted and the file closed (which is what PROOF does during validation and processing), the memory is correctly released.
I think the problem is somewhat related to the tree. I will continue to investigate and let you know.
The issue stems from the way the file was written, as can be seen in this listing:
root [0] TFile *file0 = TFile::Open("output_skimmed_00370.root");
root [1] file0->Map();
20101221/213839 At:100 N=146 TFile
20101221/213839 At:246 N=289 TH1F CX = 2.07
20101221/213915 At:535 N=330436238 TTree CX = 3.20
20101221/213921 At:330436773 N=6348 StreamerInfo CX = 3.42
20101221/213921 At:330443121 N=166 KeysList
20101221/213921 At:330443287 N=69 FreeSegments
20101221/213921 At:330443356 N=1 END
What is missing is the information about baskets. In other words, in this file all the baskets have been stored with the TTree object. As a consequence, whenever you load the TTree you have to load all the data, even just to get the number of entries (hence the huge peaks in memory use).
How did you write those files? With older versions of ROOT, you could create this type of file by creating a TTree ‘in memory’ and then opening a new file and storing the TTree in it. A priori, in v5.28 this is no longer possible.
Once we manage to avoid this unusual type of file layout, the memory peak will disappear.
Cheers,
Philippe.
PS. I was able to re-create the file with the usual layout by simply doing:
hadd -f output_skimmed_00370_fixed.root output_skimmed_00370.root
Target file: output_skimmed_00370_fixed.root
Source file 1: output_skimmed_00370.root
Target path: output_skimmed_00370_fixed.root:/
output_skimmed_00370.root tree:PAUReco entries=1234567890
and getting:
20110202/223316 At:100 N=158 TFile
20110202/223316 At:258 N=289 TH1F CX = 2.07
20110202/223331 At:547 N=267 TBasket CX = 119.84
20110202/223331 At:814 N=25330 TBasket CX = 1.26
20110202/223331 At:26144 N=340 TBasket CX = 94.11
....
20110202/223350 At:343493364 N=17699 TBasket CX = 1.73
20110202/223350 At:343511063 N=368238 TTree CX = 3.76
20110202/223352 At:343879301 N=172 KeysList
20110202/223352 At:343879473 N=6658 StreamerInfo CX = 3.42
20110202/223352 At:343886131 N=75 FreeSegments
20110202/223352 At:343886206 N=1 END
where I estimate the memory peak should now be less than 2 MB …
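The difference between the two Map() listings can also be checked mechanically. The following is only a minimal sketch, assuming the listing has been captured as text in the format shown above (timestamp, At:offset, N=bytes, class name). The names MapSummary, summarizeMap, kOriginalMap and kFixedMap are hypothetical helpers, not part of ROOT, and the fixed-file listing is truncated in the post ("...."), so the basket count covers only the lines shown.

```cpp
#include <cctype>
#include <sstream>
#include <string>

// Hypothetical summary of one TFile::Map() listing captured as text.
struct MapSummary {
    long long treeBytes = 0;  // largest N= value among TTree records
    int basketRecords = 0;    // number of standalone TBasket records
};

// Lines copied from the listing of the original file.
const char* kOriginalMap =
    "20101221/213839 At:100 N=146 TFile\n"
    "20101221/213839 At:246 N=289 TH1F CX = 2.07\n"
    "20101221/213915 At:535 N=330436238 TTree CX = 3.20\n"
    "20101221/213921 At:330436773 N=6348 StreamerInfo CX = 3.42\n"
    "20101221/213921 At:330443121 N=166 KeysList\n"
    "20101221/213921 At:330443287 N=69 FreeSegments\n"
    "20101221/213921 At:330443356 N=1 END\n";

// Lines copied from the listing of the hadd output (truncated in the post).
const char* kFixedMap =
    "20110202/223316 At:100 N=158 TFile\n"
    "20110202/223316 At:258 N=289 TH1F CX = 2.07\n"
    "20110202/223331 At:547 N=267 TBasket CX = 119.84\n"
    "20110202/223331 At:814 N=25330 TBasket CX = 1.26\n"
    "20110202/223331 At:26144 N=340 TBasket CX = 94.11\n"
    "....\n"
    "20110202/223350 At:343493364 N=17699 TBasket CX = 1.73\n"
    "20110202/223350 At:343511063 N=368238 TTree CX = 3.76\n"
    "20110202/223352 At:343879301 N=172 KeysList\n"
    "20110202/223352 At:343879473 N=6658 StreamerInfo CX = 3.42\n"
    "20110202/223352 At:343886131 N=75 FreeSegments\n"
    "20110202/223352 At:343886206 N=1 END\n";

// Scan the listing line by line and collect the TTree record size
// and the number of TBasket records that appear as separate entries.
MapSummary summarizeMap(const std::string& listing) {
    MapSummary summary;
    std::istringstream in(listing);
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::string stamp, at, n, cls;
        // Skip lines that do not have the four leading fields (e.g. "....").
        if (!(fields >> stamp >> at >> n >> cls)) continue;
        if (n.rfind("N=", 0) != 0 || n.size() < 3) continue;
        if (!std::isdigit(static_cast<unsigned char>(n[2]))) continue;
        const long long nbytes = std::stoll(n.substr(2));
        if (cls == "TBasket") {
            ++summary.basketRecords;
        } else if (cls == "TTree" && nbytes > summary.treeBytes) {
            summary.treeBytes = nbytes;
        }
    }
    return summary;
}
```

Calling summarizeMap(kOriginalMap) reports no TBasket records and a TTree record of about 330 MB, while summarizeMap(kFixedMap) reports separate TBasket records and a TTree record of a few hundred kilobytes. A listing with a very large TTree record and no TBasket records is the signature of the layout described above.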
I don’t know which version I used, because it was some time ago and the admin changed the version at some point; maybe 5.21. I created the file this way inside a PROOF session. During Process:
if (first_event)
{
    output_tree = fChain->CloneTree(0);
    fChain->GetTree()->CopyAddresses(output_tree);
}
if (pass_selection)
{
    output_tree->Fill();
}
With versions of ROOT older than 5.20 (so I am guessing that you were using 5.18), your code would indeed lead to that kind of ROOT file. If you use v5.28 to write the TTree part of your file, the problem should be gone.
Cheers,
Philippe.
PS. In
output_tree = fChain->CloneTree(0);
fChain->GetTree()->CopyAddresses(output_tree);
the second line is (or should be) redundant, as it is already executed inside CloneTree.