Problems with hadd

Hello,
I have on the order of 400 ROOT files with thousands of histograms in each; I’d like to merge them all together… the size of each file is 100 MB. If I run on a subset of files (12) everything is OK… but when I run on the full list I get the following error at the 40th file:

Source file 39: myQCD_170_230_8.root
Source file 40: myQCD_20_30_10.root
terminate called after throwing an instance of 'std::bad_alloc'
what(): St9bad_alloc
Abort

I’m quite sure it’s a memory problem, probably stemming from the fact that the TList needs to have all the files open at the same time… Is there another option that avoids the memory problem?

Thanks

Attilio

You are correct.
The attached macro from the STAR offline core library allows merging an unlimited number of histograms from an unlimited number of files with no memory/performance penalty.
MergeHistogramFile.C (2.78 KB)
MergeHistogram.tar.gz (3.16 KB)

Valeri,

I looked at your macro and associated file. It is a very naive macro with a lot of limitations (e.g. it cannot merge files with subdirectories), and I do not see how it could solve the original problem.

Now back to the problem. Could you give some more details?
-version number of ROOT
-do your files contain the same histograms, or different histograms inside each file?
-do you have large TH2 or TH3 histograms?
-do you have only histograms, or also trees (e.g. memory-resident Trees)?
-could you post one file somewhere so that we can understand what you have inside?

Rene

[quote=“Rene”]I looked at your macro and associated file. It is a very naive macro with a lot of limitations (e.g. it cannot merge files with subdirectories)[/quote]Yes, you are correct.

It is a simple short macro to highlight the main idea.
The real one from STAR production does handle all the cases you mentioned.
However, I thought it would be useful to show the simplest one.

[quote=“Rene”]I do not see how it could solve the original problem.[/quote]It does not matter how many files the user wants to merge: the macro keeps only one ROOT file open at a time.
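In essence the idea is the following (a stripped-down sketch, not the STAR macro itself; it assumes a flat file of histograms, one key cycle per histogram, and omits error handling):

   #include "TFile.h"
   #include "TH1.h"
   #include "TKey.h"
   #include "TList.h"

   void MergeOneAtATime(int nFiles, const char **fileNames, const char *outName)
   {
      TFile *out = TFile::Open(outName, "RECREATE");
      for (int i = 0; i < nFiles; ++i) {
         TFile *in = TFile::Open(fileNames[i]);   // only this one source file is open
         TIter next(in->GetListOfKeys());
         while (TKey *key = (TKey *)next()) {
            TH1 *h = (TH1 *)key->ReadObj();       // read one histogram
            TH1 *dst = (TH1 *)out->FindObject(h->GetName());
            if (dst) { dst->Add(h); delete h; }   // accumulate, free the source copy
            else     h->SetDirectory(out);        // first occurrence: adopt it
         }
         delete in;                               // close before opening the next file
      }
      out->Write();
      delete out;
   }

At any moment only one source file is open, so the memory stays bounded by the histogram content of a single file plus the accumulated result.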

[quote=“Rene”]Now back to the problem. Could you give some more details?
-version number of ROOT
[/quote]The STAR production ROOT version is 5.12, but it works with any ROOT version I have tested (including the “trunk” version).

[quote=“Rene”]-do your files contain the same histograms, or different histograms inside each file?[/quote]They can be different.
Normally (in real STAR life) they are “almost” the same, though. [quote=“Rene”]-do you have large TH2 or TH3 histograms?[/quote]Yes, we do.

The histograms are memory-resident objects, and normally all of them together fit in memory at the time they are filled; nobody has reported a problem to me yet. In theory I can imagine the hypothetical case where one wants to merge two files with completely different sets of huge histograms that do not fit in memory together, or an attempt to merge two identical 2 GB histogram objects. However, I have never seen this in real life. Anyway, these examples have nothing to do with the problem the macro solves: merging many small histograms from hundreds of ROOT files at once.

[quote=“Rene”]-do you have only histograms, or also trees (e.g. memory-resident Trees)?[/quote]We are speaking about merging ROOT files; I do not see where a “memory-resident Tree” would come from in this use case.
So, to answer your question: none of my users has asked for help with merging a “memory-resident” Tree yet.

[quote=“Rene”]-could you post one file somewhere so that we can understand what you have inside?[/quote]As you mentioned, the attached macro can merge simple “flat” ROOT files, so what would you like me to post?
The real STAR files would require the real, more complex macro; it can be posted too if needed. However, the vast majority of our files are quite simple and can be merged by the macro in question. The problem usually lies in the number of files to merge: users count them by the 1000’s and want to merge all of them in one shot. Maybe that is not the best approach, but it is not for me to decide what the end user sees as fitting his/her needs; they come with a concrete technical issue seeking assistance.

Maybe you could give me the files you could not handle yet, so I can check whether they can be processed by the STAR macro.

Valeri,

My questions were more for as5365 than for you. I do not see how your micro version of hadd could solve his problems.

Generally speaking, I don’t think it is a good idea to post to this forum solutions that are much inferior to existing tools and will only create additional problems for the end user.

I am still waiting for answers to my questions from as5365.

Rene

Hi,
the problem is a general one: in this particular case I have only TH1 histograms (but TH2, TProfile and trees could also appear in the future…). At the moment there are 320,000 histos in each file. All files contain the same set of histograms (it’s just the same job split in a grid environment… I need to process millions of events, and the typical situation is to have on the order of 100 or more of these files).

An example of one files is here:
cms.pg.infn.it/santocchia/files/ … 70_13.root

Because of this limitation I tried to change the number of events in each job:
5,000,000 events in total
originally 20,000 events per job --> 250 jobs
now 200,000 events per job --> 25 jobs

Even 25 files are probably too many (it depends on the memory of your machine), and I need to go through 2 passes (1-12 + 13-25 --> first pass, and then a merge of the 2 resulting files).

Valeri: I tried to implement your solution but I ran into several problems: first of all, TDirIter.h didn’t work properly, and when I tried to compile (after a little editing) I also got an error:

g++ -O -m32 MergeHistogramFile.o -L/afs/cern.ch/sw/lcg/external/root/5.17.04/slc4_ia32_gcc34/root/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -pthread -lm -ldl -rdynamic -lEG
-o MergeHistogramFile
MergeHistogramFile.o(.text+0x227): In function `main': /localscratch/s/santocch/histoDir/tmp/MergeHistogramFile.C:68: undefined reference to `TFileIter::TFileIter(char const*, char const*, char const*, int, int)'
MergeHistogramFile.o(.text+0x36d): In function `main': /afs/cern.ch/sw/lcg/external/root/5.17.04/slc4_ia32_gcc34/root/include/TFileIter.h:208: undefined reference to `TFileIter::~TFileIter()'
MergeHistogramFile.o(.text+0x380):/afs/cern.ch/sw/lcg/external/root/5.17.04/slc4_ia32_gcc34/root/include/TFileIter.h:208: undefined reference to `TFileIter::~TFileIter()'
collect2: ld returned 1 exit status
make: *** [MergeHistogramFile] Error 1

I was able to run it interactively, but it was way too slow compared to hadd (when using a reduced number of files).

Last: I tried it with different versions of ROOT and got the same result. Just to settle on one version, I finally used 5.17.04.

Thanks for helping anyhow…
Ciao

Attilio

[quote=“as5365”]Hi,

g++ -O -m32 MergeHistogramFile.o -L/afs/cern.ch/sw/lcg/external/root/5.17.04/slc4_ia32_gcc34/root/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -pthread -lm -ldl -rdynamic -lEG
-o MergeHistogramFile
MergeHistogramFile.o(.text+0x227): In function `main': /localscratch/s/santocch/histoDir/tmp/MergeHistogramFile.C:68: undefined reference to `TFileIter::TFileIter(char const*, char const*, char const*, int, int)'
Attilio[/quote]You are missing the “-lTable” library, to which the class in question belongs (see $ROOTSYS/table/inc).
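That is, your link line becomes (the only change is the added -lTable):

   g++ -O -m32 MergeHistogramFile.o -L/afs/cern.ch/sw/lcg/external/root/5.17.04/slc4_ia32_gcc34/root/lib -lCore -lCint -lRIO -lNet -lHist -lGraf -lGraf3d -lGpad -lTree -lRint -lPostscript -lMatrix -lPhysics -lTable -pthread -lm -ldl -rdynamic -lEG -o MergeHistogramFile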
I am downloading your file to test the macro.

I looked at your file containing 320,000 histograms in one single directory.
When merging files (in particular histogram files), it is essential to have in memory a copy of all the objects of one file. It would be a performance disaster to keep only one object at a time, reading the corresponding histogram from another file, storing the current status to a temporary file, and repeating this operation for all histograms.

I believe that storing 320,000 histograms is definitely the wrong solution. You should use a TTree instead. It has many advantages:
-much less space in memory (a few megabytes instead of hundreds of megabytes)
-much faster (direct access to Tree entries instead of a hash table of histograms)
-a more compact file (my guess: at least a factor of 2)
-no problem merging as many files as you like

or at least make one Tree per histogram type.
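As an illustration only (the branch layout below is my guess at what you need, one entry per matched jet; the names are not prescriptive):

   #include "TFile.h"
   #include "TTree.h"

   void BookResponseTree()
   {
      TFile f("response.root", "RECREATE");
      TTree t("response", "jet response, one entry per matched jet");
      Int_t   algo, flav;   // jet algorithm index, parton flavour
      Float_t eta, et, r;   // jet eta, jet Et, response value
      t.Branch("algo", &algo, "algo/I");
      t.Branch("flav", &flav, "flav/I");
      t.Branch("eta",  &eta,  "eta/F");
      t.Branch("et",   &et,   "et/F");
      t.Branch("r",    &r,    "r/F");
      // in the event loop: set the five variables for each matched jet, then t.Fill()
      t.Write();
   }

Merging such files simply concatenates the entries, which is why it scales so much better than looking up 320,000 named histograms.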

Rene

[quote=“as5365”]Hi,
the problem is a general one: in this particular case I have only TH1 histograms (but TH2, TProfile and trees could also appear in the future…). At the moment there are 320,000 histos in each file.[/quote]Hmm. Assuming you can merge them, what do you want to do with 320,000 different distributions (histograms) :imp: ? What would your next step have been?
Can you show us how you managed to write them out in the first place? I think you should look for a solution at that stage.

[quote=“fine”]What would your next step have been?
Can you show us how you managed to write them out in the first place? I think you should look for a solution at that stage.[/quote]
I have printed the number of entries for the histograms in your file and discovered that all of them are almost EMPTY :unamused: !!!

All of the histos are booked using the following code:

   // Book 100-bin histograms for every (algo, eta bin, Et bin) combination,
   // for each of the four flavours (B, C, Q, G): 8*50*200*4 = 320,000 in total
   for (Int_t i0=0; i0<dimNalgo ; i0++)
     for (Int_t i1=0; i1<nBinEta ; i1++)
       for (Int_t i2=0; i2<nBinEt ; i2++) {
         hDist_B[i0][i1][i2]= new TH1F(Form("hDist_B_%d_%d_%d",i0,i1,i2),"",100,0,2 );
         hDist_C[i0][i1][i2]= new TH1F(Form("hDist_C_%d_%d_%d",i0,i1,i2),"",100,0,2 );
         hDist_Q[i0][i1][i2]= new TH1F(Form("hDist_Q_%d_%d_%d",i0,i1,i2),"",100,0,2 );
         hDist_G[i0][i1][i2]= new TH1F(Form("hDist_G_%d_%d_%d",i0,i1,i2),"",100,0,2 );
       }

I analyze, let’s say, 5M events. For each event I have 2 jets, and for each jet I get one entry which fills one (and only one) of the 320,000 histos, depending on the jet algorithm, et, eta, and flavour of the jet.
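In simplified code the fill step looks roughly like this (the helper, the bin arithmetic, and the flavour encoding are only a sketch of my real code; the ranges follow the mapping I use, |eta| in [0,5), Et in [0,1000) GeV):

   #include "TMath.h"
   #include "TH1F.h"

   void FillJet(Int_t i0, Double_t etaJet, Double_t etJet, Int_t flav, Double_t etParton)
   {
      Int_t i1 = Int_t(TMath::Abs(etaJet) / 5.0 * nBinEta);   // eta bin, 0..nBinEta-1
      Int_t i2 = Int_t(etJet / 1000.0 * nBinEt);              // Et bin, 0..nBinEt-1
      if (i1 >= nBinEta || i2 >= nBinEt) return;              // outside the mapped space
      TH1F *h = 0;
      switch (flav) {                                         // pick the flavour array
         case 0: h = hDist_B[i0][i1][i2]; break;
         case 1: h = hDist_C[i0][i1][i2]; break;
         case 2: h = hDist_Q[i0][i1][i2]; break;
         case 3: h = hDist_G[i0][i1][i2]; break;
      }
      if (h) h->Fill(etJet / etParton);                       // the ETjet/ETparton response
   }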

It’s not clear to me how to implement the same functionality in a tree, especially because I need to fit all of these histos with a gaussian function. I guess I can try to write a tree which saves the same info… Just to be sure I understood correctly:

  1. I save a tree with a structure like this:
    myTree1[nAlgo][nEta][nEt][nFlav][nBin] where:
    nAlgo = 0 --> 7
    nEta = 0 --> 50
    nEt = 0 --> 200
    nFlav = 0 --> 3 (8*50*200*4 = 320,000, the famous number of histos)
    nBin = 0 --> 100 (100 bins per histo)
    and I increment by 1 each time I get an entry in that particular bin.
    This structure mimics the histos exactly.
    I have exactly 32M entries in the tree, and merging only increases the values saved in the tree; the total number of entries does not change.
  2. This structure in a tree saves space (I really don’t care).
  3. This structure allows a much faster merge of many files (this is really important).
  4. After the merge I can create the famous 320,000 histos from the tree and fit each one with my gaussian.
  5. Plot the result and I’m happy :smiley: !

Is this the best way to proceed? If you confirm it, I can start right now to build the tree and test this solution…

But there is another possibility:

I build the same tree, but the last dimension holds the value itself:
myTree2[nAlgo][nEta][nEt][nFlav] = theVariableIuseToFillHisto
This option gives me 2 entries per event (for 5M events -> 10M entries, since I get 2 entries, i.e. 2 jets, from each analyzed event). This number grows when I merge all the files and depends on the number of events I’m analyzing… It also means that the final file can be really big (it depends on the number of entries).

Then I scan the full tree, build the histos, fit, and that’s all.
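In rough code, the read-back for option 2 would be something like this (branch names are only illustrative; the booking of the 320,000 histos stays the same as before):

   #include "TFile.h"
   #include "TTree.h"
   #include "TH1F.h"

   void FitFromTree()
   {
      TFile f("merged.root");
      TTree *t = (TTree *)f.Get("myTree2");
      Int_t algo, etaBin, etBin, flav;
      Float_t val;                        // theVariableIuseToFillHisto
      t->SetBranchAddress("algo",   &algo);
      t->SetBranchAddress("etaBin", &etaBin);
      t->SetBranchAddress("etBin",  &etBin);
      t->SetBranchAddress("flav",   &flav);
      t->SetBranchAddress("val",    &val);
      Long64_t n = t->GetEntries();
      for (Long64_t i = 0; i < n; ++i) {
         t->GetEntry(i);
         // look up the (algo, etaBin, etBin, flav) histogram and Fill(val)
      }
      // then for each filled histogram h:
      //    h->Fit("gaus");
      //    Double_t mean = h->GetFunction("gaus")->GetParameter(1);
   }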

I’m a little bit confused…

Thanks

Attilio

For Valeri:

this is only one of many files which fill each of the bins. The use case is the following:
I need to calculate the parton calibration for
1) different jet algos (8 in this case: iterative cone 0.5, 0.7, kt, midpoint…)
2) different flavours (b, c, q, g)
mapping the eta (50 bins from 0 to 5) x et (200 bins from 0 to 1000 GeV) space.
For each jet I match it with the originating parton (if DeltaR < 0.15, for example) and plot the ETjet/ETparton distribution, fit it with a gaussian, get the mean, and plot the response function… Have a look here at something already done with this procedure:
CMS NOTE2006_059.pdf
cms.cern.ch/iCMS/jsp/openfile.js … 06_059.pdf

The problem is that now I’d like to automate most of the work, because of a higher number of jet algos, a more in-depth analysis of flavour effects (in the note I just studied b versus all the others… now I’d like to see the differences between b, c, g, and light q), and also different implementations of fragmentation models…

Hence the high number of histos… And you see few filled histos in the previous file because there were only 10,000 events in it… I need millions to map the full space…

I hope to have given you an idea of my problems…

Ciao

Attilio

What I see:

  1. The macro does merge the files with no crash.
  2. It is slow.

To understand why, one has to find the bottleneck.
Let’s estimate the time the code spends finding the histogram to which the current one has to be added. To do that, it has to search the 320,000-entry list for the counterpart. I added some benchmarking and found that this piece of code:

   if ( (dstHistogram = (TH1 *)outFile->FindObject(h1->GetName()))) {
      // Accumulate the histogram
      dstHistogram->Add(h1);
      delete h1; // Optional, to reduce the memory consumption
   }

requires about 3e-4 sec per histogram. This means that to merge 400 files one needs ((3.2e+5) * (3e-4) * (4e+2))/3.6e+3 = 10 CPU hours just to put all your histograms together. This time does not include (yet) the time to read (I/O), uncompress, and de-serialize the objects, so you cannot go faster than that; the real job should be at least twice as slow. Does the magnitude of these numbers match your observation? My machine is a 2.8 GHz Pentium IV and I used NON-optimized ROOT code; I do not think the compilation of the macro itself contributes much to the overall performance.

What does that mean? It means one should try to reduce the size of the list. Can you split your 320,000 histograms across several files and merge them separately? This way you can gain performance: the search time is significantly reduced (the lookup time grows with the list size), and you can deploy several CPUs at once. In any case, you should investigate the possible bottlenecks carefully (with that macro, for example) if you do want to move on with your approach. I’ll check what can be done to speed the process up. (Feel free to contact me in person: fine@bnl.gov.)

I have optimized hadd in the SVN trunk for the case of files with so many objects.
Using your file, I made 4 copies. Merging the 4 files takes 6 minutes on my Linux box.
My recommendations, if you want to go on with so many histograms, are:
-change TH1F to TH1S (or even TH1C if possible)
-Merge 4 files (or 10 if you have enough RAM)
-Merge the merged files

i.e. merging 400 files with 320,000 hists per file can be done in 6*100 minutes + 6*25 minutes + 6*6 minutes ≈ 15 hours.
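Each merge step can be done with hadd itself; alternatively, if your ROOT version provides the TFileMerger class, the same step can be scripted (a sketch only; the file names are illustrative):

   #include "TFileMerger.h"

   void MergeGroup()
   {
      TFileMerger m;
      m.OutputFile("group_0.root");   // result of this group
      m.AddFile("myQCD_0.root");      // the 4 members of the group
      m.AddFile("myQCD_1.root");
      m.AddFile("myQCD_2.root");
      m.AddFile("myQCD_3.root");
      m.Merge();                      // repeat per group, then merge the group files
   }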

The standard hadd can merge files containing histograms and trees; the list of histograms or Trees may be different in each file.

Rene

Hi all,
thanks for helping… I understood that a key point for me is to automate the whole procedure as much as possible…
I will do a few tests to see what the best approach is… Just to let you know: I’m now writing a tree and processing all the entries (more than 10M) to produce the histograms… Then I will try the new version of hadd and also what Fine suggested… Give me a few days (I think I now have all I need) and I will report my experience and the best choice for me.
Ciao and thanks again!

Attilio

Hi all,
I spent some time trying to understand the best solution for my problem, and the answer is definitely to use trees. I got some improvement with the new hadd version, but it is still way too slow compared to trees. In fact, what I did was write the tree (less than 1 GB for 5 floats times 100M entries), then read the full tree and write out the full set of histograms (the same ones I would get by merging 400 files). I was able to have the final histograms (8 files… one per jet algo) in less than one hour…

So thanks for helping and for the good suggestions.
Ciao

Attilio