Hadd very slow when target file becomes large

Hi,

I am trying to merge 400 files, each around 7 MB, using hadd in ROOT version 5.34/07.

The first 20 files are done within 10 minutes, but then it slows down quickly: once the target file reaches around 100–200 MB, merging just 1 or 2 more files takes hours.

I open 3 files at a time and ran the job with 3 GB of memory; using 6 GB does not improve the situation.

Any help or information will be greatly appreciated.

Martin

The file structure of each file is like this:

  KEY: AttributeListLayout	Schema;1	
  KEY: TDirectoryFile	physicsMeta;1	physicsMeta
  KEY: TDirectoryFile	Lumi;1	Lumi
  KEY: TTree	CollectionTree;1	CollectionTree
  KEY: TTree	physics;1	physics

I am not sure whether the following warnings are related, but the merging always gets stuck at one of the directories, not at the trees:

Warning in <TClass::TClass>: no dictionary for class AttributeListLayout is available
Warning in <TClass::TClass>: no dictionary for class pair<string,string> is available

The compression level is set explicitly to match the source files:

hadd -f6 -n 3 targetfile sourcefile1 sourcefile2 sourcefile3 ...

Hi!

Try to merge in “chunks” of e.g. 10 files: 400 -> 40 -> 4 -> 1.
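Something like this plain shell sketch would do it (the input glob and the stage/output names are just placeholders, and the -f6 flag is carried over from your command):

  inputs=(input_*.root)                    # the 400 source files
  stage=0
  while (( ${#inputs[@]} > 1 )); do
    outputs=()
    for (( i = 0; i < ${#inputs[@]}; i += 10 )); do
      out="stage${stage}_$(( i / 10 )).root"
      hadd -f6 "$out" "${inputs[@]:i:10}"  # merge one chunk of 10
      outputs+=("$out")
    done
    inputs=("${outputs[@]}")               # next round merges the chunk outputs
    (( stage++ ))
  done
  mv "${inputs[0]}" merged.root

That way each hadd call starts from a fresh target and only ever merges 10 files at a time, instead of appending to one target that keeps growing.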

Best,

Sebastian

PS: rootpy’s hadd wrapper has an option for that, see

https://github.com/rootpy/rootpy/blob/master/scripts/root-hadd

Hi Sebastian,

Thanks for your reply!
That’s what I first thought of too.
I can merge 400 -> 40 easily, but when it comes to 40 -> 4, the program becomes slow once the files get large again.

Plus, those four files come out at double the size they should be, i.e.
7 MB per source file,
40 files of ~70 MB each,
but the 4 files end up at 1400 MB instead of ~700 MB each.

So, I didn’t manage to get to the 4->1 step.

Martin

Hi!

Can you make the files available?

Best,

Sebastian