Problem with Hadd of TTrees

Hi everyone,

First time here…

I’d much appreciate your help with a strange problem I encountered while trying to hadd together TTrees (42 files, one TTree per file, ~30 GB in total).

After 6 hours, it ends with this error:
```
[gsela@lxplus718 ~]$ $ROOTSYS/bin/hadd -f2 ./Final_Trees/JetFakesMc.root ./JetFakesMc/*.root
hadd Target file: ./Final_Trees/JetFakesMc.root
hadd compression setting for all output: 2
hadd Source file 1: ./JetFakesMc/mc16a_Diboson_lh_1.root
hadd Source file 2: ./JetFakesMc/mc16a_ggHtt_lh_1.root
hadd Source file 3: ./JetFakesMc/mc16a_ggHWW_lh_1.root
hadd Source file 4: ./JetFakesMc/mc16a_Top_lh_1.root
hadd Source file 5: ./JetFakesMc/mc16a_ttHtt_lh_1.root
hadd Source file 6: ./JetFakesMc/mc16a_VBFHtt_lh_1.root
hadd Source file 7: ./JetFakesMc/mc16a_VBFHWW_lh_1.root
hadd Source file 8: ./JetFakesMc/mc16a_WHtt_lh_1.root
hadd Source file 9: ./JetFakesMc/mc16a_W_Jets_lh_1.root
hadd Source file 10: ./JetFakesMc/mc16a_ZHtt_lh_1.root
hadd Source file 11: ./JetFakesMc/mc16a_ZllEWK_lh_1.root
hadd Source file 12: ./JetFakesMc/mc16a_ZllQCD_lh_1.root
hadd Source file 13: ./JetFakesMc/mc16a_ZttEWK_lh_1.root
hadd Source file 14: ./JetFakesMc/mc16a_ZttQCD_lh_1.root
hadd Source file 15: ./JetFakesMc/mc16d_Diboson_lh_1.root
hadd Source file 16: ./JetFakesMc/mc16d_ggHtt_lh_1.root
hadd Source file 17: ./JetFakesMc/mc16d_ggHWW_lh_1.root
hadd Source file 18: ./JetFakesMc/mc16d_Top_lh_1.root
hadd Source file 19: ./JetFakesMc/mc16d_ttHtt_lh_1.root
hadd Source file 20: ./JetFakesMc/mc16d_VBFHtt_lh_1.root
hadd Source file 21: ./JetFakesMc/mc16d_VBFHWW_lh_1.root
hadd Source file 22: ./JetFakesMc/mc16d_WHtt_lh_1.root
hadd Source file 23: ./JetFakesMc/mc16d_W_Jets_lh_1.root
hadd Source file 24: ./JetFakesMc/mc16d_ZHtt_lh_1.root
hadd Source file 25: ./JetFakesMc/mc16d_ZllEWK_lh_1.root
hadd Source file 26: ./JetFakesMc/mc16d_ZllQCD_lh_1.root
hadd Source file 27: ./JetFakesMc/mc16d_ZttEWK_lh_1.root
hadd Source file 28: ./JetFakesMc/mc16d_ZttQCD_lh_1.root
hadd Source file 29: ./JetFakesMc/mc16e_Diboson_lh_1.root
hadd Source file 30: ./JetFakesMc/mc16e_ggHtt_lh_1.root
hadd Source file 31: ./JetFakesMc/mc16e_ggHWW_lh_1.root
hadd Source file 32: ./JetFakesMc/mc16e_Top_lh_1.root
hadd Source file 33: ./JetFakesMc/mc16e_ttHtt_lh_1.root
hadd Source file 34: ./JetFakesMc/mc16e_VBFHtt_lh_1.root
hadd Source file 35: ./JetFakesMc/mc16e_VBFHWW_lh_1.root
hadd Source file 36: ./JetFakesMc/mc16e_WHtt_lh_1.root
hadd Source file 37: ./JetFakesMc/mc16e_W_Jets_lh_1.root
hadd Source file 38: ./JetFakesMc/mc16e_ZHtt_lh_1.root
hadd Source file 39: ./JetFakesMc/mc16e_ZllEWK_lh_1.root
hadd Source file 40: ./JetFakesMc/mc16e_ZllQCD_lh_1.root
hadd Source file 41: ./JetFakesMc/mc16e_ZttEWK_lh_1.root
hadd Source file 42: ./JetFakesMc/mc16e_ZttQCD_lh_1.root
hadd Sources and Target have different compression levels
hadd merging will be slower
hadd Target path: ./Final_Trees/JetFakesMc.root:/
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
Error in TBufferFile::WriteByteCount: bytecount too large (more than 1073741822)
```

Note that:

  1. I’m using -f2 as some TTrees have extra branches that the others don’t have. I don’t need those; only the shared branches should be saved in the output file.

  2. The trees are created by cloning and skimming a larger tree, in the same way as described in this example: ROOT: tutorials/tree/copytree3.C File Reference
    When I skim using cuts that create small trees (300 MB, same file structure, just a different number of events saved), everything works fine. But when the trees are large (30 GB in this case) there seems to be a problem.

  3. I don’t know if that’s normal, but some trees are saved with a large cycle number (for example, 43). I’ve tried saving using
    ```
    newfile.Write("",TObject::kWriteDelete);
    ```
    but that had no effect.

Any thoughts or recommendations? Example files can be found in:
/afs/cern.ch/user/g/gsela/public

Thanks!


ROOT Version: 6.18/04
Platform: lxplus

Try the “latest release”.

Hi @Galsel ,
and welcome to the ROOT forum!
I would ask @pcanal to confirm/deny what I’m saying but I believe the error is due to a limitation in the size of objects that can be written to ROOT files. Indeed (part of?) this limitation has been recently lifted so trying with ROOT v6.24.06 as @Wile_E_Coyote suggests or even a nightly build might help.

Cheers,
Enrico

Thanks guys,

I tried release 6.24/06 but unfortunately the problem persists (I’m not sure what the “nightly build” is or how to use it).

Trying to hadd smaller groups, I see the problem is indeed with 3 large samples (~20GB total). Is there a problem with that size? I thought hadd should work up to 100GB, if not more.

The limitation is not per se in the size of the file (it is practically limitless, aka 64-bit addressing). However, each individual I/O operation needs to be less than 1 GB. When merging a large number of trees, you might reach the point where there are too many baskets (a number that is closely related to the number of entries), such that the size of the TTree object itself is larger than 1 GB. Roughly 50 million baskets, at about 20 bytes per basket, already bring the TTree object to about 1 GB.
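To make the arithmetic in this reply concrete, here is a minimal sketch in plain C++ (no ROOT needed). The limit is the number printed in the `TBufferFile::WriteByteCount` error above; the 20 bytes per basket is the round figure quoted in the reply and is treated here as an assumption, and the function names are purely illustrative, not part of ROOT.

```cpp
#include <cstdint>

// Limit printed in the TBufferFile::WriteByteCount error message above.
constexpr std::int64_t kMaxByteCount = 1073741822;

// Assumption: about 20 bytes of bookkeeping per basket inside the TTree
// object, as quoted in the reply. The real figure may differ by ROOT version.
constexpr std::int64_t kBytesPerBasket = 20;

// Rough size of the per-basket bookkeeping for a given total basket count.
constexpr std::int64_t basketMetadataBytes(std::int64_t nBaskets) {
    return nBaskets * kBytesPerBasket;
}

// True when this estimate alone already exceeds the single-write limit.
constexpr bool exceedsWriteLimit(std::int64_t nBaskets) {
    return basketMetadataBytes(nBaskets) > kMaxByteCount;
}

// Largest basket count that still fits under the limit (same assumption).
constexpr std::int64_t maxBaskets() {
    return kMaxByteCount / kBytesPerBasket;
}
```

Under these assumptions `basketMetadataBytes(50000000)` is 1,000,000,000 bytes and `maxBaskets()` is 53,687,091, so the threshold sits in the region of the "50 million baskets" mentioned in the reply. This only estimates the basket bookkeeping; the actual TTree object holds more than that, so the practical threshold may be lower.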

We are working on lifting this limitation, but it will take a few more weeks/months before it is ready to use.

Thanks!

Hi @pcanal , just to clarify…

Is this a limitation of the TTrees or hadd?

Meaning, can I work around the problem by writing code similar to hadd, i.e. something that opens the trees by brute force and saves the information to a new tree in an event loop?
