Hi, I’m trying to merge some .root files, but hadd crashes. I first took it for a seg fault, but it is actually an abort triggered by malloc. I tried with ROOT 5.34 and 6.01, both on OS X 10.9 and compiled with clang++ from Xcode. All dependencies except GSL are provided by MacPorts.
Here is the command and its output up to the crash:
hadd -f0 -O run00500_h.root run00094_h.root run00121_h.root run00134_h.root run00153_h.root run00160_h.root run00161_h.root run00163_h.root run00164_h.root
hadd Target file: run00500_h.root
hadd Source file 1: run00094_h.root
hadd Source file 2: run00121_h.root
hadd Source file 3: run00134_h.root
hadd Source file 4: run00153_h.root
hadd Source file 5: run00160_h.root
hadd Source file 6: run00161_h.root
hadd Source file 7: run00163_h.root
hadd Source file 8: run00164_h.root
hadd Target path: run00500_h.root:/
hadd(43978,0x7fff7cf9b310) malloc: *** error for object 0x7f8a0e324f38: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Abort trap: 6
I ran this in my debugger (lldb) and got this additional information:
hadd(44088,0x7fff7cf9b310) malloc: *** error for object 0x101760420: incorrect checksum for freed object - object was probably modified after being freed.
*** set a breakpoint in malloc_error_break to debug
Process 44088 stopped
* thread #1: tid = 0xa1df04, 0x00007fff9024a866 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
frame #0: 0x00007fff9024a866 libsystem_kernel.dylib`__pthread_kill + 10
libsystem_kernel.dylib`__pthread_kill + 10:
-> 0x7fff9024a866: jae 0x7fff9024a870 ; __pthread_kill + 20
0x7fff9024a868: movq %rax, %rdi
0x7fff9024a86b: jmpq 0x7fff90247175 ; cerror_nocancel
0x7fff9024a870: ret
(lldb) bt
* thread #1: tid = 0xa1df04, 0x00007fff9024a866 libsystem_kernel.dylib`__pthread_kill + 10, queue = 'com.apple.main-thread', stop reason = signal SIGABRT
* frame #0: 0x00007fff9024a866 libsystem_kernel.dylib`__pthread_kill + 10
frame #1: 0x00007fff9515735c libsystem_pthread.dylib`pthread_kill + 92
frame #2: 0x00007fff92c94b1a libsystem_c.dylib`abort + 125
frame #3: 0x00007fff94223690 libsystem_malloc.dylib`szone_error + 587
frame #4: 0x00007fff94221595 libsystem_malloc.dylib`szone_free_definite_size + 3011
frame #5: 0x00000001008429f2 libTree.so`TLeafI::~TLeafI() + 50
frame #6: 0x0000000100dbfc88 libCore.so`TObjArray::Delete(char const*) + 136
frame #7: 0x00000001008057a9 libTree.so`TBranch::~TBranch() + 329
frame #8: 0x000000010080560e libTree.so`TBranch::~TBranch() + 14
frame #9: 0x0000000100dbfc88 libCore.so`TObjArray::Delete(char const*) + 136
frame #10: 0x000000010084dd81 libTree.so`TTree::~TTree() + 385
frame #11: 0x000000010084db4e libTree.so`TTree::~TTree() + 14
frame #12: 0x000000010087fb7e libTree.so`ROOT::delete_TTree(void*) + 46
frame #13: 0x0000000100dd67f6 libCore.so`TClass::Destructor(void*, bool) + 70
frame #14: 0x000000010003600c libRIO.so`TFileMerger::MergeRecursive(TDirectory*, TList*, int) + 7276
frame #15: 0x0000000100036495 libRIO.so`TFileMerger::PartialMerge(int) + 533
frame #16: 0x00000001000026ac hadd`main + 5308
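Unfortunately, it seems hadd isn’t compiled with -g, so the specific lines of code can’t be seen. If anyone can suggest how to debug this further, please let me know. Is it possible to recompile only hadd, leaving the rest of ROOT as-is? Then I could build it with -g, although the frames nearest the crash are in libTree.so, libCore.so and libRIO.so, so those libraries would presumably need debug symbols too.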
Incidentally, the output file has grown to 16 GB by the time of the crash, with 593253 entries in the TTree. The input files’ TTrees have a combined 593253 entries, so it seems the TTrees are being merged completely, but maybe closing the file is the problem? Or maybe the other objects in the TFiles are the problem. Can I tell hadd to merge only the TTrees and ignore the other objects? The other objects are TObjArrays and TH1Fs.
Jean-François