I hope this is not an already asked question and this is the right place to post.
I have a very large number of files (~1800) containing histograms. Each file is less than 100 KB, and I have to merge them so they can be further processed by an ATLAS tool.
My approach was to run something like this:
hadd out.root ./root-files/*root
The stdout output looked normal: it counted the number of input files and, after a very long time (more than an hour), finally produced the out.root file.
When I ran the tool over the file, I got a segmentation fault due to a corrupted header in the file.
I then tried creating several intermediate output files, each one a merge of 300 of the original files.
Finally I created the final output file from the intermediate ones, and that worked smoothly.
My question is:
Is there a limit on the number of files that can be merged? Is this a known issue? If so, would it be possible to enforce a limit on the number of input files, or to introduce a sanity check on the final output?
Thanks for your reply. I think the issue is quite different.
I am using ROOT 6.04, and the error I get when I try to access one of the histograms contained in the "corrupted" file is:
Error R__unzip_header: error in header
Was this problem ever solved? I have the same issue when using hadd on more than ~1000 files (each file is between 200 and 400 KB). Specifically, when I try to access any histogram in the corrupted output, I get this error:
Error R__unzip_header: error in header.
My workaround right now is: 1) divide the file set into a few subsets, each with fewer than 1000 files; 2) hadd each subset separately; 3) hadd the subset outputs together. This works, but it's quite annoying and time-consuming…
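For reference, the batched workaround above can be scripted. This is only a sketch: hadd itself is not invoked here (the merge commands are echoed so the batching logic is visible without a ROOT installation), the file names and chunk size are made up, and in practice you would replace the `echo`s with real hadd calls. Recent ROOT versions also provide an `hadd -n <N>` option to limit how many input files are kept open at once, which may be worth trying before scripting around the problem.

```shell
# Sketch of the batched hadd workaround (hypothetical names/paths).
# The merge commands are only echoed so the batching logic can be
# seen without ROOT installed; replace 'echo' with the real command.
set -e
rm -rf /tmp/hadd-demo && mkdir -p /tmp/hadd-demo && cd /tmp/hadd-demo

# Stand-ins for the real histogram files.
for i in $(seq 1 7); do touch "hist_$i.root"; done

# Split the input list into chunks of 3 (use a few hundred in practice).
ls hist_*.root | split -l 3 - chunk_

# Merge each chunk into an intermediate file...
n=0
for c in chunk_*; do
  n=$((n+1))
  echo hadd "part_$n.root" $(cat "$c")
done

# ...then merge the intermediates into the final output.
echo hadd out.root "part_*.root"
```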