Dear experts,
I am currently trying to use hadd to make an output file with a tree with size > 100 Gb.
I read:
I want to merge 30 ~15Gb files containing a TNtuple using hadd, but get the following error:
Fatal in <TFileMerger::RecursiveRemove>: Output file of the TFile Merger (targeting xxx.root) has been deleted (likely due to a TTree larger than 100Gb)
Indeed, the target file is 100Gb when the error occurs. The files are generated from simulation data that writes an entry every break point.
I have 512Gb of RAM on my server, and I don’t want to slow down hadd by using more compression if I can hel…
It seems that in my case the easiest solution is to create a rootlogon.C file, in the directory where I execute “hadd”, containing:
TTree::SetMaxTreeSize( 1000000000000LL ); // 1 TB
However, it seems that when I run “hadd” this new option is not taken into account. I tried two commands:
I guess this is because hadd is not recompiled. Could you please tell me what the easiest way is to make hadd take the new MaxTreeSize value into account?
I would prefer not to write a separate macro replicating the hadd behavior if possible.
Thanks for your help
Best wishes
Indeed, hadd does not load the rootlogon file. However, you can still manage using the following pattern.
Use a source file (let’s call it startup.C) like
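#include "TTree.h"

// Raise the maximum tree size before hadd starts merging.
int startup() {
   TTree::SetMaxTreeSize( 1000000000000LL ); // 1 TB
   return 0;
}

namespace {
   // Static initializer: runs startup() as soon as the library is loaded.
   static int i = startup();
}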
compile it into a shared library, for example with:
root.exe -b -l -q startup.C+
and then use LD_PRELOAD (DYLD_INSERT_LIBRARIES on macOS) to preload that library when running hadd:
hadd output.root input_*
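For readers who prefer to launch this from a script, here is a minimal Python sketch of the preload step. It only builds the environment and the command line, and runs hadd if the inputs and the library are present. The library name startup_C.so is an assumption (ACLiC usually derives the name from the source file this way, but check what was actually produced), and the helper names preload_env and hadd_command are hypothetical.

```python
import glob
import os
import platform
import subprocess


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with the library added to the preload list."""
    env = dict(os.environ if base_env is None else base_env)
    # macOS uses DYLD_INSERT_LIBRARIES; Linux uses LD_PRELOAD.
    var = "DYLD_INSERT_LIBRARIES" if platform.system() == "Darwin" else "LD_PRELOAD"
    # Use an absolute path so the loader does not have to search for the library.
    lib = os.path.abspath(library_path)
    previous = env.get(var)
    env[var] = lib + ":" + previous if previous else lib
    return env


def hadd_command(output, inputs):
    """Build the hadd command line: hadd <output> <input> [<input> ...]."""
    return ["hadd", output] + list(inputs)


if __name__ == "__main__":
    library = "startup_C.so"  # assumed name of the library built from startup.C
    inputs = sorted(glob.glob("input_*"))
    if inputs and os.path.exists(library):
        subprocess.check_call(hadd_command("output.root", inputs),
                              env=preload_env(library))
```

Passing the modified environment only to the hadd subprocess keeps the preload from leaking into other commands run from the same shell.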
Dear Philippe,
thank you very much for your message, this is very helpful!
In addition, let me copy/paste here a small Python script a colleague passed me, which also does the job, in case people visit this thread (*).
Thanks again for your help
Best wishes
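import os
import sys

import ROOT

print('Merging %s' % sys.argv[1])
print('Max tree size %d' % ROOT.TTree.GetMaxTreeSize())
ROOT.TTree.SetMaxTreeSize(200000000000)  # 200 GB
print('Updated tree size %d' % ROOT.TTree.GetMaxTreeSize())

rm = ROOT.TFileMerger(False)
path = 'mypath'  # directory containing the input files
file_output = '%s.root' % sys.argv[1]
file_list = []
for dirpath, dirs, files in os.walk(path):
    for filename in files:
        # Keep only the pieces of this sample, i.e. names containing "<name>_part"
        if ('%s_part' % sys.argv[1]) in filename:
            file_list.append(os.path.join(dirpath, filename))
print('Input file list: %s' % file_list)
print('Output file: %s' % file_output)

for F in file_list:
    print('Adding -> %s' % F)
    rm.AddFile(F)
# Assumed final steps (not shown in the post): set the output file, then merge.
rm.OutputFile(file_output)
rm.Merge()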
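One thing to watch with a fixed limit such as 200 GB is whether it actually exceeds the size of the merged output. The sketch below is a rough sanity check, not part of the script above: it sums the on-disk sizes of the input files and proposes a limit with some headroom. The function name, the 20% margin, and the 100 GB floor (taken from the error message) are all assumptions. The merged file can also differ in size from the sum of its inputs if the compression settings differ.

```python
import os

# Assumed floor of 100 GB, matching the limit quoted in the error message.
DEFAULT_LIMIT = 100000000000


def suggested_max_tree_size(paths, margin=1.2, floor=DEFAULT_LIMIT):
    """Return a limit in bytes: total input size times a margin, never below floor."""
    total = sum(os.path.getsize(p) for p in paths)
    return max(floor, int(total * margin))
```

The result could then be passed to ROOT.TTree.SetMaxTreeSize in place of the hard-coded value.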
May 16, 2017, 7:27am