I need to reorganise a large ntuple, and my straightforward implementation performs poorly. I would be grateful for any hints on how to improve it.
I have a large ntuple split into about 10 ROOT files. Each file is stored on EOS, is about 5 GB in size and contains about 3 M events. Later, I’d need to handle even larger ntuples, with about 100 such ROOT files. The reorganisation I’m interested in applies a mapping: for each output event, it combines data from some branches of one event in one file with data from other branches of another event in another file. The output is an ntuple with exactly the same structure as the input one.
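To make the reorganisation concrete, here is a minimal in-memory sketch of what the mapping does. This is not the attached recombine.cc and it does not use ROOT: `Event`, `EventRef`, `MapEntry`, `Ntuple` and `recombine` are hypothetical names made up for illustration, two `double` fields stand in for the two groups of branches, and plain vectors stand in for the TTrees.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one ntuple event with two groups of branches.
struct Event {
    double groupA;  // branches taken from the first source event
    double groupB;  // branches taken from the second source event
};

// Hypothetical address of an event: which file, and which entry in that file.
struct EventRef {
    std::size_t file;
    std::size_t entry;
};

// One row of the mapping: where each branch group of an output event comes from.
struct MapEntry {
    EventRef srcA;
    EventRef srcB;
};

// In-memory stand-in for the input ntuple: files[file][entry].
using Ntuple = std::vector<std::vector<Event>>;

// Build one output event per mapping row, in mapping order.
std::vector<Event> recombine(const Ntuple& files,
                             const std::vector<MapEntry>& mapping) {
    std::vector<Event> out;
    out.reserve(mapping.size());
    for (const MapEntry& m : mapping) {
        Event e;
        // at() throws std::out_of_range if the mapping points outside the input.
        e.groupA = files.at(m.srcA.file).at(m.srcA.entry).groupA;
        e.groupB = files.at(m.srcB.file).at(m.srcB.entry).groupB;
        out.push_back(e);
    }
    return out;
}
```

In the real case each lookup is a read of selected branches from a TTree on EOS rather than a vector access, and the mapping jumps between files and entries in no particular order.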
I’ve attached my simple implementation. Basically, it loads all the TTrees at the beginning and then traverses the mapping, using SetBranchStatus to load the appropriate branches from the appropriate trees and events. Each recombined event is filled into the output TTree with Fill. Running the code on LXPLUS for 50 events and with a reduced list of branches gives the following timing:
real 2m43.210s
user 0m5.630s
sys 0m3.062s
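which essentially rules out this approach. User and sys time together are under 9 s, so almost all of the wall-clock time is spent waiting rather than computing.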
Many thanks in advance for your tips,
recombine.cc (3.22 KB)