ROOT Version: 6.28/04 and 6.22/00
Platform:
CentOS 7 on CERN CVMFS
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
gcc version 8.3.0 (GCC)
Hi,
I am using TTree::MakeClass() to generate an analysis skeleton for a TTree/TChain. The input ROOT files I have to process are of the order of 100 GB to 800 GB in size. The input tree contains 3D arrays of size 4 x 64 x 48 and has ~4 million entries.
What I want to do is decompose this nested array into flat ntuples (in a single event loop) so that they can later be processed with Python. Four ROOT files are to be created (one per index of the first dimension), each containing 64 TNtuples (second dimension), and each TNtuple has 48 branches (third dimension) holding floats.
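The mapping from the nested array to the flat ntuples can be sketched in plain C++ as below. This is only an illustration of the indexing: `Event`, `Row`, `rowFor` and `flatIndex` are hypothetical names, and the in-memory `Event` type is an assumed stand-in for one entry of the input tree.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Dimensions from the description above: 4 output files,
// 64 ntuples per file, 48 float branches per ntuple.
constexpr std::size_t kFiles = 4;
constexpr std::size_t kNtuples = 64;
constexpr std::size_t kBranches = 48;

// Hypothetical stand-in for one entry of the input tree.
using Row = std::array<float, kBranches>;
using Event = std::array<std::array<Row, kNtuples>, kFiles>;

// The 48 values that belong to ntuple `ntuple` of output file `file`.
inline const Row& rowFor(const Event& event, std::size_t file, std::size_t ntuple) {
    assert(file < kFiles && ntuple < kNtuples);
    return event[file][ntuple];
}

// Position of (file, ntuple) in a flat list of all 4 x 64 = 256 ntuples.
inline std::size_t flatIndex(std::size_t file, std::size_t ntuple) {
    assert(file < kFiles && ntuple < kNtuples);
    return file * kNtuples + ntuple;
}
```

With this layout, the row for a given pair is a contiguous block of 48 floats, so `rowFor(event, i, j).data()` is a pointer that could be handed to `TNtuple::Fill(const Float_t*)` for the j-th ntuple of the i-th file.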
I am using a TChain to load the input files. I define 4 C++ arrays (of size 64) of TNtuples (with 48 branches each), one array per output file. In the event loop, I fill each TNtuple with a vector of 48 floats, so each iteration of the event loop fills 256 (= 4 x 64) TNtuples. As the number of events increases, the memory occupied by these TNtuple objects becomes very large. They use all of my RAM (~10 GB) and swap, and the operating system kills the process after about 500,000 entries have been processed.
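For scale, a rough estimate of the uncompressed payload is sketched below. It assumes 4-byte floats and counts only the raw values, ignoring compression and any ROOT bookkeeping overhead; `payloadBytes` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

static_assert(sizeof(float) == 4, "estimate assumes 4-byte floats");

// 4 x 64 x 48 values are written per entry of the input tree.
constexpr std::uint64_t kValuesPerEntry = 4ULL * 64ULL * 48ULL;            // 12288 floats
constexpr std::uint64_t kBytesPerEntry = kValuesPerEntry * sizeof(float);  // 49152 bytes

// Uncompressed payload in bytes after `entries` entries have been filled.
inline std::uint64_t payloadBytes(std::uint64_t entries) {
    // Guard against overflow of the 64-bit product.
    assert(entries <= std::numeric_limits<std::uint64_t>::max() / kBytesPerEntry);
    return entries * kBytesPerEntry;
}
```

Under these assumptions, `payloadBytes(500000)` is about 24.6 GB and `payloadBytes(4000000)` is about 197 GB, so if none of the filled data leaves memory, running out of ~10 GB of RAM plus swap well before the end of the loop is what one would expect.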
I tried TNtuple->AutoSave("SaveSelf"), but it does not seem to clear the ntuples from memory after saving to the output ROOT file.
I also tried TTree::SetMaxTreeSize(N) with N ~ 2 GB, but this does not seem to clear the TNtuples from memory either. It only makes ROOT create a new output file and continue writing there.
I have also tried disabling the branches I do not need, in combination with the above, but it does not help.
I also tried calling TNtuple->Reset() after writing events at a certain interval, but the problem is that this creates many TNtuple cycles in the file every time I call TNtuple->Fill().
What I would like to do is write events to the output ROOT files periodically, at a fixed interval (~100,000 entries), then clear the TNtuples inside the event loop, so that the memory they occupy is freed and each tree has only one cycle in the output files, and then continue with the event loop.
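The intended behaviour can be modelled without ROOT as in the sketch below. It is not ROOT code: `FlushingBuffer` and `exampleBytesWritten` are hypothetical names, and a `std::ostream` stands in for the single output object that should receive all entries.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical ROOT-free model of "buffer, flush every N entries, clear".
class FlushingBuffer {
public:
    FlushingBuffer(std::ostream& out, std::size_t flushEvery)
        : out_(out), flushEvery_(flushEvery) {
        assert(flushEvery > 0);
    }

    // Buffer one row; write everything out once flushEvery rows are held.
    void fill(const std::vector<float>& row) {
        rows_.push_back(row);
        if (rows_.size() >= flushEvery_) flush();
    }

    // Append the buffered rows to the one output stream, then free the memory.
    void flush() {
        for (const auto& row : rows_) {
            out_.write(reinterpret_cast<const char*>(row.data()),
                       static_cast<std::streamsize>(row.size() * sizeof(float)));
        }
        flushed_ += rows_.size();
        rows_.clear();
        rows_.shrink_to_fit();
    }

    std::size_t buffered() const { return rows_.size(); }
    std::size_t flushed() const { return flushed_; }

private:
    std::ostream& out_;
    std::size_t flushEvery_;
    std::vector<std::vector<float>> rows_;
    std::size_t flushed_ = 0;
};

// Example run: 250 rows of 48 floats, flushed every 100 rows.
inline std::size_t exampleBytesWritten() {
    std::ostringstream sink;
    FlushingBuffer buffer(sink, 100);
    const std::vector<float> row(48, 1.0f);
    for (int i = 0; i < 250; ++i) buffer.fill(row);
    buffer.flush();  // write the remaining 50 rows after the loop
    return sink.str().size();
}
```

In this model, memory use is bounded by the flush interval rather than by the total number of entries, and every flush appends to the same output instead of creating a new copy of it, which is the behaviour I am looking for from the TNtuples.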
Thank you very much!