Periodically write TNtuple to ROOT file and clear TNtuple object from memory in EventLoop

ROOT Version: 6.28/04 and 6.22/00
Platform:
CentOS 7 on CERN CVMFS
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
gcc version 8.3.0 (GCC)

Hi,

I am using TTree->MakeClass() to generate an analysis skeleton for a TTree/TChain. The input ROOT files I have to process are of the order of O(100) GB, up to ~800 GB. The input tree contains 3D arrays of shape 4 x 64 x 48 and has ~4 million entries.

What I want to do is decompose this nested array into flat ntuples (in a single event loop) so that those ntuples can later be processed with Python. Four ROOT files are to be created (one per index of the first dimension), each with 64 TNtuples (second dimension), and each TNtuple has 48 branches (third dimension) holding floats.
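To make the intended layout concrete, here is a minimal sketch in plain C++ (no ROOT calls) of how one event's 4 x 64 x 48 array maps onto a (file, ntuple) pair plus a row of 48 floats. The names `flatOffset`, `ntupleIndex`, and `extractRow` are hypothetical, and the sketch assumes the array is stored contiguously in row-major order, as a C array `float[4][64][48]` would be.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Dimensions taken from the post: 4 files x 64 ntuples x 48 float columns.
constexpr std::size_t kFiles = 4;
constexpr std::size_t kNtuplesPerFile = 64;
constexpr std::size_t kColumns = 48;

using Row = std::array<float, kColumns>;

// Offset of element [file][ntuple][column] in a contiguous row-major array.
constexpr std::size_t flatOffset(std::size_t file, std::size_t ntuple, std::size_t column) {
    return (file * kNtuplesPerFile + ntuple) * kColumns + column;
}

// Position of a (file, ntuple) pair among the 4 x 64 = 256 ntuples.
constexpr std::size_t ntupleIndex(std::size_t file, std::size_t ntuple) {
    return file * kNtuplesPerFile + ntuple;
}

// Copy the 48 values that belong to one ntuple out of one event.
// `event` is assumed to point at the first element of the 3D array.
inline Row extractRow(const float* event, std::size_t file, std::size_t ntuple) {
    assert(event != nullptr && file < kFiles && ntuple < kNtuplesPerFile);
    Row row{};
    std::copy_n(event + flatOffset(file, ntuple, 0), kColumns, row.begin());
    return row;
}
```

A row obtained this way is a contiguous block of 48 floats, so `row.data()` has the form expected by `TNtuple::Fill(const Float_t*)`.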

I am using a TChain to load the input files. I define 4 C++ arrays (each of size 64) of TNtuples (with 48 branches each), one array per output file. In the event loop, I fill each TNtuple with a vector of 48 floats, so in one iteration of the event loop 256 (= 64 x 4) TNtuples are each filled with one such vector. As the number of events increases, the memory occupied by these TNtuple objects becomes very large. They use all of my RAM (~10 GB) and swap, and the operating system kills the process after about 500,000 entries have been processed.
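As a rough cross-check of these numbers, the raw float payload can be estimated with a few lines of plain C++. This is only back-of-the-envelope arithmetic: it ignores compression and any per-object overhead in ROOT, and how much of the payload actually stays in RAM depends on how the ntuples are buffered. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

static_assert(sizeof(float) == 4, "the estimate assumes 4-byte floats");

// Raw bytes produced per event: files x ntuples per file x columns x sizeof(float).
constexpr std::uint64_t bytesPerEvent(std::uint64_t files, std::uint64_t ntuplesPerFile,
                                      std::uint64_t columns) {
    return files * ntuplesPerFile * columns * sizeof(float);
}

// Raw, uncompressed payload after `events` events for the 4 x 64 x 48 layout.
constexpr std::uint64_t uncompressedBytes(std::uint64_t events) {
    return events * bytesPerEvent(4, 64, 48);
}

// Number of events whose raw payload fits into `budgetBytes`.
inline std::uint64_t eventsThatFit(std::uint64_t budgetBytes) {
    const std::uint64_t perEvent = bytesPerEvent(4, 64, 48);
    assert(perEvent > 0);
    return budgetBytes / perEvent;
}
```

Each event contributes 49,152 bytes of raw floats, so 500,000 events correspond to about 24.6 GB uncompressed. If nothing is flushed to disk, that is well beyond 10 GB of RAM.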

I tried TNtuple->AutoSave("SaveSelf"), but it does not seem to clear the ntuples from memory after saving them to the output ROOT file.

I also tried TTree::SetMaxTreeSize(N) with N ~ 2 GB, but this does not seem to clear the TNtuples from memory either. It only creates new ROOT files and continues writing to them.

I have also tried disabling the branches I do not need, in combination with the above procedures, but it does not help.

I also tried calling TNtuple->Reset() after writing events at a certain interval, but the problem is that it creates many TNtuple cycles in the file every time I call TNtuple->Fill().

What I would like to do is write events to the output ROOT file periodically, at a fixed interval (~100,000 events), clear the TNtuples inside the event loop (so that the memory they occupy is freed and there is only one cycle of each tree in the output files), and then continue with the event loop.
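The control flow described here (buffer, write every N rows, free the buffer, keep going) can be sketched in plain C++. This is not ROOT API: `BufferedRowWriter` is a hypothetical stand-in for one ntuple that appends to a single `std::ostream`, so the output remains one continuous record and is not a series of separate snapshots. Whether ROOT's own buffering can play this role depends on how the ntuples and output files are created, which the post does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <vector>

// Hypothetical stand-in for one output ntuple (not a ROOT class).
class BufferedRowWriter {
public:
    BufferedRowWriter(std::ostream& out, std::size_t flushInterval)
        : out_(out), flushInterval_(flushInterval) {
        assert(flushInterval > 0);
    }

    // Buffer one row; flush once `flushInterval` rows have accumulated.
    void fill(const std::vector<float>& row) {
        buffer_.push_back(row);
        if (buffer_.size() >= flushInterval_) flush();
    }

    // Append all buffered rows to the stream, then release the buffer's memory.
    void flush() {
        for (const auto& row : buffer_) {
            out_.write(reinterpret_cast<const char*>(row.data()),
                       static_cast<std::streamsize>(row.size() * sizeof(float)));
        }
        rowsWritten_ += buffer_.size();
        std::vector<std::vector<float>>().swap(buffer_);  // drop capacity too
    }

    std::size_t buffered() const { return buffer_.size(); }
    std::size_t rowsWritten() const { return rowsWritten_; }

private:
    std::ostream& out_;
    std::size_t flushInterval_;
    std::size_t rowsWritten_ = 0;
    std::vector<std::vector<float>> buffer_;
};

// Usage sketch: 250 rows of 48 floats with a flush interval of 100.
// Two flushes happen inside the loop; the last 50 rows need a final flush.
inline std::size_t demoBytesWritten() {
    std::ostringstream out;
    BufferedRowWriter writer(out, 100);
    for (int i = 0; i < 250; ++i) writer.fill(std::vector<float>(48, 1.0f));
    writer.flush();  // write the remainder after the loop
    return out.str().size();
}
```

The final `flush()` after the loop matters: without it, any rows accumulated since the last full interval would never reach the output.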

Thank you very much!

That should not be the case (it should only happen every several to many calls to Fill). Can you share the actual code?

Hi,

Yes, my wording of the problem may have been loose. The cycles are created at each interval at which I write the ntuples to the ROOT file (number of cycles = number of write intervals), but each cycle holds different data. I looked into how to merge several cycles of the same tree and found the TTree->Merge() method, but it does not seem to work.

Here is my script. I use one script to create the ntuples; the other was generated with MakeClass, and in it I defined a custom function to process the events.

Thank you!