ROOT Version: 6.26/04 (from /cvmfs/sft.cern.ch/lcg/views/LCG_102/x86_64-centos7-gcc8-opt)
Python Version: 3.11.2
Platform: CentOS Linux release 7.9.2009 (Core)
Hi! I’m fairly new to ROOT and pyROOT, but over these past few months I’ve been familiarizing myself with the framework and I’ve arrived at a script that does everything I need: it reads a TTree from a TFile, creates a new TTree in a new TFile, computes some new TBranches from the TBranches in the original TFile and writes them to the new TTree.
However, I’ve come to realise that this script uses enormous amounts of memory (the Condor jobs end up using ~100GB of memory when computing 20-30 new branches for files with 300k entries). The same happens when running locally.
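A minimal sketch for putting numbers on this growth, assuming a Linux or macOS host where Python's standard resource module is available. The helper names peak_rss_mb and log_memory are hypothetical and not part of the script; they only report the peak resident set size of the process at regular intervals.

```python
import resource
import sys


def peak_rss_mb():
    """Return the peak resident set size of this process in MB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    if sys.platform == "darwin":
        return peak / (1024 * 1024)
    return peak / 1024


def log_memory(entry, every=10000):
    """Print the peak RSS every `every` entries; return True if it printed."""
    if entry % every != 0:
        return False
    print(f"entry {entry}: peak RSS = {peak_rss_mb():.1f} MB")
    return True
```

Calling log_memory(i) at the top of the entry loop should show whether the peak grows steadily with the entry number or jumps at specific points.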
Code structure
In detail, what I’ve done with the code is:
- I’ve written a series of .cpp macros that look like this:
// project/macros/macro.cpp
#include <vector>

vector<float> macro(
    vector<float> value1,
    vector<float> value2
) {
    vector<float> result;
    // compute and push back values for result
    return result;
}
- I’ve wrapped the C++ macros with Python functions so I can call them as part of a package:
# project/src/module/macros/macro.py
import ROOT

ROOT.gROOT.LoadMacro('project/macros/macro.cpp')

def macro(tree, *args, **kwargs):
    return ROOT.macro(
        getattr(tree, 'value1'),  # branch with vector of floats
        getattr(tree, 'value2')   # branch with vector of floats
    )
- I’ve built a Python class that handles dynamic creation of branches from these macros:
# project/src/classes/reTupler.py
import ROOT

class reTupler:
    def __init__(self, tree_name, new_file, src_file):
        self.src_file = ROOT.TFile.Open(src_file)
        self.src_tree = self.src_file.Get(tree_name)
        self.new_file = ROOT.TFile.Open(new_file, 'recreate')
        self.new_tree = ROOT.TTree(tree_name, tree_name)
        # To access branches in 'src_tree' from 'new_tree':
        self.new_tree.AddFriend(self.src_tree)
        # To keep track of new branches and store values:
        self.new_branches = {}

    def add_branch(self, name, f, value_type='float'):
        self.new_branches[name] = {}
        self.new_branches[name]['f'] = f
        self.new_branches[name]['name'] = name
        self.new_branches[name]['value_type'] = value_type
        self.new_branches[name]['value'] = value = ROOT.std.vector(value_type)()
        self.new_branches[name]['tbranch'] = self.new_tree.Branch(name, value)

    def run(self):
        nentries = self.src_tree.GetEntries()
        for i in range(nentries):
            # Get entry and make sure src_tree and new_tree are synced
            self.src_tree.GetEntry(i)
            self.new_tree.GetEntry(i)
            # Now loop on all the branches that have been added:
            for branch_name, branch_dict in self.new_branches.items():
                branch_dict['value'].clear()
                [branch_dict['value'].push_back(result) for result in branch_dict['f'](self.new_tree)]
            # Fill entry with all computed branches
            self.new_tree.Fill()
        self.new_tree.Write()
        self.new_file.Close()
        self.src_file.Close()
In my code I distinguish between vector and scalar branches, but here for simplicity I’ve written only the vector case.
Minimal Working Example
I’ll make a minimal working example so anyone can reproduce this issue:
import ROOT

ROOT.gInterpreter.Declare('''
// project/macros/macro.cpp
#include <vector>
vector<float> macro(
    vector<float> value1,
    vector<float> value2
) {
    vector<float> result;
    // Copy values from value2 according to the sign of value1
    // Doing nothing also increases the memory usage...
    for (int i=0; i < value1.size(); i++) {
        if (value1[i] > 0) {result.push_back(value2[i]);};
    };
    return result;
}
''')

def macro(tree, *args, **kwargs):
    return ROOT.macro(
        getattr(tree, 'value1'),  # branch with vector of floats
        getattr(tree, 'value2')   # branch with vector of floats
    )

class reTupler:
    def __init__(self, tree_name, new_file, src_file):
        self.src_file = ROOT.TFile.Open(src_file)
        self.src_tree = self.src_file.Get(tree_name)
        self.new_file = ROOT.TFile.Open(new_file, 'recreate')
        self.new_tree = ROOT.TTree(tree_name, tree_name)
        # To access branches in 'src_tree' from 'new_tree':
        self.new_tree.AddFriend(self.src_tree)
        # To keep track of new branches and store values:
        self.new_branches = {}

    def add_branch(self, name, f, value_type='float'):
        self.new_branches[name] = {}
        self.new_branches[name]['f'] = f
        self.new_branches[name]['name'] = name
        self.new_branches[name]['value_type'] = value_type
        self.new_branches[name]['value'] = value = ROOT.std.vector(value_type)()
        self.new_branches[name]['tbranch'] = self.new_tree.Branch(name, value)

    def run(self):
        nentries = self.src_tree.GetEntries()
        for i in range(nentries):
            # Get entry and make sure src_tree and new_tree are synced
            self.src_tree.GetEntry(i)
            self.new_tree.GetEntry(i)
            # Now loop on all the branches that have been added:
            for branch_name, branch_dict in self.new_branches.items():
                branch_dict['value'].clear()
                [branch_dict['value'].push_back(result) for result in branch_dict['f'](self.new_tree)]
            # Fill entry with all computed branches
            self.new_tree.Fill()
        self.new_tree.Write()
        self.new_file.Close()
        self.src_file.Close()

tree_name = 'DDTree'
src_path = 'path/to/src.root'
new_path = 'path/to/new.root'
retupler = reTupler(tree_name, new_path, src_path)
retupler.add_branch('new_branch', macro, 'float')
retupler.run()
This is a very reduced version of the code that is similar enough to the one I am currently using and it also presents the memory leak.
For this minimal working example I’ve used a .root file with 100k entries and only two branches (‘value1’ and ‘value2’), which I’ve previously filled with 20-element vectors of random numbers from -999 to 999 using numpy.random.uniform (numpy.random.uniform(-999,999,20)). For reference, this is the code I used to create it:
import ROOT
import numpy as np
file_path = 'path/to/src.root'
file = ROOT.TFile.Open(file_path,'recreate')
tree = ROOT.TTree('DDTree','DDTree')
# Branch: value1
value1_value = ROOT.std.vector('float')()
value1_branch = tree.Branch('value1',value1_value)
# Branch: value2
value2_value = ROOT.std.vector('float')()
value2_branch = tree.Branch('value2',value2_value)
value_length = 20
nentries = 100000
for i in range(nentries):
    tree.GetEntry(i)
    value1_value.clear()
    [value1_value.push_back(result) for result in np.random.uniform(-999,999,value_length)]
    value2_value.clear()
    [value2_value.push_back(result) for result in np.random.uniform(-999,999,value_length)]
    tree.Fill()
file.Write()
file.Close()
This code doesn’t present a memory leak (thankfully) (:
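For scale, a rough estimate of the raw vector payload in this test file can be computed from the numbers above. This is only a back-of-the-envelope sketch: it assumes a 4-byte float and ignores std::vector overhead, TTree bookkeeping and compression, so it is a baseline to compare the observed memory use against rather than a measurement.

```python
# Back-of-the-envelope size of the uncompressed vector payload in the test file.
n_entries = 100_000   # entries in the source tree
n_branches = 2        # value1 and value2
n_elements = 20       # floats per vector
bytes_per_float = 4   # assumption: 32-bit float

payload_bytes = n_entries * n_branches * n_elements * bytes_per_float
payload_mb = payload_bytes / 1e6
print(f"raw payload: {payload_mb:.0f} MB")
```

Under these assumptions the whole input holds about 16 MB of float data.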
Fixes I’ve tried (and don’t seem to work)
I have tried different fixes that I’ve gathered from past topics and from recommendations from my colleagues but none seem to solve my issue. The ones I’ve tried so far are:
- Using self.new_tree.FlushBaskets(); self.src_tree.DropBaskets() at regular intervals through the for loop.
- Calling the garbage collector with gc.collect() at regular intervals through the for loop.
- Using self.new_tree.DropBuffers(max_memory); self.src_tree.DropBuffers(max_memory) with different values for max_memory.
- Setting the AutoSave explicitly with self.new_tree.SetAutoSave(step), where step is a high enough number (~10k) so that the runtime isn’t increased significantly.
- Setting the address for the branches from self.src_tree, creating a dictionary for source branches similar to the one for new branches (self.new_branches) and explicitly calling self.out_branch.SetBranchAddress(name, value).
- In addition to the last fix, setting a fixed vector length for the branches, high enough for the number of elements, using SetBranchAddress’s third argument.
- Using pointers for the macros in C++ instead of the values (can be done with the current code, it just needs macro.cpp to be rewritten to use pointers).
- Using Python 2 (Python Version: 2.7.5).
None of these solutions have worked for me so far. However, my implementation of them may not be perfect, so some of them may actually work; this is just a list of the fixes I’ve tried so far.
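For clarity, the "at regular intervals" attempts from the list follow roughly the pattern sketched here. The names loop_with_cleanup and default_cleanup are hypothetical helpers written for illustration, not code from the script; in the real run() loop the cleanup step would also hold the FlushBaskets() and DropBaskets() calls on the trees.

```python
import gc


def loop_with_cleanup(n_entries, process_entry, cleanup, every=10000):
    """Call process_entry(i) for each entry and cleanup() every `every` entries.

    Returns the number of times cleanup() was called.
    """
    n_cleanups = 0
    for i in range(n_entries):
        process_entry(i)
        # Run the cleanup after every `every`-th processed entry.
        if (i + 1) % every == 0:
            cleanup()
            n_cleanups += 1
    return n_cleanups


def default_cleanup():
    """Force a Python garbage collection pass (the gc.collect() attempt)."""
    gc.collect()
```

The interval is a trade-off: a small value of every adds runtime, while a large one lets memory accumulate for longer between cleanups.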
Conclusion
This is my first time using ROOT, pyROOT and C++ and I may be doing something that is inadvertently causing this memory leak. I’ve been struggling with this for a month and I’d really like to progress with this issue, so any help anyone can provide will be greatly appreciated (: