Dear ROOT experts,
I am facing a problem related to the use of PROOF-Lite. Specifically, when I run my analysis code on a `TChain` made of thousands of files, using PROOF-Lite to process the `TChain` with a custom class inheriting from `TSelector`, I see that the memory consumption of the slave processes (`proofserv.exe`) increases while processing events, up to the point where all the memory available on my machine is used. To monitor memory use, I am using the “top” command, looking at the “RES” column. I can also confirm that the memory is completely used from the fact that, when this happens, the machine “freezes” and I have to kill the `proofserv.exe` processes.
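For anyone who prefers to log the memory use instead of watching “top” by hand, here is a minimal sketch that reads the `VmRSS` field from a Linux `/proc/<pid>/status` file, which should correspond to the resident memory shown in the “RES” column. It is not part of the tar file, and the function names `readRssKb` and `rssOfPidKb` are only illustrative.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Return the resident set size in kB parsed from a /proc/<pid>/status-style
// file, or -1 if the file cannot be opened or contains no VmRSS line.
long readRssKb(const std::string& statusPath) {
    std::ifstream in(statusPath);
    if (!in) return -1;
    std::string line;
    while (std::getline(in, line)) {
        // The line of interest looks like "VmRSS:     123456 kB".
        if (line.compare(0, 6, "VmRSS:") == 0) {
            std::istringstream fields(line.substr(6));
            long kb = -1;
            fields >> kb;
            return fields ? kb : -1;
        }
    }
    return -1;
}

// Convenience wrapper (Linux only): build the status path for a process id.
long rssOfPidKb(long pid) {
    assert(pid > 0);
    return readRssKb("/proc/" + std::to_string(pid) + "/status");
}
```

Calling `rssOfPidKb` periodically with the process id of each `proofserv.exe` process would give the growth of the resident memory over time.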
I provide a tar file with all the files I am using in the analysis, to have a working example to reproduce this behavior. The example can be found here: http://www.ge.infn.it/~celentan/example.tar - it also contains one of the files I am using as input in the analysis. I think that, to reproduce the issue, one can just copy this file ~ 100/200 times and then run the analysis code on these equal copies.
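To prepare the copies, something like the following sketch could be used. It is not included in the tar file; the helper name `makeCopies` and the `<prefix>_<i>.root` naming scheme are just assumptions for illustration, and it only uses the C++ standard library.

```cpp
#include <cassert>
#include <fstream>
#include <string>
#include <vector>

// Copy `source` byte-for-byte to `nCopies` files named <prefix>_<i>.root.
// Returns the paths that were created; stops at the first failure.
// Note: an empty source file is also reported as a failure, because
// streaming an empty buffer sets the failbit on the output stream.
std::vector<std::string> makeCopies(const std::string& source,
                                    const std::string& prefix,
                                    int nCopies) {
    assert(nCopies >= 0);
    std::vector<std::string> created;
    for (int i = 0; i < nCopies; ++i) {
        std::ifstream in(source, std::ios::binary);
        if (!in) break;  // source missing or unreadable
        const std::string target = prefix + "_" + std::to_string(i) + ".root";
        std::ofstream out(target, std::ios::binary);
        if (!out) break;  // target not writable
        out << in.rdbuf();  // stream the whole file content
        if (!out) break;
        created.push_back(target);
    }
    return created;
}
```

The returned paths are the equal copies that can then be given to the analysis code as input.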
The example can be compiled with `make` (the first time, it has to be run twice) and launched with:
./ana -f path_to_one_or_more_input_files -o name_of_the_output_file -nproof NumberOfWorkers
- The input files contain different `TTree` objects; each `TTree` contains different branches, and all branches are of type `vector<double>` or `vector<int>`.
- The file `ana.cc` contains the main method of my analysis. I first read all the files to check their consistency (by looking at the last event in the “header” `TTree`), then I create different `TChain`s, and I use the `AddFriend` method to later access all the branches in the analysis.
- I am creating a custom class `anaSelector`, inheriting from `TSelector`. The class is implemented in the two files `anaSelector.cc` and `anaSelector.h`. I am using `SetBranchAddress` in the `Init()` method and `GetEntry` in the `Process()` method to read data from the different `TFile`s.
- Note that in the `Process()` method, I return immediately after the call to `GetEntry`. This tells me that the memory problem is related to the way data is read, and not to any subsequent operation I perform on it.
- In the `ana.cc` file I had to hard-code the location of the shared library containing the dictionary of the `anaSelector` class.
- I tried to add a method `clear_vector` that deletes all the pointers to the vectors I am using and resets them to zero, calling it in the `Init()` method, but this does not change the behavior of the code. Similarly, I added a call to this method in the `Process()` method, just before the return, but nothing changed.
- To make sure the input files were not affected by the error described in this topic, before running the analysis code I re-created all of them using `hadd`.
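To make the `clear_vector` attempt above concrete, this is roughly the pattern I mean, reduced to plain C++ without any ROOT dependency. The struct `BranchBuffers` and the member names `energy` and `channel` are hypothetical placeholders, not the actual names used in `anaSelector`.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the branch buffers held by the selector.
struct BranchBuffers {
    std::vector<double>* energy = nullptr;  // hypothetical branch buffer
    std::vector<int>* channel = nullptr;    // hypothetical branch buffer

    // Delete the vectors and reset the pointers to null.
    // Deleting a null pointer is a no-op, so repeated calls are safe.
    void clear_vector() {
        delete energy;
        energy = nullptr;
        delete channel;
        channel = nullptr;
    }
};

// Example use: allocate, clear, and check that the pointers are null again.
void exampleUsage() {
    BranchBuffers b;
    b.energy = new std::vector<double>{1.0, 2.0};
    b.channel = new std::vector<int>{3};
    b.clear_vector();
    assert(b.energy == nullptr && b.channel == nullptr);
}
```

This only frees the vectors owned through these pointers, so it would not help if the growing memory is held elsewhere.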
Thanks,
Andrea
ROOT Version: 6.20.04
Platform: Linux CentOS7
Compiler: gcc 8.2.0