Triggering event loops several times with RDataFrame

Dear ROOT experts,
I have code that is written like this:

TChain *chain = GetChainFromSomewhere();
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    *df.Count();
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

With this version of the code, after iterating over one or two slot_computation configurations, the code breaks and consumes an enormous amount of memory.

If I refactor the code to do this:

TChain *chain = GetChainFromSomewhere();
ROOT::RDataFrame df(*chain);
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    *df.Count();
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

the code consumes much less memory and completes successfully.

In practice, I know I would need to refactor the code to “bookkeep” everything once and run just one event loop, but it would take quite some time to refactor the containers of results to add an extra bookkeeping level in my code.

I wonder if this behaviour is expected.
Cheers
Renato



ROOT Version: 6.18/04
Platform: x86_64-centos7-gcc8-opt
Compiler: linuxx8664gcc


You should probably take a look at what is consuming the memory. For instance, you can run your program under valgrind --tool=massif and check its output.

But it also runs on a single thread, right? EnableImplicitMT should be called before the RDF object is constructed (with more recent ROOT versions you should get a warning).
Also, I’m not sure what purpose it serves to call EnableImplicitMT and DisableImplicitMT in a loop: in the end all event loops run multi-threaded (i.e. they shuffle entries).

You should see reduced memory usage if you refactor this way, which runs a single event loop:

TChain *chain = GetChainFromSomewhere();
ROOT::EnableImplicitMT(8);
ROOT::RDataFrame df(*chain);
std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count());
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

counts[0].GetValue();  // trigger the event loop

If that’s not an option, in 6.22 you should still see reduced memory usage if you do it like this:

TChain *chain = GetChainFromSomewhere();
std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count());
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

// trigger all event loops
for (auto &count : counts)
  *count;

and in the upcoming 6.24, even if you have different RDataFrames as in the last example, you can run all of their separate event loops concurrently by substituting

std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
...
for (auto &count : counts)
  *count;

with

std::vector<ROOT::RDF::RResultHandle> counts;
...
ROOT::RDF::RunGraphs(counts);

but that does not necessarily reduce memory consumption.
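
For reference, assembled it could look roughly like this (same hypothetical CONFIGURATIONS_SETTINGS, SEVERAL_SUBSETUPS and XXX placeholders as above; an RResultHandle is constructed directly from the RResultPtr returned by Count, and I enable implicit MT once up front):

ROOT::EnableImplicitMT(8);
TChain *chain = GetChainFromSomewhere();
std::vector<ROOT::RDF::RResultHandle> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count()); // RResultPtr<ULong64_t> converted to RResultHandle
}
ROOT::RDF::RunGraphs(counts); // runs all booked event loops concurrently (ROOT >= 6.24)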

Still, the best thing is to measure what’s taking up memory.
Cheers,
Enrico

I do this because after I disable it I collect
epsNumerator   = df.Filter(x).Take<RVec<double>>("column");
epsDenominator = df.Filter(y).Take<RVec<double>>("column2");

I know in advance that each single column vector has 100 entries, I snapshot a new ntuple to disk using RDataFrame, and I want the entries to match the [i] of the entry columns I extracted. If I don’t disable it, the indexes get shuffled, which is something I need to avoid (without entering into details: I bootstrap efficiency results and I expect to estimate correlations, so I need to keep the indexes ordered).
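
Roughly, the pattern looks like this (the weightVariations branch and the numCut/denCut selections are just placeholders):

ROOT::DisableImplicitMT(); // single-threaded, so Take and Snapshot keep the original entry order
ROOT::RDataFrame df(*chain);
auto epsNumerator   = df.Filter("numCut").Take<ROOT::RVec<double>>("weightVariations");
auto epsDenominator = df.Filter("denCut").Take<ROOT::RVec<double>>("weightVariations");
df.Snapshot("outTree", "out.root");
// with implicit MT off, element j of *epsNumerator corresponds to the j-th entry that
// passed the numCut filter, in the same order as the entries appear in the input chain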

The main reason why I cannot refactor the code better is that I have not yet managed to implement something that allows doing a Sum<RVec>(vectorColumn).

I.e. if I have 100 variations of my weight attached to the ntuple, I want to extract the sum[i] of each. I wrote the code like this because I am currently Taking the vector column and summing by hand. I will post here the helper I will implement for that.
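
Concretely, the by-hand version I have now looks more or less like this (again with a placeholder branch name, and using the fact that each vector has 100 entries):

auto taken = df.Take<ROOT::RVec<double>>("weightVariations"); // one RVec of 100 weights per entry
ROOT::RVec<double> sumW(100, 0.);
for (const auto &weights : *taken) // loop in memory, after the event loop has run
    sumW += weights;               // element-wise: sumW[i] accumulates the i-th weight variation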

I can refactor my code to avoid the Take<RVec<double>> calls in each loop using what I posted in https://root-forum.cern.ch/t/sum-for-each-rvec-column-returning-an-rvec/42259.
Thanks @eguiraud for the suggested refactoring, I will go in that direction, but I need to have

RVec<double> sumWColumns = df.Sum<RVec<double>>("vectorColumn");

implemented before doing it.

Refactoring the code as you suggested and using Reduce calls to gather the sums of vector columns as vectors of doubles helped me get rid of the many Take<> calls I had, for an overall factor-100 speed-up. Indeed, triggering the event loops only once and working out a solution that avoids Taking columns for post-processing is much faster.
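
For future readers, the Reduce-based element-wise sum looks roughly like this (weightVariations is a placeholder branch name and 100 is the known vector length, as above):

// element-wise sum of an RVec<double> column over all selected entries,
// computed inside the event loop instead of Taking the column and summing afterwards
auto sumW = df.Reduce(
    [](ROOT::RVec<double> a, const ROOT::RVec<double> &b) { return a + b; },
    "weightVariations",
    ROOT::RVec<double>(100, 0.)); // identity element: a vector of 100 zeros
// sumW is an RResultPtr<ROOT::RVec<double>>; (*sumW)[i] is the sum of the i-th weight variation over all entries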
