Triggering event loops several times with RDataFrame

Dear ROOT experts,
I have code that is written like this:

TChain *chain = GetChainFromSomewhere();
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    *df.Count();
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

With this version of the code, after iterating over one or two slot_computation configurations, the code breaks and consumes an enormous amount of memory.

If I refactor the code to do this:

TChain *chain = GetChainFromSomewhere();
ROOT::RDataFrame df(*chain);
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    *df.Count();
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

the code consumes much less memory and completes successfully.

In practice, I know I would need to refactor the code to “bookkeep” everything once and run just one event loop, but it would take quite some time to refactor the containers of results to add an extra bookkeeping level in my code.

I wonder if this behaviour is expected.
Cheers
Renato



ROOT Version: 6.18/04
Platform: x86_64-centos7-gcc8-opt
Compiler: linuxx8664gcc


You should probably take a look at what is consuming the memory. For instance, you can run your program under valgrind --tool=massif and check its output.

But it also runs on a single thread, right? EnableImplicitMT should be called before the RDF object is constructed (with more recent ROOT versions you should get a warning).
Also, I’m not sure what purpose it serves to call EnableImplicitMT and DisableImplicitMT in a loop: in the end all event loops run multi-threaded (i.e. they shuffle entries).

You should see reduced memory usage if you refactor this way, which runs a single event loop:

TChain *chain = GetChainFromSomewhere();
ROOT::EnableImplicitMT(8);
ROOT::RDataFrame df(*chain);
std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count());
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

counts[0].GetValue();  // trigger the event loop

If that’s not an option, in 6.22 you should still see reduced memory usage if you do it like this:

TChain *chain = GetChainFromSomewhere();
std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::EnableImplicitMT(8);
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count());
    // Store results for this configuration
    ROOT::DisableImplicitMT(); // Needed to store some results I produce in a sorted way with RDataFrame
}

// trigger all event loops
for (auto &count : counts)
  *count;

and in the upcoming 6.24, even if you have different RDataFrames as in the last example, you can run all of their separate event loops concurrently by substituting

std::vector<ROOT::RDF::RResultPtr<ULong64_t>> counts;
...
for (auto &count : counts)
  *count;

with

std::vector<ROOT::RDF::RResultHandle> counts;
...
ROOT::RDF::RunGraphs(counts);

but that does not necessarily reduce memory consumption.
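
For reference, assembled it could look roughly like this (same hypothetical CONFIGURATIONS_SETTINGS, SEVERAL_SUBSETUPS and XXX placeholders as above; an RResultHandle is constructed directly from the RResultPtr returned by Count, and I enable implicit MT once up front):

ROOT::EnableImplicitMT(8);
TChain *chain = GetChainFromSomewhere();
std::vector<ROOT::RDF::RResultHandle> counts;
for (auto &slot_computation : CONFIGURATIONS_SETTINGS) {
    ROOT::RDataFrame df(*chain);
    ROOT::RDF::RNode lastNode = df.Define("DUMMY", "1>0");
    for (auto &xx : SEVERAL_SUBSETUPS) {
        lastNode = lastNode.XXX();
    }
    counts.emplace_back(df.Count()); // RResultPtr<ULong64_t> converted to RResultHandle
}
ROOT::RDF::RunGraphs(counts); // runs all booked event loops concurrently (ROOT >= 6.24)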

Still, the best thing is to measure what’s taking up memory.
Cheers,
Enrico

I do this because after I disable it I collect
epsNumerator   = df.Filter(x).Take<RVec<double>>("column");
epsDenominator = df.Filter(y).Take<RVec<double>>("column2");

I know in advance that each single column vector has 100 entries, I snapshot a new ntuple to disk using RDataFrame, and I want the entries to match the [i] of the entry columns I extracted. If I don’t disable it, the indexes get shuffled, which is something I need to avoid (without entering into details: I bootstrap efficiency results and I expect to estimate correlations, so I need to keep the indexes ordered).
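
Roughly, the pattern looks like this (the weightVariations branch and the numCut/denCut selections are just placeholders):

ROOT::DisableImplicitMT(); // single-threaded, so Take and Snapshot keep the original entry order
ROOT::RDataFrame df(*chain);
auto epsNumerator   = df.Filter("numCut").Take<ROOT::RVec<double>>("weightVariations");
auto epsDenominator = df.Filter("denCut").Take<ROOT::RVec<double>>("weightVariations");
df.Snapshot("outTree", "out.root");
// with implicit MT off, element j of *epsNumerator corresponds to the j-th entry that
// passed the numCut filter, in the same order as the entries appear in the input chain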

The main reason why I cannot refactor the code better is that I have not yet managed to implement something that allows doing a Sum<RVec>(vectorColumn).

I.e. if I have 100 variations of my weight attached to the ntuple, I want to extract the sum[i] of each. I wrote the code like this because I am currently Taking the vector column and summing by hand. I will post here the helper I will implement for that.
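
Concretely, the by-hand version I have now looks more or less like this (again with a placeholder branch name, and using the fact that each vector has 100 entries):

auto taken = df.Take<ROOT::RVec<double>>("weightVariations"); // one RVec of 100 weights per entry
ROOT::RVec<double> sumW(100, 0.);
for (const auto &weights : *taken) // loop in memory, after the event loop has run
    sumW += weights;               // element-wise: sumW[i] accumulates the i-th weight variation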

I can refactor my code to avoid the Take<RVec<double>> calls in each loop using what I posted in https://root-forum.cern.ch/t/sum-for-each-rvec-column-returning-an-rvec/42259.
Thanks @eguiraud for the suggested refactoring, I will go in that direction, but I need to have

RVec<double> sumWColumns = df.Sum<RVec<double>>("vectorColumn");

implemented before doing it.

Refactoring the code as you suggested and using Reduce calls to gather the sums of vector columns as vectors of doubles helped me get rid of the many Take<> calls I had, for an overall factor-100 speed-up. Indeed, triggering the event loops only once and working out a solution that avoids Taking columns for post-processing is much faster.
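
For future readers, the Reduce-based element-wise sum looks roughly like this (weightVariations is a placeholder branch name and 100 is the known vector length, as above):

// element-wise sum of an RVec<double> column over all selected entries,
// computed inside the event loop instead of Taking the column and summing afterwards
auto sumW = df.Reduce(
    [](ROOT::RVec<double> a, const ROOT::RVec<double> &b) { return a + b; },
    "weightVariations",
    ROOT::RVec<double>(100, 0.)); // identity element: a vector of 100 zeros
// sumW is an RResultPtr<ROOT::RVec<double>>; (*sumW)[i] is the sum of the i-th weight variation over all entries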
