Customize RDataFrame workflow with if statements for Define calls

RENATO_QUAGLIANI · June 16, 2020, 9:48am

Hi all,
I have a naive question about RDataFrame::Define and how best to keep track of the nodes it returns.
Let’s say I have compiled code that can switch between behaviours.
I want to do something along these lines:

ROOT::RDataFrame df(...); 
if ( ConditionA(globalBehaviour))    df.Define("CC", customFunctor, {"inputSet"});
else ...  df.Define("CC", customFunctor2, {"otherinputset"});

and use the RNode produced after this switch.
All my Define calls are sequential, so I wonder whether I can simply do something like this:

std::vector<ROOT::RDF::RNode> _nodes_define;
if ( ConditionA(globalBehaviour))    _nodes_define.push_back( df.Define("CC", customFunctor, {"inputSet"}) );
else ...  _nodes_define.push_back( df.Define("CC", customFunctor2, {"otherinputset"}) );
// keep working on _nodes_define.back() from here

Is there any issue in doing so? (I also wonder whether one should prefer emplace_back or push_back.)
Thanks for any suggestions.

StephanH · June 16, 2020, 11:10am

Hi Renato,

You can keep all these nodes around, but unless you want to branch off a different computation from one of these nodes, there’s no need to. Since all RDF nodes can be converted to an RNode, why not do this:

ROOT::RDF::RNode lastNode = df;  // RNode cannot be default-constructed, so start from df
lastNode = A or B;
lastNode = C or D;

You can also write a function that returns RNodes, see here:
https://root.cern/doc/master/df025__RNode_8C.html

RENATO_QUAGLIANI · June 16, 2020, 11:26am

Hi @StephanH, what I fail to understand in your suggestion is how to deal with this use case:

ROOT::RDataFrame df(...);
if (conditionA) {
    myNode = df.OperationA();
    //.... many other operations (I just do Define for the moment)
} else {
    myNode = df.OperationB();
}
// After this I want to use what myNode contains.
myNode.Snapshot(...);

Are you suggesting to do:

ROOT::RDataFrame df(...);
ROOT::RDF::RNode lastNode = df;
if (conditionA) {
    lastNode = df.OperationA();
    //.... many other operations (I just do Define for the moment)
} else {
    lastNode = df.OperationB();
}
//--> lastNode then directly holds the result, with no need to propagate vectors?

If that’s the case, thanks a lot! This is much easier than propagating a vector of nodes.

RENATO_QUAGLIANI · June 16, 2020, 11:27am

Is it allowed to do:

        lastNode = lastNode.Define("wL0_L0I_Input_VarY", _inputVarY.Data());

as well?

StephanH · June 16, 2020, 11:50am

Yes. The function on the right returns a node, and you can assign that to whatever you want. So you can do:

if (conditionA) {
    lastNode = df.OperationA();
    lastNode = lastNode. ....;
    lastNode = ....; //.... many other operations (I just do Define for the moment)
} else {
    lastNode = df.OperationB();
}

eguiraud · June 16, 2020, 11:55am

(Also see this RNode tutorial: https://root.cern/doc/master/df025__RNode_8C.html)

StephanH · June 16, 2020, 11:57am

Yes, I see we think alike! I linked it in the first reply!

RENATO_QUAGLIANI · June 16, 2020, 12:10pm

Thanks a lot for all the suggestions.
The

lastNode = lastNode.XXX

notation reduces the amount of text in my code by almost 50%.
I will report back on whether all goes smoothly in my checks (I am chaining something like 200-300 operations with this lastNode notation).
