RDataFrame Filter on variable size vectors

Dear experts,

I am using RDataFrame to do a simple event selection, e.g. I want an event to pass the Filter only if all of its leptons have pT > 27 GeV. But not every event has the same number of leptons. The pT values are stored in a std::vector.

I would like to have a lambda function for this, which can be called when doing the Filter operation, but I would like to avoid a for loop over the size of each vector. Looking at RVec and ROOT::VecOps, is there a better/faster method to do this?

Thank you and best regards!

I cannot run the code right now as many parts are not finished, but one idea that came to my mind now is:

ROOT::RDataFrame X(tree_name, file_name);
X.Define("good_leptons", "lep_pt > 27000.").Filter("ROOT::VecOps::Sum(good_leptons) == lep_pt.size()");

Does this make sense if I want all leptons in an event to have a pT higher than 27 GeV?

Hi @eneb ,

RDF reads collections such as std::vectors as RVecs. The RVec docs are in the ROOT reference guide, on the page "ROOT::VecOps::RVec< T > Class Template Reference". Under “Reference for RVec helper functions” you can find many useful helper functions for these kinds of operations.

You could write, e.g., Filter("All(lep_pt > 27000.)").
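To make the semantics of that selection concrete, here is a minimal sketch in plain standard C++, without ROOT. It is not how RVec or All is implemented; the helper names all_pt_above and count_equals_size are hypothetical and only illustrate what "all leptons pass the cut" means, next to the Sum(mask) == size() idea from the question. One thing the sketch makes visible: an event with no leptons at all passes both checks. As far as I know, All on an empty mask behaves the same way, but that is worth confirming at the ROOT prompt.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of ROOT): true if every lepton pT exceeds the cut.
// std::all_of returns true for an empty range, so events without leptons pass.
bool all_pt_above(const std::vector<float>& lep_pt, float cut_mev) {
    return std::all_of(lep_pt.begin(), lep_pt.end(),
                       [cut_mev](float pt) { return pt > cut_mev; });
}

// Hypothetical helper mirroring the "Sum(mask) == size()" idea:
// count the leptons above the cut and compare with the total number.
bool count_equals_size(const std::vector<float>& lep_pt, float cut_mev) {
    const auto n_pass = std::count_if(lep_pt.begin(), lep_pt.end(),
                                      [cut_mev](float pt) { return pt > cut_mev; });
    return static_cast<std::size_t>(n_pass) == lep_pt.size();
}
```

Both helpers give the same answer for any input, so the two formulations are equivalent as selections. If events without leptons should be rejected, an explicit non-empty requirement would have to be added to the filter.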


Hi @eguiraud ,

thanks a lot! I must have missed it when looking through the reference guides!


Hi again @eguiraud,

does All take just one argument or could I also do

auto df_new = df.Filter("All(lep_pt>27000. && abs(lep_eta)<2.5)");


Hi @eneb ,

you can quickly try these things at the ROOT prompt, you don’t even need an RDF:

~ root -l
root [0] ROOT::RVecD v1{1.,2.,3.}
(ROOT::RVecD &) { 1.0000000, 2.0000000, 3.0000000 }
root [1] ROOT::RVecD v2{4.,5.,6.}
(ROOT::RVecD &) { 4.0000000, 5.0000000, 6.0000000 }
root [2] All(v1 > 1. && v2 < 5)
(bool) false

So that works, but only because that’s still a single argument passed to All: the mask produced by the && operator.
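For readers who want to see what "a single mask" means here, the following is a rough sketch in plain standard C++, without ROOT. The functions combined_mask and all_true are hypothetical stand-ins and not ROOT's implementation: the first builds one boolean per element from the two comparisons, the second reduces that mask to a single bool. The sketch assumes both input vectors have the same length.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the mask built by `v1 > lo && v2 < hi`:
// one boolean per element, true where both conditions hold.
// Assumption: v1 and v2 have the same length.
std::vector<bool> combined_mask(const std::vector<double>& v1, double lo,
                                const std::vector<double>& v2, double hi) {
    std::vector<bool> mask(v1.size());
    for (std::size_t i = 0; i < v1.size(); ++i) {
        mask[i] = (v1[i] > lo) && (v2[i] < hi);
    }
    return mask;
}

// Hypothetical stand-in for All(mask): true only if every element is true.
bool all_true(const std::vector<bool>& mask) {
    return std::all_of(mask.begin(), mask.end(), [](bool b) { return b; });
}
```

With the values from the prompt session, v1 = {1, 2, 3} and v2 = {4, 5, 6}, the mask for v1 > 1 && v2 < 5 is {false, false, false}, so all_true returns false, in line with the (bool) false printed above.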

Thanks for the very quick answer and the hint to use the ROOT prompt. I will do this for further tests!

Thank you very much!


