I am moving from root_numpy to RDataFrame and I am having some difficulties with TTree filtering.
I have a lot of variables in my trees; some of them are event variables and others are track variables, and I need to filter on both kinds at the same time.
My problem is filtering tracks within an event.
How can I filter on track variables, such as eta or pt, easily in PyROOT? For example, selecting tracks with pT > 700 MeV?
Currently I am building C++ filter strings like:
"for (auto x :" + str(variable) + "){ if (x<=" + str(max) + "&& x>" + str(min) + ") return true; } return false;"
But this is not as readable as I would like. Is there any other solution?
ROOT Version: 6.24
Platform: Not Provided
Compiler: Not Provided
Welcome to the ROOT Forum! @eguiraud, our RDataFrame expert, is currently on vacation, but maybe @etejedor has an idea, since it's PyROOT…
Hello,
One thing you can do, if expressing everything in a string in Filter is less readable, is to define a function beforehand:
ROOT.gInterpreter.Declare("""
bool my_filter_function(branch_type1 branch_name1, ...) {
// your code here
}
""")
and then, in Filter, you do:
rdf.Filter("my_filter_function(branch_name1, ...)")
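As a minimal sketch of this approach applied to the track selection in the question: the function name has_track_in_range, the branch name Track_pt and the cut values are hypothetical, and the helper is written as a template so that it does not assume a particular element type for the branch. ROOT is imported inside the function only so that the expression builder can be used on its own.

```python
# C++ helper to be declared once to the ROOT interpreter.
# has_track_in_range is a hypothetical name; the include guard avoids a
# redefinition error if the declaration is executed more than once.
CPP_HELPER = """
#ifndef HAS_TRACK_IN_RANGE_DECLARED
#define HAS_TRACK_IN_RANGE_DECLARED
#include "ROOT/RVec.hxx"
template <typename T>
bool has_track_in_range(const ROOT::RVec<T> &values, double lo, double hi) {
    // true if at least one track satisfies lo < value <= hi
    for (auto x : values) {
        if (x > lo && x <= hi) return true;
    }
    return false;
}
#endif
"""


def filter_expression(column, lo, hi):
    """Build the short C++ call that is passed to Filter."""
    return f"has_track_in_range({column}, {lo}, {hi})"


def select_events(rdf, column="Track_pt", lo=700.0, hi=5000.0):
    """Declare the helper and keep events with at least one track in range."""
    import ROOT  # imported here so the rest of the module loads without ROOT
    ROOT.gInterpreter.Declare(CPP_HELPER)
    return rdf.Filter(filter_expression(column, lo, hi))
```

The string passed to Filter is then only a short function call, and the loop lives in one place where it can be read and tested as ordinary C++.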
Alternatively, you can also put your C++ functions in a compiled library and load it, as explained here:
Cheers,
Enric
Hi,
You can also use RVec in the JITted C++ (see the example in the documentation). I think Track_pt > 700. && Track_pt < X constructs a mask RVec with one true/false entry per track (ROOT represents it as an RVec<int>), which you can use to index any other Track_XYZ. For how to scale this to a large number of branches, the discussion "Is there better way to filter array branches than defining new columns in RDataFrame?" is a nice overview of the possibilities and limitations.
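A minimal PyROOT sketch of the mask idea, assuming an existing RDataFrame object and hypothetical branch names (Track_pt, Track_eta, Track_phi). The column name track_mask and the _sel suffix are made up for the illustration; only Define is called on the dataframe.

```python
def masked_columns(mask_name, columns, suffix="_sel"):
    """Map each new column name to a C++ expression indexing it with the mask."""
    return {f"{col}{suffix}": f"{col}[{mask_name}]" for col in columns}


def define_selected_tracks(rdf, mask_expr, columns, mask_name="track_mask"):
    """Define the per-track mask once, then one filtered column per branch."""
    rdf = rdf.Define(mask_name, mask_expr)
    for new_name, expr in masked_columns(mask_name, columns).items():
        rdf = rdf.Define(new_name, expr)
    return rdf


# Usage, with a real RDataFrame named rdf (hypothetical branch names):
# rdf = define_selected_tracks(rdf, "Track_pt > 700.", ["Track_eta", "Track_phi"])
# rdf = rdf.Filter("Any(track_mask)")  # keep events with at least one selected track
```

Defining the mask as its own column means the cut is written once and reused for every track branch, at the cost of one extra Define per branch.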
Cheers,
Pieter
@pieterdavid, could you please give an example of the PyROOT syntax?
I've tried it, but it is not working.
I have seen some examples where syntax like .Filter("eta>2") works. How is this possible?
If you are using a string to define the cut that you pass as an argument to Filter, there is no difference between Python and C++ (the expression in the string is just-in-time compiled as C++ code in both cases).
What example are you referring to?
Filter("eta>0") just works if eta is a scalar branch. If eta is an array/vector branch, eta>0 is evaluated element-wise and yields a mask RVec rather than a single bool, as explained by @pieterdavid, so it has to be reduced to one value per event first: Filter("All(eta>0)") keeps only the events in which every eta is > 0, while Filter("Any(eta>0)") keeps the events with at least one eta > 0. Here's a tutorial that shows the logical operations you can do:
https://root.cern/doc/master/vo003__LogicalOperations_8C.html
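To tie this back to the original question, here is a small sketch of how the hand-written loop string could be replaced by readable helpers that build the RVec expression. The helper names are hypothetical, and lo/hi are used instead of min/max to avoid shadowing the Python builtins. Any(...) corresponds to the loop in the question (true as soon as one track is in range).

```python
def range_mask(column, lo, hi):
    """C++ expression for a per-track mask: lo < column <= hi."""
    return f"({column} > {lo} && {column} <= {hi})"


def any_track(mask_expr):
    """Event passes if at least one track satisfies the mask."""
    return f"Any({mask_expr})"


def all_tracks(mask_expr):
    """Event passes only if every track satisfies the mask."""
    return f"All({mask_expr})"


# Usage, with a real RDataFrame named rdf and a hypothetical Track_pt branch:
# rdf = rdf.Filter(any_track(range_mask("Track_pt", 700, 5000)))
```

The helpers only assemble strings, so the cut itself is still JIT-compiled by RDataFrame exactly as if it had been typed by hand.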