I am applying some filters and defining some variables. My question is whether the order matters for the result.
For example:
Is rdf.Filter("good_pt.size() > 100", "event_cut").Define("good_pt", "pt[cuts]") the same as rdf.Define("good_pt", "pt[cuts]").Filter("good_pt.size() > 100", "event_cut")?
Hi @imanol097,
since the Filter uses the good_pt column, you have to Define it before the Filter. Otherwise, with a recent enough ROOT version, you will get an “unknown column” error.
I am not sure I understand the question. Filter does not filter variables, it filters events. After the Filter call, only events that satisfy good_pt.size() > 100 will be processed.
good_pt will be calculated for every event, as it’s needed by the Filter. Then the Filter will be evaluated: if good_pt.size() > 100, whatever you put after the Filter will execute for that event; otherwise it will not.
EDIT: in more detail, Defines are evaluated at most once per event, and only if something needs that value (in this case, the Filter).
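To make that evaluation order concrete, here is a toy model in plain Python. It is not ROOT code and does not show how RDataFrame is implemented; run_toy_chain, pt_cut and min_size are made-up names for illustration only. It just mimics the behaviour described above: the defined value is computed lazily, cached for the current event, and anything after the filter only runs for events that pass.

```python
def run_toy_chain(events, pt_cut, min_size):
    """Toy stand-in for Define("good_pt", ...).Filter(size > min_size).<action>.

    events is a list of per-event pt lists. Returns the per-event results of
    the downstream step and the number of times good_pt was actually computed.
    """
    n_define_calls = 0
    downstream_results = []

    for pts in events:
        cache = {}  # per-event cache: a defined value is computed at most once

        def good_pt():
            nonlocal n_define_calls
            if "good_pt" not in cache:
                n_define_calls += 1
                cache["good_pt"] = [p for p in pts if p > pt_cut]
            return cache["good_pt"]

        # "Filter": needs good_pt, so this triggers its evaluation.
        if not len(good_pt()) > min_size:
            continue  # event rejected, nothing after the filter runs

        # Downstream step: reuses the cached good_pt, no second evaluation.
        downstream_results.append(sum(good_pt()))

    return downstream_results, n_define_calls


if __name__ == "__main__":
    toy_events = [[25, 30, 5], [10, 12], [40, 50, 60]]
    results, calls = run_toy_chain(toy_events, pt_cut=20, min_size=1)
    print(results)  # [55, 150]
    print(calls)    # 3
```

In this toy, good_pt is computed three times for three events (once each), even though the two passing events use it twice, and the second event never reaches the downstream step.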
I’m probably missing something: leaving aside Define and Filter for a second, how can you evaluate good_pt.size() > 100 without first evaluating good_pt? Seems impossible, independently of RDataFrame.