PyROOT & RDataFrame: Filter-function does not return bool

Hi,

I tried using RDataFrame in PyROOT using ROOT 6.14.04.
Running on lxplus using: lsetup "root 6.14"

My code:

import ROOT

datatree = ROOT.TChain("trees_SRDV_")
datatree.Add("path/trees.root")

cut_volume = "DV_rxy<=300 && DV_rxy>10"
cut_quality = "DV_m<2.5 && DV_chisqPerDoF<3"
cut_kShort = "(DV_m>0.52 || DV_m<0.47 || DV_nTracks>2)"

rdf = ROOT.RDataFrame(datatree)
rdf2 = rdf.Filter(cut_volume,"Fiducial Volume").Filter(cut_quality,"Quality").Filter(cut_kShort, "K_short")

rep = rdf.Report()
rep.Print()

When I run this, the code crashes on the line “rdf.Report()”. However, the issue lies in the Filter() calls that define rdf2, as it runs fine if that line is commented out. Note that the report is taken from the unfiltered rdf. Here is the output from running the code:

In file included from /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.14.04/src/ROOT-6.14.04-build/input_line_12:21:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../etc/dictpch/allHeaders.h:688:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/TTreeAsFlatMatrix.h:17:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDataFrame.hxx:26:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/include/ROOT/RDFInterface.hxx:32:
/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFInterfaceUtils.hxx:210:4: error: static_assert failed "filter functions must return a bool"
   static_assert(std::is_same<FilterRet_t, bool>::value, "filter functions must return a bool");
   ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFInterfaceUtils.hxx:296:17: note: in instantiation of function template specialization 'ROOT::Internal::RDF::CheckFilter<(lambda at input_line_84:2:39)>' requested
      here
   RDFInternal::CheckFilter(f);
                ^
input_line_84:2:23: note: in instantiation of function template specialization 'ROOT::Internal::RDF::JitFilterHelper<(lambda at input_line_84:2:39),
      ROOT::Detail::RDF::RLoopManager>' requested here
 ROOT::Internal::RDF::JitFilterHelper([](ROOT::VecOps::RVec<double>& DV_rxy){return DV_rxy<=300 && DV_rxy>10
                      ^
In file included from /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/projects/ROOT-6.14.04/src/ROOT-6.14.04-build/input_line_12:21:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../etc/dictpch/allHeaders.h:688:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/TTreeAsFlatMatrix.h:17:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDataFrame.hxx:26:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/include/ROOT/RDFInterface.hxx:32:
In file included from /cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFInterfaceUtils.hxx:18:
/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFNodes.hxx:673:14: error: no viable conversion from returned value of type 'ROOT::VecOps::RVec<int>' to function return type 'bool'
      return fFilter(std::get<S>(fValues[slot]).Get(entry)...);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFNodes.hxx:661:27: note: in instantiation of function template specialization 'ROOT::Detail::RDF::RFilter<(lambda at input_line_84:2:39),
      ROOT::Detail::RDF::RLoopManager>::CheckFilterHelper<0>' requested here
            auto passed = CheckFilterHelper(slot, entry, TypeInd_t());
                          ^
/cvmfs/sft.cern.ch/lcg/releases/ROOT/6.14.04-0d8dc/x86_64-slc6-gcc62-opt/etc/../include/ROOT/RDFNodes.hxx:644:4: note: in instantiation of member function 'ROOT::Detail::RDF::RFilter<(lambda at input_line_84:2:39),
      ROOT::Detail::RDF::RLoopManager>::CheckFilters' requested here
   RFilter(FilterF &&f, const ColumnNames_t &bl, PrevDataFrame &pd, std::string_view name = "")

Just running datatree.Draw("DV_rxy", cut_x) produces the expected result.

Cheers
Filip

Hello,
we are stricter than TTree::Draw: the expression that you pass to a Filter must evaluate to a bool (a quantity that is merely convertible to bool is not enough). I guess that one among “Fiducial Volume”, “Quality” or “K_short” is not a bool (but you can get a bool by filtering with "Quality > 0" or similar).

Also, "Fiducial Volume" has a space in the column name – I’m not sure how well that is supported. You might want to do an Alias("fid_volume", "Fiducial Volume") and then use "fid_volume" in those kinds of expressions, or do Filter([](double fid_volume) { return fid_volume > 0; }, {"Fiducial Volume"}), where it’s clear that "Fiducial Volume" is a single column name.

Hope this helps!
Cheers,
Enrico

Hi,
This has now been solved. The issue was that the branches used in the cuts are actually vectors. TTree.Draw() handles this by some magic, but here I have replaced the cut_volume, cut_quality and cut_kShort strings with function bodies that loop over the vectors and perform the test separately for each index. The code works with the following snippet for the cuts instead.

cut_volume = "for (auto x : DV_rxy){ if (x<=300 && x>10) return true; } return false;"
cut_quality = "for (int i=0;i<DV_n;i++){ if (DV_m[i]<2.5 && DV_chisqPerDoF[i]<3) return true;} return false; "
cut_kShort = "for (int i=0;i<DV_n;i++){ if(DV_m[i]>0.52 || DV_m[i]<0.47 || DV_nTracks[i]>2) return true; } return false; "
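
For readers who want to check the logic of these three cuts without a ROOT installation, here is a minimal plain-Python analogy. It is only a sketch: the function names and the example event are invented for illustration and are not part of RDataFrame. Each function receives the per-event list of values (one entry per vertex) and returns a single bool, which is what Filter requires.

```python
# Plain-Python analogy of the three loop-based cuts (no ROOT required).
# Each argument is the per-event list of values, one entry per vertex.

def passes_volume(dv_rxy):
    """True if at least one vertex satisfies 10 < rxy <= 300."""
    return any(10 < x <= 300 for x in dv_rxy)

def passes_quality(dv_m, dv_chisq_per_dof):
    """True if at least one vertex passes both quality conditions."""
    return any(m < 2.5 and c < 3 for m, c in zip(dv_m, dv_chisq_per_dof))

def passes_kshort(dv_m, dv_ntracks):
    """True if at least one vertex is outside the mass window or has > 2 tracks."""
    return any(m > 0.52 or m < 0.47 or n > 2 for m, n in zip(dv_m, dv_ntracks))

# Hypothetical event with two vertices (values invented for illustration).
event = {
    "DV_rxy": [5.0, 120.0],
    "DV_m": [0.50, 1.8],
    "DV_chisqPerDoF": [4.0, 1.2],
    "DV_nTracks": [2, 3],
}

keep = (
    passes_volume(event["DV_rxy"])
    and passes_quality(event["DV_m"], event["DV_chisqPerDoF"])
    and passes_kshort(event["DV_m"], event["DV_nTracks"])
)
print(keep)
```

One thing the analogy makes visible: each cut keeps the event if any vertex passes it, so the three cuts may be satisfied by different vertices of the same event. An event with no vertices fails every cut.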

Thank you!
/Filip

Hi Filip,

great to hear that this is solved.
I’d like to give a heads-up about what will be available in ROOT 6.16 (due in November).
The collections you are treating are handled by RDataFrame as RVec instances: some “magic” is there. For example you can use the operators “<=”, “<” and co.
In ROOT 6.16 the conversion to individual bools will be very easy thanks to additional helpers such as All:

bool myResult = All(DV_rxy > 0);

(and this will be vectorised internally without the user needing to do anything).

Now, if you want, already in 6.14, the release you are using, you can simplify the code a bit. For example like this (I take your “cut_volume” as an example):

.Filter("! DV_rxy[DV_rxy<=300 && DV_rxy>10].empty()")...

Here you can find a nice tutorial about that syntax (https://root.cern.ch/doc/v614/vo003__LogicalOperations_8C.html). And again, the system will vectorise the operations for you whenever possible (i.e. faster code without you doing anything!)
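
To make the mask syntax concrete, here is a rough plain-Python analogy of what the expression computes for one event. It is a sketch only: Python lists stand in for RVec, and the variable names and values are invented for illustration. The element-wise comparison yields one flag per vertex (which is why the original cut string did not produce a bool), and the final step reduces those flags to a single bool.

```python
# Plain-Python analogy of RVec masking for a single event (no ROOT required).
dv_rxy = [5.0, 120.0, 450.0]  # hypothetical per-vertex values

# Element-wise comparison: one flag per vertex, not a single bool.
in_volume = [10 < x <= 300 for x in dv_rxy]

# Indexing with the mask keeps only the elements whose flag is True,
# analogous to DV_rxy[DV_rxy<=300 && DV_rxy>10].
selected = [x for x, flag in zip(dv_rxy, in_volume) if flag]

# "not empty" means at least one vertex passed, analogous to !(...).empty().
any_pass = len(selected) > 0

# An All-style reduction is stricter: every vertex must pass.
all_pass = all(in_volume)

print(in_volume)  # [False, True, False]
print(selected)   # [120.0]
print(any_pass)   # True
print(all_pass)   # False
```

Note that the non-empty check matches the "at least one vertex passes" logic of the loop-based cuts, whereas an All-style reduction requires every vertex to pass, so the two are not interchangeable.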

I hope this helps.

Cheers,
D

PS
I think your questions qualify for the regular forum, they are not simple at all :)

@filbab do you have objections against moving this thread to the ROOT section? There is some good content here…

Thank you very much for the response! And please feel free to move the thread to the ROOT section.
/Filip
