RDataFrame - Creating a new variable which is just the length of a vector branch in pyROOT

Hi

I have a very simple question about RDataFrame. I am trying to understand whether I can extend a simple analysis framework to use RDataFrame, and one issue I’ve encountered is accessing the number of elements in a vector branch. A variable like this is often stored in the original TTree, but it is also implicitly available whenever a vector branch such as jet_pt or lep_pt is stored.

In the C++ ROOT interpreter, I can do something like this

rdf2 = rdf.Define("jet_n","jet_pt.size()")

but in PyROOT the same call fails with this error:

Error in <TBranch::TLeaf>: Illegal data type for jet_n/jet_n/

This suggests that PyROOT is not able to assign a type to this variable. I could not find documentation on enforcing a type.

I have identified that I can do the following in pyROOT successfully:

rdf2 = rdf.Define("jet_n", "ROOT::VecOps::Sum(jet_pt > 0)")

but it feels excessive to apply a cut and sum the passing jets just to count them (all of them will pass anyway). Does anyone have any input on this?

Thanks
Ian

ROOT Version: 6.20/06
Built for linuxx8664gcc on Jun 10 2020, 06:10:57
From tags/v6-20-06@v6-20-06

Hi,
I guess there is a Snapshot downstream of that Define?
That’s a bug that we fixed in recent ROOT versions (I don’t remember whether it was 6.22 or 6.24, but I can check the release notes if you want): TTree did not support branches of type std::size_t, so Snapshot could not write out jet_n.

If possible, I would suggest updating to v6.24 to also benefit from all other bug fixes and performance improvements we introduced since v6.20. If that’s not an option, a workaround is df.Define("jet_n", "int(jet_pt.size())").
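
As a minimal sketch of that workaround in PyROOT: the helper below wraps the Define call so the cast is not forgotten for other vector branches. The name define_vector_size is hypothetical (it is not part of ROOT), and the tree and file names in the usage comments are placeholders.

```python
def define_vector_size(rdf, new_column, vector_branch):
    """Return a new RDataFrame node with new_column = number of elements in vector_branch.

    Hypothetical helper, not part of ROOT. The int(...) cast makes the column an int
    instead of std::size_t, so that Snapshot can write it out on ROOT versions
    affected by the bug described above.
    """
    expression = "int({}.size())".format(vector_branch)
    return rdf.Define(new_column, expression)


# Usage (assumes ROOT is installed; "tree", "input.root" and "output.root" are placeholders):
#   import ROOT
#   rdf = ROOT.RDataFrame("tree", "input.root")
#   rdf2 = define_vector_size(rdf, "jet_n", "jet_pt")
#   rdf2.Snapshot("tree", "output.root")
```

The helper only builds the expression string and forwards it to Define, so it behaves exactly like the one-line workaround above.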

Cheers,
Enrico

Hi Enrico

Yes, sorry, there is a Snapshot later in the code. Thanks for the workaround, and it is good to know that this is fixed!

Cheers,
Ian

Alright! (As long as you don’t Snapshot the column, you can use columns of type std::size_t without issues.)
