RDataFrame: Accessing Foreach in PyROOT


ROOT Version: 409f3e1198 (current master)
Platform: CC7
Compiler: gcc4.8


Hi, this is related to the post:

I tried out the experimental PyROOT, where Foreach appears, but could not get it to run.

This is what I ran:

import ROOT
# Enable multi-threading
ROOT.ROOT.EnableImplicitMT()

header_path = "example_helper.h"

ROOT.gInterpreter.Declare('''
#include "{}"
'''.format(header_path))

# Create dataframe from NanoAOD files
files = ROOT.std.vector("string")(1)
files[0] = "root://eospublic.cern.ch//eos/root-eos/cms_opendata_2012_nanoaod/Run2012B_DoubleMuParked.root"
df = ROOT.ROOT.RDataFrame("Events", files)

for col in df.GetColumnNames():
    print(col, df.GetColumnType(col))

col="Muon_pt"
mu_col=df.Take[ROOT.VecOps.RVec[float]](col).GetValue()
print(type(mu_col), type(mu_col[0]))
df.Foreach(col,ROOT.print_rvec[float](mu_col[0]) )

with this as example_helper.h:

#include "ROOT/RVec.hxx"

using namespace ROOT::VecOps;

template <typename T>
void print_rvec(const RVec<T>& v1){
  const size_t size = v1.size();
  for (size_t i = 0; i < size; i++) {
    std::cout<<v1[i]<<std::endl;
  }
  return;
}

And the output is:

run Int_t
luminosityBlock UInt_t
.....
Muon_pt ROOT::VecOps::RVec<Float_t>
....
Electron_dzErr ROOT::VecOps::RVec<Float_t>
<class cppyy.gbl.std.vector<ROOT::VecOps::RVec<float> > at 0x5598412eb110> <class cppyy.gbl.ROOT.VecOps.RVec<float> at 0x5597fac34440>
17.8546
8.47252
6.28775
Traceback (most recent call last):
  File "example.py", line 22, in <module>
    df.Foreach(col,ROOT.print_rvec[float](mu_col[0]) )
TypeError: Template method resolution failed:
  Failed to instantiate "Foreach(std::string,NoneType)"

So it triggers the print_rvec function once, but I assume that when Muon_pt is empty it cannot cope with the null pointer being passed.

Is there a way to catch this?

Best,
Klaas

I guess @etejedor and @eguiraud can help you.

Hi,
currently Foreach is not supported in PyROOT, as far as I know. The relevant issue is https://sft.its.cern.ch/jira/browse/ROOT-10246; you can comment there to let us know that it’s an important feature for you.

What’s happening in df.Foreach(col,ROOT.print_rvec[float](mu_col[0]) ) is that you are calling print_rvec right there and then, passing mu_col[0] as argument, and that’s why you see it printing. Then the result of that call to print_rvec is passed to df.Foreach, triggering the error you see.
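
The same evaluation order can be reproduced in plain Python without ROOT. The following is only a minimal sketch: print_values and apply_to_each are hypothetical stand-ins for print_rvec and df.Foreach, and the sample rows are made up for illustration. It shows that writing f(x) in an argument list runs f immediately and passes its return value (None here), whereas passing f itself defers the calls.

```python
# Plain-Python sketch (no ROOT required). print_values and apply_to_each are
# hypothetical stand-ins for print_rvec and df.Foreach, not ROOT APIs.

def print_values(values):
    """Print each value; like the C++ void function, it returns nothing (None)."""
    for v in values:
        print(v)


def apply_to_each(func, rows):
    """Hypothetical stand-in for Foreach: call func once per row."""
    if not callable(func):
        raise TypeError("expected a callable, got {}".format(type(func).__name__))
    for row in rows:
        func(row)


# Made-up sample data; the second row is empty on purpose.
rows = [[17.8546, 8.47252, 6.28775], [], [42.0]]

# Eager call: print_values runs once, right here, and its result is None.
result = print_values(rows[0])

# Passing that result instead of the function fails, as in the traceback above.
try:
    apply_to_each(result, rows)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Passing the function object itself defers the calls to apply_to_each.
apply_to_each(print_values, rows)
```

In this sketch the empty row is handled without any problem, which suggests that the failure has nothing to do with empty collections: the error arises before any per-row processing starts.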

Cheers,
Enrico
