Problem calling RDataFrame::Define with ColumnNames_t from Python

I am having trouble using RDataFrame from Python. Here is a simple example:

import ROOT

def fill_tree(treeName, fileName):
    df = ROOT.RDataFrame(50)
    df.Define("b1", "(double) rdfentry_")\
      .Define("b2", "(int) rdfentry_ * rdfentry_").Snapshot(treeName, fileName)

fileName = 'test.root'
treeName = 'myTree'
fill_tree(treeName, fileName)
d = ROOT.RDataFrame(treeName, fileName)

ROOT.gInterpreter.ProcessLine('''
class FoldedBinner{
public:
  FoldedBinner(const TAxis& axis1, const TAxis& axis2)
  : m_axis1(axis1), m_axis2(axis2) { }
  int operator()(double x, double y)
  {
    const auto bin1 = m_axis1.FindBin(x);
    const auto bin2 = m_axis2.FindBin(y);
    return bin1 * (m_axis1.GetNbins() + 1) + bin2;
  }
private:
  TAxis m_axis1, m_axis2;
};

FoldedBinner binner_cpp(TAxis(100, 0, 100), TAxis(100, 0, 100));
''')

binner = ROOT.FoldedBinner(ROOT.TAxis(100, 0, 100), ROOT.TAxis(100, 0, 100))

d.Define("folded_cpp", "binner_cpp(b1, b2)")
d.Define("folded", binner, ('b1', 'b2'))

I don’t like the solution in the second-to-last line, since I need to define the TAxis objects on the C++ side, which is tedious, and I need to pass floating-point values as strings (I have irregular binning).
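For reference, here is a pure-Python sketch of the index that FoldedBinner computes, which may help when checking results with irregular bin edges. It assumes TAxis::FindBin returns 0 for underflow, nbins + 1 for overflow, and a 1-based bin number otherwise (lower edge inclusive). The names `find_bin` and `folded_bin` and the example edges are made up for illustration; they are not part of ROOT.

```python
from bisect import bisect_right

def find_bin(edges, x):
    """Mimic the assumed TAxis::FindBin convention for sorted bin edges.

    Returns 0 for underflow, len(edges) (= nbins + 1) for overflow,
    and the 1-based bin number otherwise.
    """
    return bisect_right(edges, x)

def folded_bin(edges1, edges2, x, y):
    """Same arithmetic as FoldedBinner::operator() in the reproducer."""
    nbins1 = len(edges1) - 1
    return find_bin(edges1, x) * (nbins1 + 1) + find_bin(edges2, y)

# Hypothetical irregular binning, used only for illustration.
edges = [0.0, 1.5, 4.0, 10.0]
print(folded_bin(edges, edges, 2.0, 0.5))
```

This only mirrors the arithmetic of the functor; it does not replace the Define call on the data frame.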

I would like to use a lambda with a capture instead of a functor, but ProcessLine fails.

The last line fails:

TypeError: can not resolve method template call for ‘Define’



ROOT Version: 6.18/04
Platform: Fedora
Compiler: gcc 9.2.1


Hi @wiso,
This problem is fixed in the new PyROOT (formerly “experimental PyROOT”): the current master and the upcoming v6.22 run your reproducer without errors.

In v6.18 you need the following workaround:

cols = ROOT.std.vector('std::string')()
cols.push_back('b1')
cols.push_back('b2')
d.Define("folded", binner, cols)
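If the workaround above is needed in several places, it could be wrapped in a small helper. This is only a sketch: `to_column_names` and its `make_vector` parameter are hypothetical names, not part of ROOT. By default the helper builds a `ROOT.std.vector('std::string')` exactly as above; the `make_vector` hook exists only so the function can be exercised without ROOT installed.

```python
def to_column_names(names, make_vector=None):
    """Copy a Python sequence of column names into a std::vector<std::string>.

    make_vector is a hypothetical hook: any callable returning an object
    with a push_back method. If omitted, the ROOT vector type is used.
    """
    if make_vector is None:
        import ROOT  # imported lazily, so the helper itself has no hard dependency
        make_vector = ROOT.std.vector('std::string')
    cols = make_vector()
    for name in names:
        cols.push_back(name)
    return cols

# Intended usage with the reproducer (requires ROOT):
# d.Define("folded", binner, to_column_names(['b1', 'b2']))
```

With the new PyROOT mentioned above, the plain tuple from the original reproducer should work directly and the helper would not be needed.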

I think the relevant Jira ticket is https://sft.its.cern.ch/jira/browse/ROOT-10547

Cheers,
Enrico

Perfect. It was not clear from the reference that a ColumnNames_t can be built from a std::vector<std::string>.

ColumnNames_t is just an alias for std::vector<std::string>, but I see now that, due to too many levels of indirection, Doxygen completely hides this fact. I’ll see what can be done about this.
