Problem calling RDataFrame::Define with ColumnNames_t from Python

I am having trouble using RDataFrame from Python. Here is a simple example:

import ROOT

def fill_tree(treeName, fileName):
    df = ROOT.RDataFrame(50)
    df.Define("b1", "(double) rdfentry_")\
      .Define("b2", "(int) rdfentry_ * rdfentry_").Snapshot(treeName, fileName)

fileName = 'test.root'
treeName = 'myTree'
fill_tree(treeName, fileName)
d = ROOT.RDataFrame(treeName, fileName)

ROOT.gInterpreter.ProcessLine('''
class FoldedBinner{
public:
  FoldedBinner(const TAxis& axis1, const TAxis& axis2)
  : m_axis1(axis1), m_axis2(axis2) { }
  int operator()(double x, double y)
  {
    const auto bin1 = m_axis1.FindBin(x);
    const auto bin2 = m_axis2.FindBin(y);
    return bin1 * (m_axis1.GetNbins() + 1) + bin2;
  }
private:
  TAxis m_axis1, m_axis2;
};

FoldedBinner binner_cpp(TAxis(100, 0, 100), TAxis(100, 0, 100));
''')

binner = ROOT.FoldedBinner(ROOT.TAxis(100, 0, 100), ROOT.TAxis(100, 0, 100))

d.Define("folded_cpp", "binner_cpp(b1, b2)")
d.Define("folded", binner, ('b1', 'b2'))

I don’t like the solution in the second-to-last line, since I need to define the TAxis objects on the C++ side, which is tedious, and I need to pass floating-point values as strings (I have irregular binning).
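For cross-checking the folded index with irregular binning, here is a minimal pure-Python sketch of the arithmetic the FoldedBinner above performs. It assumes the usual TAxis::FindBin convention (0 for underflow, 1..N for regular bins, N+1 for overflow) and reuses the index formula from the post as-is. The helper names `find_bin` and `folded_bin` and the bin edges are made up for illustration; none of them are part of ROOT.

```python
from bisect import bisect_right


def find_bin(edges, x):
    """Emulate the TAxis::FindBin convention for sorted bin edges.

    Returns 0 for underflow, 1..N for the regular bins and N+1 for
    overflow, where N = len(edges) - 1. Bins are [low, up).
    """
    # bisect_right counts how many edges are <= x, which is exactly
    # the 1-based bin number under this convention.
    return bisect_right(edges, x)


def folded_bin(edges1, edges2, x, y):
    """Fold two bin numbers into one integer, as FoldedBinner does."""
    nbins1 = len(edges1) - 1
    # Same formula as in the post: bin1 * (nbins1 + 1) + bin2
    return find_bin(edges1, x) * (nbins1 + 1) + find_bin(edges2, y)


# Hypothetical irregular bin edges, only for illustration
edges1 = [0.0, 1.0, 2.5, 10.0]  # 3 bins
edges2 = [0.0, 5.0, 20.0]       # 2 bins

print(folded_bin(edges1, edges2, 1.7, 6.0))
```

This does not replace the Define call; it only makes it easier to check by hand what the functor should return for a given pair of values.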

I would like to use a lambda with a capture instead of a functor, but ProcessLine fails.

The last line fails:

TypeError: can not resolve method template call for ‘Define’



ROOT Version: 6.18/04
Platform: Fedora
Compiler: gcc 9.2.1


Hi @wiso,
this problem is fixed in the new PyROOT (formerly “experimental PyROOT”): current master and the upcoming v6.22 happily run your reproducer without errors.

In v6.18 you need the following workaround:

cols = ROOT.std.vector('std::string')()
cols.push_back('b1')
cols.push_back('b2')
d.Define("folded", binner, cols)
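If the workaround above is needed in several places, it could be wrapped in a small helper. This is only a sketch: `to_column_names` and its `make_vector` parameter are hypothetical names, not part of ROOT, and the only ROOT call used is the `ROOT.std.vector('std::string')` constructor shown above.

```python
def to_column_names(names, make_vector=None):
    """Copy an iterable of Python strings into a std::vector<std::string>.

    make_vector is a hypothetical hook: any zero-argument callable that
    returns an object with a push_back method. By default it is the
    std::vector<std::string> type provided by PyROOT.
    """
    if make_vector is None:
        # Imported lazily so that the helper can be defined without ROOT
        import ROOT
        make_vector = ROOT.std.vector('std::string')
    cols = make_vector()
    for name in names:
        cols.push_back(name)
    return cols
```

With this helper, the failing call from the question would become d.Define("folded", binner, to_column_names(('b1', 'b2'))).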

I think the relevant jira ticket is https://sft.its.cern.ch/jira/browse/ROOT-10547

Cheers,
Enrico

Perfect. From the reference it was not clear that ColumnNames_t can be built from a std::vector<std::string>.

ColumnNames_t is just an alias for std::vector<std::string>, but I see now that, due to too many levels of indirection, Doxygen completely hides this fact. I’ll see what can be done about this.