Bug in RDF::RNode runtime error on variables with similar names


We have noticed a bug in ROOT 6.20/06 that looks very similar to the one reported in "Filter action in RDataFrame: JIT error on variables with similar names". It appears when we multiply columns with similar names (weight*weightFJvt), but in our case on a ROOT::RDF::RNode.

The following snippet fails:

ROOT::RDF::RNode df_with_defines(df_out);
df_with_defines = df_with_defines.Define(varName, var);

where varName = "weight" and var = "HGamEventInfoAuxDyn.weight*HGamEventInfoAuxDyn.weightFJvt".

The error is as follows:

input_line_181:2:45: error: use of undeclared identifier ‘__rdf_arg_HGamEventInfoAuxDyn_weightFJvt’
return __rdf_arg_HGamEventInfoAuxDyn_weight*__rdf_arg_HGamEventInfoAuxDyn_weightFJvt
terminate called after throwing an instance of ‘std::runtime_error’
what(): Cannot interpret the following expression:

Make sure it is valid C++.
Aborted (core dumped)

The code runs with no issue when "var" is either variable individually, e.g. "HGamEventInfoAuxDyn.weightFJvt" or "HGamEventInfoAuxDyn.weight". It also works with no problem when "var" is a combination of other weights whose names do not contain each other, e.g. "HGamEventInfoAuxDyn.crossSectionBRfilterEff*HGamEventInfoAuxDyn.weightFJvt".

Please advise us on how we can get around this problem without changing our variable names. Thank you!

ROOT Version: 6.20/06
Platform: lxplus
Built for linuxx8664gcc
Using asetup 21.2.149,AnalysisBase

I guess @eguiraud can help.

Thanks for reporting the issue. It looks like the problem is that one variable name is contained in the other and both contain a dot, so we need to perform a string substitution when generating the corresponding C++ code. It’s definitely a bug.
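To illustrate the kind of pitfall described above, here is a toy sketch in plain C++ (no ROOT needed). It is not ROOT's actual implementation: the helper names ReplaceAllNaive, ReplaceWholeNames and IsNameChar, as well as the placeholder names arg0 and arg1, are made up for this example. It only shows how a plain substring replacement can mangle an expression when one column name is a prefix of another, and how checking the characters around each match avoids that.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: replaces every occurrence of `from` with `to`,
// without checking whether the match is a complete name.
std::string ReplaceAllNaive(std::string s, const std::string &from, const std::string &to)
{
   if (from.empty())
      return s;
   for (std::size_t pos = s.find(from); pos != std::string::npos; pos = s.find(from, pos + to.size()))
      s.replace(pos, from.size(), to);
   return s;
}

// Characters that can be part of a (dotted) column name in this toy model.
bool IsNameChar(char c)
{
   return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '.';
}

// Hypothetical helper: replaces `from` with `to` only where the match is not
// immediately preceded or followed by another name character.
std::string ReplaceWholeNames(const std::string &s, const std::string &from, const std::string &to)
{
   if (from.empty())
      return s;
   std::string out;
   std::size_t i = 0;
   while (i < s.size()) {
      const std::size_t end = i + from.size();
      const bool matches = s.compare(i, from.size(), from) == 0;
      const bool startOk = (i == 0) || !IsNameChar(s[i - 1]);
      const bool endOk = (end >= s.size()) || !IsNameChar(s[end]);
      if (matches && startOk && endOk) {
         out += to; // complete name: substitute it
         i = end;
      } else {
         out += s[i++]; // otherwise copy the character unchanged
      }
   }
   return out;
}
```

With the expression "obj.a*obj.ab", calling ReplaceAllNaive(expr, "obj.a", "arg0") yields "arg0*arg0b", which no longer refers to obj.ab at all. Calling ReplaceWholeNames first for "obj.a" and then for "obj.ab" (mapped to "arg1") yields "arg0*arg1".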

Using an alias might work around the issue:

df.Alias("FJvt", "HGamEventInfoAuxDyn.weightFJvt")
  .Define("..", "HGamEventInfoAuxDyn.weight*FJvt")

Could you please verify whether the problem is still present in v6.22?



As far as I can tell, things are fine in 6.22. If that’s not the case, please share a minimal reproducer so I can debug the problem. I tried with the repro below and it runs fine:

#include <ROOT/RDataFrame.hxx>
#include <TInterpreter.h>
#include <TFile.h>
#include <TTree.h>
#include <iostream>

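struct A {
   int a;
   int ab;
};

int main() {
   // make the interpreter aware of A so it can be written to and read from a TTree
   gInterpreter->Declare("struct A { int a; int ab; };");

   {
      // write a tree with a single entry; the scope closes the file before it is read back
      TFile f("f.root", "recreate");
      TTree t("t", "t");
      A *a = new A{2, 21};
      t.Branch("obj", &a);
      t.Fill();
      t.Write();
   }

   // prints 42
   std::cout << ROOT::RDataFrame("t", "f.root").Define("x", "obj.a*obj.ab").Max<int>("x").GetValue() << '\n';

   return 0;
}
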
Hi Enrico,

Thank you for the speedy reply. As far as I can tell, ROOT 6.22 is not supported in AnalysisBase yet, so I wasn’t able to test that. However, the workaround you suggested, using df.Alias(), solved the problem for now!

All the best,


