RDataFrame, lambda capture by value, and "static" behavior



ROOT Version: 6.30.06
Platform: macosxarm64
Compiler: Apple clang version 15.0.0


The following snippet is converted from the RDataFrame Crash Course (section Defining Custom Columns)

void tmp(){
  ROOT::RDataFrame d(10); // an RDF that will generate 10 entries (currently empty)
  int x = -1;
  auto d_with_columns = d.Define("x", [=]()mutable->int { return ++x; })
                         .Define("xx", [=]()mutable->int { return x*x; });
  // d_with_columns.Snapshot("myNewTree", "newfile.root");
  d_with_columns.Display()->Print();
  std::cout << "Original x is " << x << std::endl;
}

The result is, somewhat to my surprise,

root [0]
Processing tmp.C...
+-----+---+----+
| Row | x | xx |
+-----+---+----+
| 0   | 0 | 1  |
+-----+---+----+
| 1   | 1 | 1  |
+-----+---+----+
| 2   | 2 | 1  |
+-----+---+----+
| 3   | 3 | 1  |
+-----+---+----+
| 4   | 4 | 1  |
+-----+---+----+
Original x is -1

I had imagined that capturing by value would make the first column "x" all 0s.
So my guess is the captured value is static across function calls as we iterate over the rows?
But then if the captured variable is static, how come the second column isn’t affected?

My question is:
When should I expect this “static” behavior? I’d like to avoid surprises.

Many thanks for your time!

To (partially) answer my own question, it seems that the problem originates from my lack of understanding when it comes to C++ lambdas.

  int x = -1;
  auto ppx = [x]()mutable->int{return ++x;};
  std::cout << ppx() << "\n";
  std::cout << ppx() << "\n";
  std::cout << ppx() << "\n";

The above results in

root [3] .x tmp.C
0
1
2

for some reason. It appears that the capture clause does not behave like a function parameter when it comes to capturing by value.
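One way to picture what is going on, as a rough sketch: a by-value capture behaves like a data member of the function object the compiler builds for the lambda. The copy is made once, when the lambda is created, and `mutable` lets each call modify that member. The struct name `Counter` and the helper `call_three_times` below are made up for illustration; real closure types are unnamed.

```cpp
#include <vector>

// Hand-written stand-in for the closure type of
//   [x]() mutable -> int { return ++x; }
// (hypothetical name; the compiler-generated type has no name).
struct Counter {
  int x;                            // the by-value capture: copied once, at creation
  int operator()() { return ++x; }  // "mutable" means operator() is not const
};

// Calls the lambda and the struct three times each, then records the outer x.
std::vector<int> call_three_times() {
  int x = -1;
  auto ppx = [x]() mutable -> int { return ++x; };
  Counter c{x};

  std::vector<int> out;
  for (int i = 0; i < 3; ++i) out.push_back(ppx());  // 0, 1, 2
  for (int i = 0; i < 3; ++i) out.push_back(c());    // 0, 1, 2
  out.push_back(x);                                  // the outer x is still -1
  return out;
}
```

So the state is not static in the C++ sense: it lives in each closure object, and a fresh lambda (or a copy of one) starts from its own value.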

Hi @unmovingcastle !

The two lambdas you use each have their own copy of x, captured by value from the x in the outer scope.
The squaring lambda doesn't know about the copy of x that is incremented inside the first one.
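A small plain C++ sketch of the same situation, without RDataFrame, may make this easier to see. The loop only stands in for the event loop, and the names `make_x`, `make_xx` and `five_rows` are placeholders I chose for this example.

```cpp
#include <utility>
#include <vector>

// Mimics the two Define lambdas: each [=] capture copies the outer x separately.
// Returns the (x, xx) values that five successive "rows" would see.
std::vector<std::pair<int, int>> five_rows() {
  int x = -1;
  auto make_x  = [=]() mutable -> int { return ++x; };    // own copy, incremented on every call
  auto make_xx = [=]() mutable -> int { return x * x; };  // separate copy, stays at -1

  std::vector<std::pair<int, int>> rows;
  for (int i = 0; i < 5; ++i) {
    int a = make_x();   // 0, 1, 2, 3, 4
    int b = make_xx();  // always (-1) * (-1) = 1
    rows.emplace_back(a, b);
  }
  return rows;
}
```

This reproduces the table from the original post: the first value counts up while the second stays at 1.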

The tutorial you are following shows how to define new columns based on existing ones.
In your case it would be

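auto d_with_x = d.Define("x", [x]() mutable -> int { return ++x; });
auto square = [](int x) { return x*x; }; // "x" is an int column here, so the parameter is int
auto d_with_xx = d_with_x.Define("xx", square, {"x"});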
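To illustrate the idea in plain C++ (this is only an emulation, not RDataFrame code): the squaring function captures nothing and instead receives the value of "x" for the current row as an argument, which is roughly what passing {"x"} as the column list asks for. The helper name `five_rows_fixed` is hypothetical.

```cpp
#include <utility>
#include <vector>

// Emulates a column "xx" that depends on column "x" instead of on a capture.
std::vector<std::pair<int, int>> five_rows_fixed() {
  int x = -1;
  auto make_x = [x]() mutable -> int { return ++x; };  // per-row counter, as before
  auto square = [](int x) { return x * x; };           // no capture: x is an argument

  std::vector<std::pair<int, int>> rows;
  for (int i = 0; i < 5; ++i) {
    int xv = make_x();                  // value of "x" for this row
    rows.emplace_back(xv, square(xv));  // "xx" is computed from that same value
  }
  return rows;
}
```

Here the second value follows the first: 0, 1, 4, 9, 16.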

Also, you can rewrite the first lambda to make its internal counter clearly visible:

[x=-1]()mutable->int { return ++x; }