Put lambdas of different signatures in a vector for RDataFrame

Dear experts,

The post "Running RDataFrame's Define In For Loop" gives an example of putting expressions-as-strings in a vector and then calling Define on the dataframe for each expression in a loop.

I’m wondering if I can do something like this:

auto lambdas = vector<F>{[](double x) { return x; }, [](double x, double y) { return x+y; }};
auto names = vector<string>{"a", "b"};
auto params = vector<vector<string>> {{"x"}, {"x", "y"}};

auto initDf = RDataFrame("tree", "ntp.root");

auto applyDefines(RNode df, vector<F> lambdas, vector<string> names, vector<vector<string>> params, size_t i = 0) {
  if (i == names.size()) return df;

  return applyDefines(df.Define(names[i], lambdas[i], params[i]), lambdas, names, params, i + 1);
}

applyDefines(initDf, lambdas, names, params);

Also, in the doc ROOT: ROOT::RDF::RInterface< Proxied, DataSource > Class Template Reference, the function pointer is of type F, but I think the doc doesn’t link the definition of F properly, as it links to a macro in a md5_* file. Could you provide some hint on the implementation of F?

Thanks.


ROOT Version: 6.24.2
Platform: Linux
Compiler: 10.3.0


Hi @yipengsun ,
In C++ each lambda has its own distinct type, known only to the compiler, so you cannot create a std::vector<lambda> container. You can instead create a vector of std::function and push lambdas into it; each lambda is converted to that std::function type:

std::vector<std::function<double()>> functions;
functions.push_back([]{ return 1.; });

But that only works with functions of the same signature, since std::vector can only contain objects of the same type. From your example above I see already two different signatures, so that wouldn’t be possible at all.
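To make the point concrete, here is a minimal, ROOT-free sketch. The helper name makeUnaryFunctions is hypothetical and used only for illustration. It shows that two lambdas with the same body still have different types, and that std::function lets them share a vector as long as the signature matches.

```cpp
#include <functional>
#include <type_traits>
#include <vector>

// Hypothetical helper, for illustration only (not part of ROOT).
std::vector<std::function<double(double)>> makeUnaryFunctions() {
  auto identity = [](double x) { return x; };
  auto alsoIdentity = [](double x) { return x; };
  // Same parameters and body, yet two distinct closure types.
  static_assert(!std::is_same_v<decltype(identity), decltype(alsoIdentity)>);

  // Each lambda is converted to the common std::function type on insertion.
  std::vector<std::function<double(double)>> functions;
  functions.push_back(identity);
  functions.push_back(alsoIdentity);
  functions.push_back([](double x) { return 2. * x; });

  // functions.push_back([](double x, double y) { return x + y; });
  // ^ would not compile: a two-argument lambda is not callable as double(double).
  return functions;
}
```

Calling makeUnaryFunctions()[2](3.0) invokes the third stored lambda through the std::function wrapper.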

As for the F type you see in the documentation, that’s just the name of the template parameter: a generic callable type that you may pass to the Define operation. That includes a lambda, a function, a std::function, or an object of a class that has an operator().
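As a rough illustration of what "F is just a template parameter" means, here is a small ROOT-free sketch. It is not ROOT's implementation of Define; applyTo, twice, AddOne and demoCallables are hypothetical names made up for this example. The same function template accepts every kind of callable listed above.

```cpp
#include <functional>

// F is deduced separately at each call site, so any callable taking a double works.
template <typename F>
double applyTo(F f, double x) {
  return f(x);
}

// A plain function.
double twice(double x) { return 2. * x; }

// A class with an operator().
struct AddOne {
  double operator()(double x) const { return x + 1.; }
};

// Hypothetical demo: passes four different kinds of callable to the same template.
double demoCallables() {
  double total = 0.;
  total += applyTo([](double x) { return x; }, 1.);            // lambda
  total += applyTo(twice, 1.);                                 // function pointer
  total += applyTo(std::function<double(double)>(twice), 1.);  // std::function
  total += applyTo(AddOne{}, 1.);                              // function object
  return total;
}
```

Each call instantiates applyTo with a different F, which is why no single type named F can be written down for a vector.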
Cheers,
Vincenzo

I see. So the only way to make it work would be to declare these functions of different signatures in gInterpreter and call them by string (say, "func(x)"). These Defines would then be JIT-compiled at run time by the interpreter, even if the surrounding source code is compiled.

Is my understanding right?

Hi @yipengsun ,
In C++17, with some template trickery, you can use std::tuple and fold expressions instead of a vector of lambdas and a for loop. Example coming.

You can write e.g. this:

int main() {
  std::tuple funcs{[] { return 42; }, [](ULong64_t) { return 1; }};

  ROOT::RDataFrame df(10);
  std::vector<std::vector<std::string>> inputs{{}, {"rdfentry_"}};
  std::vector<std::string> outputs{"x", "y"};

  auto df2 = ApplyDefines(df, outputs, funcs, inputs);

  df2.Display()->Print();
}

where ApplyDefines can be implemented e.g. like this:

#include <ROOT/RDataFrame.hxx>
#include <string>
#include <tuple>
#include <utility>
#include <vector>

template <typename Funcs, std::size_t... Idx>
auto ApplyDefinesHelper(ROOT::RDF::RNode df,
                        const std::vector<std::string> &outputs, Funcs &funcs,
                        const std::vector<std::vector<std::string>> &inputs,
                        std::index_sequence<Idx...>) {
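  // Add one Define per tuple element; each call rebinds df to the newly created node.
  auto applyOne = [&](auto f, std::size_t i) {
    df = df.Define(outputs[i], f, inputs[i]);
  };

  // C++17 fold over the comma operator: calls applyOne once per index, in order.
  (applyOne(std::get<Idx>(funcs), Idx), ...);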

  return df;
}

template <typename Funcs>
auto ApplyDefines(ROOT::RDF::RNode df, const std::vector<std::string> &outputs,
                  Funcs &funcs,
                  const std::vector<std::vector<std::string>> &inputs) {
  constexpr auto indices = std::make_index_sequence<std::tuple_size<Funcs>{}>();
  return ApplyDefinesHelper(df, outputs, funcs, inputs, indices);
}
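For readers who want to try the tuple plus fold-expression pattern without ROOT, here is a stripped-down sketch of the same idea. The names callEach, callEachHelper and demoCallEach are hypothetical, and std::apply with a tuple of argument tuples stands in for Define and its column lists, so this only mirrors the structure of the code above.

```cpp
#include <cstddef>
#include <tuple>
#include <utility>
#include <vector>

// Calls the I-th callable with the I-th argument tuple, for every index I.
template <typename Funcs, typename Args, std::size_t... Idx>
std::vector<double> callEachHelper(const Funcs &funcs, const Args &args,
                                   std::index_sequence<Idx...>) {
  std::vector<double> results;
  // Fold over the comma operator: one push_back per index, evaluated in order.
  (results.push_back(std::apply(std::get<Idx>(funcs), std::get<Idx>(args))), ...);
  return results;
}

template <typename Funcs, typename Args>
std::vector<double> callEach(const Funcs &funcs, const Args &args) {
  static_assert(std::tuple_size_v<Funcs> == std::tuple_size_v<Args>,
                "need one argument tuple per callable");
  return callEachHelper(funcs, args,
                        std::make_index_sequence<std::tuple_size_v<Funcs>>{});
}

// Hypothetical demo using the two signatures from the original question.
std::vector<double> demoCallEach() {
  auto funcs = std::make_tuple([](double x) { return x; },
                               [](double x, double y) { return x + y; });
  auto args = std::make_tuple(std::make_tuple(1.0), std::make_tuple(2.0, 3.0));
  return callEach(funcs, args);
}
```

The tuple keeps each lambda's exact type, which is what a std::vector cannot do; the index sequence then plays the role of the loop counter at compile time.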

Cheers,
Enrico
