RDataFrame: Defining new column evaluated as a function of external values in Python

Hi Spandan,
at present, RDF cannot execute Python lambdas during the event loop.
One of the obstacles to making this possible is Python's Global Interpreter Lock (GIL):
only one thread at a time can run Python code, so calling back into Python for every event would effectively make RDF multi-threading useless.

A workaround for your specific scenario would be to declare x, y and z to the interpreter and then use them as you would use C++ variables (shown here for x and y; z works the same way):

x = loadSF("0-100")
y = loadSF("100-200")
ROOT.gInterpreter.Declare("""
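   // TPython::Eval returns the value of the Python expression;
   // TPython::Exec only reports whether the statement ran successfully
   const double x = double(TPython::Eval("x"));
   const double y = double(TPython::Eval("y"));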
""")

# Now that gInterpreter knows the C++ variables `x` and `y`, you can use them in your `Define` expressions
df = df.Define("weight", "if (A < 0) return x; else return y;")
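
An alternative to the TPython route above is to format the Python values directly into the C++ code before declaring it. The following is only a sketch under assumptions: `cpp_constants` is a hypothetical helper name (not part of ROOT), the numbers are placeholders for whatever `loadSF` returns, and `loadSF` is assumed to return a plain number. The ROOT calls are left as comments so that the string-building part can be checked on its own.

```python
import math


def cpp_constants(values):
    """Return C++ source that declares each entry of `values` as a const double.

    Hypothetical helper, not part of ROOT.
    """
    lines = []
    for name, value in values.items():
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")
        value = float(value)
        if not math.isfinite(value):
            # inf and nan have no plain C++ literal form
            raise ValueError(f"{name} is not finite: {value!r}")
        # repr() of a Python float round-trips exactly and is a valid C++ literal
        lines.append(f"const double {name} = {value!r};")
    return "\n".join(lines)


# Placeholder numbers standing in for loadSF("0-100") and loadSF("100-200")
code = cpp_constants({"x": 1.05, "y": 0.98})
print(code)

# With ROOT available, the string would then be used like this:
# import ROOT
# ROOT.gInterpreter.Declare(code)
# df = df.Define("weight", "A < 0 ? x : y")
```

Either way the values are fixed at declaration time: if x or y change later on the Python side, the C++ constants do not follow.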

Hope this helps,
Enrico
