Could be vectorised in some way. That is, if I have to define several variables ("x" in the above script), I would like to add them all in one step without looping over them, avoiding something like:
df = ROOT.RDataFrame(10)
for prediction in predictions:
    df = add_to_df(df)
(I'm aware the above script won't work; it is just a draft to explain myself better.)
Thanks a lot!
Davide
ROOT Version: 6.30 and above
Platform: Not Provided
Compiler: Not Provided
I am perhaps a bit confused, sorry about that: what would be the advantage of the vectorisation in setting up the computation graph, i.e. using the nice function you wrote to add Defines?
Are you perhaps looking for a syntax more elegant than the for loop in Python?
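If tidier syntax is the goal, the loop can be folded into a single helper call. Below is a minimal sketch, assuming the new columns are given as a dict mapping column name to expression string. `define_all` and `run_with_root` are hypothetical names used only for illustration, not part of ROOT, and the helper still issues one `Define` per column under the hood.

```python
from functools import reduce


def define_all(df, columns):
    """Chain one Define call per (name, expression) pair and return the last node.

    `df` is any object with an RDataFrame-like Define(name, expression) method;
    `columns` is a dict of column name -> expression string (insertion order is kept,
    so later expressions may refer to earlier columns).
    """
    return reduce(
        lambda node, item: node.Define(item[0], item[1]),
        columns.items(),
        df,
    )


def run_with_root():
    """Usage example; needs a ROOT installation with PyROOT."""
    import ROOT

    df = ROOT.RDataFrame(10)
    df = define_all(df, {
        "x": "double(rdfentry_)",  # rdfentry_ is the built-in entry-number column
        "x2": "x * x",             # may use "x" because it was defined first
    })
    return df.Sum("x2").GetValue()
```

This only changes how the calls are written: the computation graph it produces should be the same as the one built by the explicit for loop.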
Thanks for your swift reply, and sorry for my delayed one: I was out of office last week. The reason I want to avoid a for loop of Define calls is that adding many new variables (around 20 or more) is extremely slow. So I was wondering whether the arguments of Define could be vectorised, so that the new variables are added without looping over them and the computation is faster. Of course, any other solution that avoids looping over the variables, with or without vectorisation, would work as well.
I will provide a reproducible snippet to time the for loop.
I am not sure I get this. Defining a new column should be very fast: it is just a matter of booking a feature in the computation graph. Can you provide evidence of such a slowdown? I don't rule it out, but it would be really unexpected (and unwanted).
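One way to gather that evidence is to time the booking of the Defines separately from the event loop. The sketch below is only an illustration under stated assumptions: `timed`, `book_defines` and `compare_booking_and_event_loop` are hypothetical helper names, the columns are trivial expressions on the built-in `rdfentry_` column, and 20 columns over 1000 entries are arbitrary choices.

```python
import time


def timed(label, fn):
    """Run fn(), print how long it took, and return (result, seconds)."""
    start = time.perf_counter()
    result = fn()
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.4f} s")
    return result, elapsed


def book_defines(df, n_columns):
    """Book n_columns trivial Defines in a plain Python loop (no event loop runs here)."""
    for i in range(n_columns):
        df = df.Define(f"v{i}", f"double(rdfentry_) + {i}")
    return df


def compare_booking_and_event_loop(n_columns=20, n_entries=1000):
    """Time graph booking and the event loop separately; needs ROOT with PyROOT."""
    import ROOT

    df = ROOT.RDataFrame(n_entries)
    df, t_book = timed("booking Defines", lambda: book_defines(df, n_columns))
    # Booking Sum is lazy as well; asking for the value is what starts the event loop.
    total = df.Sum(f"v{n_columns - 1}")
    _, t_loop = timed("event loop", total.GetValue)
    return t_book, t_loop
```

With string expressions, the second figure likely also includes just-in-time compilation of the expressions, so it is not a pure measure of per-entry work. Either way, comparing the two numbers shows whether the Define loop itself is the bottleneck.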
I'm sorry for raising the problem without timing each step precisely: the Define was not the slowest operation. I have been able to speed up, where possible, the other parts of the workflow that were slowing down the addition of new variables.