[SOLVED] Access single entries in RDataFrame

Hello experts,

I am working with RDataFrame and already have a dataframe with many columns.
Now I want to create a new column whose values are calculated from three other columns.

df = df.Define("new_column", "column1*column2/column3")

My problem is that some values in column3 are zero. In these cases, I want the value of “new_column” to be zero, too.

Is there a way to set single entries to zero? The Foreach function only does something with single values; it does not save them back in the DataFrame, right?

Maybe one more thing: column1 is correlated with column3, as it is something like "column5*column4*column3", defined before. In other words: if the entry in column3 is zero, then the corresponding entry in column1 is also zero.

So far I have not found a solution to this problem or a way to access single entries, and I hope someone here has an idea, suggestion, or solution.

Thanks!


ROOT Version: 6.22


Hi @eneb ,
df.Define("new_column", "column3 == 0. ? 0. : column1*column2/column3") ?

(with the usual caveats regarding exact comparison with floating-point numbers: the `== 0.` check only works if you know the value is exactly zero; otherwise you need something like `std::abs(column3) < epsilon` for some small epsilon)
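
A minimal sketch of the epsilon variant, written as a standalone C++ function so the logic can be checked outside of ROOT. The name `safe_ratio` and the default tolerance `1e-12` are assumptions for illustration and not part of ROOT; a suitable epsilon depends on the scale of your data.

```cpp
#include <cmath>

// Hypothetical helper (not a ROOT function): computes column1*column2/column3,
// but returns 0 when column3 is within `epsilon` of zero.
// The default epsilon is an assumed value; pick one that fits your data.
double safe_ratio(double column1, double column2, double column3,
                  double epsilon = 1e-12)
{
    if (std::abs(column3) < epsilon) {
        return 0.0;  // treat column3 as zero: avoid the division entirely
    }
    return column1 * column2 / column3;
}
```

The same check should also work inline in the expression string, e.g. `df.Define("new_column", "std::abs(column3) < 1e-12 ? 0. : column1*column2/column3")`. Alternatively, a helper like this can be made known to the interpreter (for instance with `ROOT.gInterpreter.Declare(...)` from Python) and then called from the string as `safe_ratio(column1, column2, column3)`.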

Cheers,
Enrico

Hi @eguiraud ,
thanks for the quick answer. I will try it. I didn’t realize that I can use an expression like this in the Define function; it seems like a very good idea.

If it doesn’t work, I will write here again.

Edit: It did work!

Thanks again!
Cheers

Great!

You can write arbitrary C++ code in those strings.

Cheers,
Enrico
