Hi! I’m working on a function to “smear” my dataset, adding a randomly generated uncertainty to each entry in one or more columns of an RDataFrame.
I have more or less managed to do this by importing C++ code via gInterpreter and then using the Define() function, but it applies the same random number to every entry. The number changes if I run the whole code again, so I guess it’s because Define doesn’t run the function element-wise. Here is the relevant code:
ROOT.gInterpreter.ProcessLine('#include "addrand.h"')
df = dataframe.Define("{}_smeared".format(to_smear), "addrand_gauss(time[0])")
The addrand.h file contains this (written in C++):
#include <iostream>
#include <random>
#include <vector>
#include <ctime>

double addrand_gauss(double x, double t_resol=1) {
    std::default_random_engine gen(std::time(nullptr));
    std::normal_distribution<double> nd(0, t_resol);
    x += nd(gen);
    return x;
}
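One possible explanation, offered as a guess rather than a confirmed diagnosis: addrand_gauss builds a new engine on every call and seeds it with std::time(nullptr), which only changes once per second. Every call within the same second would then start from the same state and draw the same number, even if Define does run per entry. Assuming that is the cause, a minimal sketch of a variant that keeps one engine alive between calls could look like this. The name addrand_gauss_persistent, the choice of std::mt19937, and seeding from std::random_device are all assumptions for illustration, not part of the original header.

```cpp
#include <random>

// Hypothetical variant of addrand_gauss (name and seeding are assumptions).
// The engine is created once per thread and reused, so successive calls
// advance its state instead of restarting from the same time-based seed.
double addrand_gauss_persistent(double x, double t_resol = 1) {
    // Seeded once, on the first call in each thread.
    thread_local std::mt19937 gen(std::random_device{}());
    // Gaussian with mean 0 and standard deviation t_resol.
    std::normal_distribution<double> nd(0.0, t_resol);
    return x + nd(gen);
}
```

It would be called from the Define string the same way as before, for example "addrand_gauss_persistent(time[0])". The thread_local keyword is only there so that each thread would get its own engine if implicit multithreading were enabled.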
I did try using ForEach(), which would run the function for each entry, but IPython replies “‘RDataFrame’ object has no attribute ‘ForEach’”. I see in other posts in this forum that PyROOT doesn’t yet support ForEach.
Is there any way I can work around this? I have thought about using the AsNumpy() function, but since my dataset is quite big, it takes extremely long to process.
Any suggestions are very welcome!
Luna
ROOT Version: 6.22/06
Platform: Linux (WSL)
Compiler:
ROOT installed through conda forge