Hi,
This is related to a similar question here: https://root-forum.cern.ch/t/rdataframe-shuffles-events-for-tmva-scoring
I was asking about Foreach in particular because this is roughly how I currently add things like weights from another file/tree.
In the code below, the indices of the two TTrees align, so I didn’t have to worry about matching each weight to the right entry, but I know this won’t always be the case when using Snapshot with MT.
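#include <vector>
#include <ROOT/RDataFrame.hxx>

// C++ function to be called by RDataFrame.
// The vector is only read, so take it by const reference; rdfentry_ is a ULong64_t.
double copy_vec(const std::vector<double> &vec, ULong64_t index) {
    return vec[index];
}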
// Fill a vector with the weights I want
for (int i = 0; i < nentries; i++) {
    double nsig_sw = data->get(i)->getRealValue("nsig_sw", 0, true);
    nsig_sw_vec.push_back(nsig_sw);
}
// Define a new branch in RDataFrame.
// MT must be disabled for this step: with MT, rdfentry_ is not guaranteed to match the index in nsig_sw_vec.
ROOT::DisableImplicitMT();
auto df2 = df.Define("nsig_sw", "copy_vec(nsig_sw_vec, rdfentry_)");
This clearly won’t be performant for large datasets since it doesn’t use MT. I will be working with datasets of ~100M candidates.
A first step would be to store (index, weight) pairs in a 2D vector rather than the 1D vector currently used. With this information, would it be possible to use Foreach with MT?
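To make the (index, weight) idea concrete, here is a minimal sketch of the lookup I have in mind, written in plain C++ so it runs without ROOT. The names WeightTable, make_weight_table and lookup_weight are my own placeholders, and the sketch assumes that each candidate has a unique integer key stored in both trees, which is not something my current code has.

```cpp
// Sketch only: a keyed weight lookup instead of positional indexing.
// WeightTable, make_weight_table and lookup_weight are hypothetical names.
#include <cassert>
#include <cstdint>
#include <unordered_map>
#include <utility>
#include <vector>

using WeightTable = std::unordered_map<std::uint64_t, double>;

// Build the table once, single-threaded, from (key, weight) pairs.
WeightTable make_weight_table(
    const std::vector<std::pair<std::uint64_t, double>> &pairs) {
    WeightTable table;
    table.reserve(pairs.size());
    for (const auto &p : pairs) {
        // A duplicate key would mean the key does not identify a candidate uniquely.
        const bool inserted = table.emplace(p.first, p.second).second;
        assert(inserted && "duplicate candidate key");
        (void)inserted; // avoid an unused-variable warning when NDEBUG is set
    }
    return table;
}

// Read-only lookup; returns default_weight when the key is not in the table.
double lookup_weight(const WeightTable &table, std::uint64_t key,
                     double default_weight = 0.0) {
    const auto it = table.find(key);
    return it != table.end() ? it->second : default_weight;
}
```

Because the table is only read after it has been built, my understanding is that lookup_weight could be called from several threads at once, for example from a Define lambda that captures the table by reference and takes the key column as input. Whether that is the recommended pattern with MT is exactly what I am unsure about.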
Thanks
ROOT Version: 6.22.0
Platform: Not Provided
Compiler: Not Provided