Passing arguments to function called by .Define in RDataFrame

Dear all

I have to calibrate the data in my columns with a linear function, and I would like to do that with a reusable function. To do so, I am chaining .Define calls in a loop, as explained in this post. The problem I have now is how to tell my calibrating function which column it is being called on, so that it can pick the proper calibration parameters.

Calibrating Function below

ROOT::VecOps::RVec<float> calibrateMe(ROOT::VecOps::RVec<unsigned short> &inputArray)
{
	// A and B hold the calibration parameters (slope and offset); they are defined elsewhere.
	auto mySize = inputArray.size();
	ROOT::VecOps::RVec<float> outputArray(mySize);
	for (size_t iii = 0; iii < mySize; iii++)
	{
		outputArray.at(iii) = B[iii] + A[iii] * inputArray[iii];
	}
	return outputArray;
}

Looped Define function

ROOT::RDF::RNode ApplyDefines(
	ROOT::RDF::RNode df,
	const std::vector<std::string> &colNames,
	unsigned int iii = 0)
{
	// Stop the recursion once every column has been processed.
	if (iii == colNames.size())
	{
		return df;
	}

	// The calibrated column gets the input column's name with a "cal_" prefix.
	std::string newColumn = colNames[iii];
	newColumn.insert(0, "cal_");
	return ApplyDefines(df.Define(newColumn.c_str(), calibrateMe, {colNames[iii].c_str()}), colNames, iii + 1);
}

I figured out that I could modify the first entry in the input tree to contain some ID, but that doesn’t seem to be an elegant solution. Does anybody have any idea?

Cheers,
Bogumił



ROOT Version: 6.22/00
Platform: Ubuntu 18.04
Compiler: gcc version 7.5.0


Hi @bzalewsk,
and welcome to the ROOT forum!

If I understand the task correctly, the simplest solution is probably to turn calibrateMe into a function object. The current body of calibrateMe can be copy-pasted into the object's operator(), and the constructor receives the name of the column to calibrate:

struct Calibrator {
   // Use the column name to select the calibration parameters.
   Calibrator(std::string columnToCalibrate) { ... }

   ROOT::VecOps::RVec<float>
   operator()(ROOT::VecOps::RVec<unsigned short> &inputArray) {
      auto mySize = inputArray.size();
      ROOT::VecOps::RVec<float> outputArray(mySize);
      for (size_t iii = 0; iii < mySize; iii++)
         outputArray.at(iii) = B[iii] + A[iii] * inputArray[iii];
      return outputArray;
   }
};

and then you create Calibrators as needed in ApplyDefines:

return ApplyDefines(df.Define(newColumn.c_str(), Calibrator(colNames[iii]), {colNames[iii]}), colNames, iii + 1);

(Also note that you should not need the c_str() calls.)
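
For illustration, here is a minimal sketch of what such a Calibrator could look like once the constructor is filled in. The parameter table kCalibrationTable, the CalibrationParams struct, the column names and the numeric values are all hypothetical, and std::vector stands in for ROOT::VecOps::RVec so that the snippet compiles without ROOT:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical container for one column's parameters:
// calibrated[i] = B[i] + A[i] * raw[i]
struct CalibrationParams {
   std::vector<float> A; // slopes, one per array element
   std::vector<float> B; // offsets, one per array element
};

// Hypothetical lookup table; column names and values are made up for this sketch.
const std::map<std::string, CalibrationParams> kCalibrationTable = {
   {"adcLeft", CalibrationParams{{2.0f, 0.5f}, {1.0f, -3.0f}}},
   {"adcRight", CalibrationParams{{1.5f, 1.5f}, {0.0f, 0.0f}}},
};

// Function object that remembers the parameters of the column it calibrates.
// std::vector is a stand-in for ROOT::VecOps::RVec (assumption of this sketch).
struct Calibrator {
   CalibrationParams params;

   // std::map::at throws std::out_of_range if the column has no table entry.
   explicit Calibrator(const std::string &columnToCalibrate)
      : params(kCalibrationTable.at(columnToCalibrate))
   {
      assert(params.A.size() == params.B.size());
   }

   std::vector<float> operator()(const std::vector<unsigned short> &inputArray) const
   {
      std::vector<float> outputArray(inputArray.size());
      for (std::size_t i = 0; i < inputArray.size(); ++i) {
         // .at() throws if the input has more elements than there are parameters.
         outputArray[i] = params.B.at(i) + params.A.at(i) * inputArray[i];
      }
      return outputArray;
   }
};
```

With RDataFrame, the same object would take and return RVecs and be handed to Define exactly as in the ApplyDefines line above. Looking the parameters up once in the constructor keeps the per-event call cheap, and a column missing from the table fails early with std::out_of_range.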

Hope this helps!
Enrico


It’s working like a charm! Thank you very much for your help.

Bogumił Zalewski
