
How to pass arguments to Define when RDataFrame object is in another class

rdataframe

#1

Hi,

I have written an analysis package in which the boilerplate common to different analyses lives in one class, and the user writes a subclass specific to each analysis. The RDataFrame object is instantiated in the superclass, and many Defines are made there. I want the writer of the subclass not to access the RDF object directly, but through a method in the superclass, shown below. I honestly don’t know templates well, so I don’t know if it’s even correct.

template <typename T, typename std::enable_if<!std::is_convertible<T, std::string>::value, int>::type = 0>
void NanoAODAnalyzerrdframe::defineVar(std::string varname, T function,  const RDFDetail::ColumnNames_t &columns)
{
	_rlm = _rlm.Define(varname, function, columns);
}

It compiles fine, but fails during linking when I make a call like

defineVar("sphericityQ", ::sphericity , {"cleanjet4vecs"});
undefined reference to `void NanoAODAnalyzerrdframe::defineVar<ROOT::VecOps::RVec<float> (*)(std::vector<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiM4D<double> >, std::allocator<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiM4D<double> > > >&), 0>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ROOT::VecOps::RVec<float> (*)(std::vector<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiM4D<double> >, std::allocator<ROOT::Math::LorentzVector<ROOT::Math::PtEtaPhiM4D<double> > > >&), std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > > const&)'
collect2: error: ld returned 1 exit status

How should I change the method?

If I make _rlm, the RDF object, public and call Define on it directly, it works as expected, but I would like to avoid accessing it directly.

_rlm = _rlm.Define("sphericityQ", ::sphericity , {"cleanjet4vecs"}); 

Regards,
Suyong


ROOT Version: 6.15/01
Platform: Linux
Compiler: 6.4.1



#2

Hi Suyong,
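your method definition looks ok to me. An "undefined reference" error comes from the linker: the compiler found a declaration of the method, but no definition was ever compiled for the template arguments used in the call. A template is only instantiated in translation units that can see its definition. Make sure that you put the definition of NanoAODAnalyzerrdframe::defineVar in the header file, that’s where templates live.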
In fact, you can just define the method inside your class.
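
As a rough sketch of what "define the method inside your class" could look like, here is a ROOT-free example. `FakeNode`, `Analyzer`, `hasVar` and `sumOfSquares` are hypothetical stand-ins invented for illustration (they are not ROOT's API and not the code from this thread); only the shape of `defineVar` follows the original post.

```cpp
// analyzer.h (hypothetical file name): everything here lives in the header.
#include <cassert>
#include <map>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for the RDataFrame node stored in _rlm (NOT ROOT's API).
class FakeNode {
public:
    // Records the new column and returns a new node, loosely mimicking Define.
    template <typename F>
    FakeNode Define(const std::string &name, F /*function*/,
                    const std::vector<std::string> &columns) const {
        assert(!name.empty());
        FakeNode next = *this;
        next.defined_[name] = columns;
        return next;
    }
    bool Has(const std::string &name) const { return defined_.count(name) != 0; }

private:
    std::map<std::string, std::vector<std::string>> defined_;
};

class Analyzer {
public:
    // The template is defined in the class body, so every translation unit
    // that includes this header can instantiate it for its own callable type.
    template <typename T,
              typename std::enable_if<!std::is_convertible<T, std::string>::value, int>::type = 0>
    void defineVar(const std::string &varname, T function,
                   const std::vector<std::string> &columns) {
        _rlm = _rlm.Define(varname, function, columns);
    }

    bool hasVar(const std::string &name) const { return _rlm.Has(name); }

private:
    FakeNode _rlm;  // stays private; subclasses go through defineVar
};

// Hypothetical free function playing the role of ::sphericity.
inline float sumOfSquares(const std::vector<float> &v) {
    float s = 0.f;
    for (float x : v) s += x * x;
    return s;
}
```

A subclass (or any caller) would then write something like `analyzer.defineVar("sphericityQ", sumOfSquares, {"cleanjet4vecs"});` without touching `_rlm`.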

Let me know if this helps.
Cheers,
Enrico


#3

Hi Enrico,

Thank you, that solved the problem.

Regards,
Suyong


#4

Perfect! As a rule of thumb, template definitions should live in headers, so that every file calling them can see the definition.
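
For completeness, standard C++ does offer one alternative: keep the template definition in a source file and explicitly instantiate it for each type you need. The sketch below shows both parts in a single file for compactness. `Registry`, `JetFn` and `firstOrZero` are hypothetical names used only for illustration, and the approach is only practical when the set of callable types is known in advance, which is usually not the case for a method that forwards arbitrary functions to Define.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// ---- registry.h (hypothetical): declaration only ----
class Registry {
public:
    template <typename F>
    void defineVar(const std::string &name, F function,
                   const std::vector<std::string> &columns);
    std::size_t size() const { return names_.size(); }

private:
    std::vector<std::string> names_;
};

// ---- registry.cpp (hypothetical): definition + explicit instantiation ----
template <typename F>
void Registry::defineVar(const std::string &name, F function,
                         const std::vector<std::string> &columns) {
    assert(!name.empty());
    (void)function;  // a real implementation would forward these
    (void)columns;
    names_.push_back(name);
}

// The one callable type this source file agrees to provide.
using JetFn = float (*)(const std::vector<float> &);

// Explicit instantiation definition: emits the symbol the linker looks for.
template void Registry::defineVar<JetFn>(const std::string &, JetFn,
                                         const std::vector<std::string> &);

// Hypothetical function whose pointer type matches JetFn.
inline float firstOrZero(const std::vector<float> &v) {
    return v.empty() ? 0.f : v.front();
}
```

Calling `defineVar` with any other callable type from another source file would bring back the same "undefined reference" error, which is why putting the definition in the header is usually the simpler choice.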

By the way, pretty cool project you have there – is that a framework to analyze nanoAODs with RDataFrame?
Is it just for personal use or do you plan to let other people adopt it?

Cheers,
Enrico


#5

Yes, it is for analyzing NanoAODs. I’ve written a simple framework that takes advantage of what RDataFrame has to offer, so we can eliminate a lot of the copy-pasting people usually do. As a result, the analysis code is several times smaller. The ability to write trees easily is a really nice feature of RDataFrame. For now, it’s being developed and debugged, so I’m having my students test it on a large dataset. But I’ll release it once it has the features I want and is stable enough.

Regards,
Suyong


#6

Cool, please let us know when you do :)


#7

Hi @Suyong_Choi ,

Just out of curiosity, how large exactly is that dataset? How long does your code take to run on it?

Cheers!
Enric


#8

Hi Enric,

The CMS jet (JetHT) data collected in 2016, converted to NanoAOD format, consists of hundreds of files totalling around 450 GB. Split into 7 jobs, it takes an hour to an hour and a half to process.

Regards,
Suyong