RDataFrame MT performance running on remote files

Thanks! I originally tried the templated version, but couldn’t make it work. I will try again.
I think I still need something like

if (df.GetColumnType("variable")=="int") 
{
df.Define(...,Build2DObservable<int>,...).Aggregate(...,fill<int>,...);
}
else if (df.GetColumnType("variable")=="ROOT::VecOps::RVec<int>")
{
df.Define(...,Build2DObservable<ROOT::VecOps::RVec<int>>,...).Aggregate(...,fill<ROOT::VecOps::RVec<int>>,...);
}
...

Or can I resolve the type in a better way?
Zdenek
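
One way to keep such an if/else chain in a single place is to map the type-name string to a tag type once and pass the tag to a generic lambda. This is only a minimal sketch in plain C++ without ROOT: TypeTag, dispatchOnType and sizeOfType are hypothetical names, and the strings "int" and "std::vector<int>" are placeholders for whatever GetColumnType actually reports.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Empty tag that carries a C++ type as a value (hypothetical helper).
template <typename T>
struct TypeTag {
   using type = T;
};

// Hypothetical helper: translates a type-name string into a TypeTag and
// passes it to f. The strings are placeholders, not guaranteed ROOT output.
template <typename F>
auto dispatchOnType(const std::string &typeName, F &&f)
{
   if (typeName == "int")
      return f(TypeTag<int>{});
   if (typeName == "std::vector<int>")
      return f(TypeTag<std::vector<int>>{});
   throw std::runtime_error("unsupported column type: " + typeName);
}

// Example use: the lambda recovers the static type from the tag.
inline std::size_t sizeOfType(const std::string &typeName)
{
   return dispatchOnType(typeName, [](auto tag) {
      using T = typename decltype(tag)::type;
      return sizeof(T);
   });
}

// Small self-check of the dispatch logic.
inline bool dispatchSelfCheck()
{
   assert(sizeOfType("int") == sizeof(int));
   assert(sizeOfType("std::vector<int>") == sizeof(std::vector<int>));
   return true;
}
```

In the RDataFrame case the lambda body would hold the Define/Aggregate calls instantiated with T. All branches of the lambda must return the same type for the return type deduction to work.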

Hello, unfortunately I couldn’t get it to resolve the correct overload of fill for the scalar and vector forms.
I had to use different names:

template <typename T>
void fillscalar(TH2D &h, const My2DObservablestruct<T> &c) {

template <typename T>
void fillvector(TH2D &h, const My2DObservablestruct<T> &c) { 
//and here change RVec<T> to just T

(it isn’t a problem in my case)

If you don’t know the types in advance, that’s one way to do it. You can easily measure whether the setup time becomes too large.

That works too!

Hello,
The question is more whether RDataFrame is able to resolve the correct overload of the templated aggregator.
It works at the plain C++ level: a simple example is attached, test1.C (720 Bytes), where the functions have the same name but different parameters. However, I couldn’t get it to work in RDataFrame's Aggregate:
test2.C (1.5 KB)
(It’s supposed to run on a simple tree with two branches, int x and std::vector<int> vecy: pokus1.root (6.2 KB).)

 //this doesn't work
  //auto total1 = df.Aggregate(agg<int>,add_all,"x",all1);
  //auto total2 = df.Aggregate(agg<std::vector<int>>,add_all,"y",all1);
  //this works
  auto total3 = df.Aggregate(aggscalar<int>,add_all,"x",all1);
  auto total4 = df.Aggregate(aggvector<std::vector<int>>,add_all,"vecy",all1);

I get something like

/Applications/root_v6.22.02/include/ROOT/RDF/RInterface.hxx:2029:18: note: 
      candidate template ignored: couldn't infer template argument 'AccFun'
   RResultPtr<U> Aggregate(AccFun aggregator, MergeFun merger, std::stri...

Ah that’s tricky, and the error message is horrible, sorry about that.
It’s because agg<int> and agg<std::vector<int>> are, in principle, ambiguous: each name could refer to either overload. The right overload can only be resolved when you call the function, but RDF tries to figure out the signature of the function before calling it.
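
The deduction failure can be reproduced without ROOT. The sketch below assumes two overloads shaped like the ones described in this thread (the actual test2.C is not shown here), and applyOnce is a hypothetical stand-in for any API that deduces the type of the callable it receives. Casting to an exact function-pointer type is standard C++ for picking one overload; whether Aggregate accepts the resulting pointer has not been tested here.

```cpp
#include <cassert>
#include <vector>

// Scalar overload: append a single value.
template <typename T>
void agg(std::vector<int> &acc, const T &value)
{
   acc.push_back(value);
}

// Vector overload: append all elements.
template <typename T>
void agg(std::vector<int> &acc, const std::vector<T> &values)
{
   acc.insert(acc.end(), values.begin(), values.end());
}

// Hypothetical stand-in for an API that deduces the callable's type.
template <typename AccFun, typename Arg>
void applyOnce(AccFun f, std::vector<int> &acc, const Arg &arg)
{
   f(acc, arg);
}

inline std::vector<int> runDemo()
{
   std::vector<int> acc;

   // Does not compile: agg<int> names both overloads, so AccFun
   // cannot be deduced from it.
   // applyOnce(agg<int>, acc, 1);

   // Spelling out the exact signature selects a single overload.
   using ScalarFn = void (*)(std::vector<int> &, const int &);
   using VectorFn = void (*)(std::vector<int> &, const std::vector<int> &);
   applyOnce(static_cast<ScalarFn>(agg<int>), acc, 1);
   // With T = int, the vector overload has exactly the VectorFn signature.
   applyOnce(static_cast<VectorFn>(agg<int>), acc, std::vector<int>{2, 3});

   assert(acc.size() == 3);
   return acc;
}
```

Note that the vector overload is selected through agg<int>, not agg<std::vector<int>>: casting agg<std::vector<int>> to VectorFn would match the scalar overload with T = std::vector<int> instead.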

Anyway, simple workaround, make it a functor:

template <typename T>
struct Aggregator {
   void operator()(std::vector<int> &a, const T &b)
   {
      std::cout << "f1" << std::endl;
      a.push_back(b);
   }
};

template <typename T>
struct Aggregator<std::vector<T>> {
   void operator()(std::vector<int> &a, const std::vector<T> &b)
   {
      std::cout << "f2"
                << " " << b.size() << std::endl;
      a.insert(a.end(), b.begin(), b.end());
   }
};

...

auto total1 = df.Aggregate(Aggregator<int>{},add_all,"x",all1);
auto total2 = df.Aggregate(Aggregator<std::vector<int>>{},add_all,"y",all1);

Cheers,
Enrico
