Object-oriented parallelism: processing ROOT TTrees + TObjects

Hi everyone, I am currently looking into options for looping over TTrees that contain custom objects while taking full advantage of parallelism for high-performance computing. I understand RDataFrame allows for very convenient looping over TTrees containing simple float / int values or even arrays; in principle, however, parallelism could also be employed when looping over any other kind of object (even if at a performance penalty). Is there already a way of accomplishing something like that in ROOT? I guess RDataFrame supports custom objects, but examples would really be great.

A bit more about my use case: ideally, I would like to have an “analysis class object” that takes in the custom class stored within the TTree, reads it in and processes it. The analysis class object is basically a descriptor of the operations to be done on the custom object (including custom function calls) and could be instantiated/copied as many times as needed for parallel processing. I understand this may incur a performance loss, as vectorized bulk operations may not be harnessed in the same way, but that’s fine. Would this be doable somehow with RDataFrame?

Thank you very much!

Hi @ddobrigk,

RDataFrame can read anything ROOT I/O can read, so you can use your custom objects with it directly.

As a first step you should be able to write e.g.:

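// Enable implicit multi-threading before constructing the RDataFrame;
// without it the event loop runs in a single slot.
ROOT::EnableImplicitMT();
ROOT::RDataFrame df("yourtree", "yourfile.root");
// One processor per slot, so no locking is needed inside Process()
std::vector<YourProcessorClass> processors(df.GetNSlots());
// "yourbranch" is a placeholder for the branch holding YourCustomClass
df.ForeachSlot([&](unsigned int slot, YourCustomClass &obj) {
  processors[slot].Process(obj);
}, {"yourbranch"});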

This is just an idea; there might be better ways to structure things depending on what exactly you want to accomplish.

See the user guide for more details about what the different methods do.
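
To make the one-processor-per-slot idea concrete, here is a minimal ROOT-free sketch of the pattern, including a merge step at the end to combine the per-slot results. It uses plain std::thread instead of RDataFrame so that it compiles on its own. Event, Processor, Merge and ProcessInSlots are hypothetical names chosen for illustration and are not part of ROOT; with RDataFrame, the thread handling would be done by ForeachSlot and only the processor vector and the final merge would remain in user code.

```cpp
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the custom class stored in the tree.
struct Event {
  double energy;
};

// Hypothetical analysis class. Each instance is used by exactly one slot,
// so Process() can update its members without any locking.
class Processor {
public:
  void Process(const Event &ev) {
    if (ev.energy > fThreshold) ++fSelected;
    fSum += ev.energy;
  }
  // Fold the partial result of another slot into this one.
  void Merge(const Processor &other) {
    fSelected += other.fSelected;
    fSum += other.fSum;
  }
  long Selected() const { return fSelected; }
  double Sum() const { return fSum; }

private:
  double fThreshold = 10.0;
  long fSelected = 0;
  double fSum = 0.0;
};

// Mimics the slot pattern with plain threads (an assumption made for this
// sketch): slot s handles events s, s + nSlots, s + 2 * nSlots, ...
Processor ProcessInSlots(const std::vector<Event> &events, unsigned int nSlots) {
  assert(nSlots > 0);
  std::vector<Processor> processors(nSlots);
  std::vector<std::thread> workers;
  for (unsigned int slot = 0; slot < nSlots; ++slot) {
    workers.emplace_back([&events, &processors, slot, nSlots] {
      for (std::size_t i = slot; i < events.size(); i += nSlots)
        processors[slot].Process(events[i]);
    });
  }
  for (auto &w : workers) w.join();

  // Single-threaded merge once all slots have finished.
  Processor total;
  for (const auto &p : processors) total.Merge(p);
  return total;
}
```

Calling ProcessInSlots(events, 4) on a vector of events returns one Processor holding the combined counts. The design choice is that every slot only ever touches its own Processor, and the partial results are merged after the loop, which is what keeps Process() free of mutexes.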

Cheers,
Enrico
