Sort RDataFrame

Dear ROOT experts,
I have a ROOT RDataFrame containing events, each one tagged with an energy value.
Is it possible to select the N events with the highest energy values?

A similar topic can be found here: Sorting a DataFrame
Is it possible to use something similar for this case?

Thanks
Enrico


ROOT Version: 6.22
Platform: Linux (CentOS 7)
Compiler: gcc 7 via DevToolSet 7


Hi Enrico,
depending on the size and complexity of the dataset, you can:

  • use Take to extract vectors with all entries, then get the indices of the N events with the highest energy values and inject those into a new RDF that writes them out
  • book a custom action that implements a “max N elements” operation
  • use a Foreach that keeps an updated list of the contents of the N events with highest energy
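For the first option, the index-selection step needs only the standard library. Below is a minimal sketch, assuming the energies were already collected into a std::vector<double> (for example with Take<double>("energy")) in a single-threaded event loop, so that a position in the vector matches the entry order. The helper name top_n_indices is hypothetical and not part of ROOT.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: return the indices of the n largest values in
// `energies`, ordered from highest to lowest energy.
// `energies` stands in for the vector obtained from Take<double>("energy").
std::vector<std::size_t> top_n_indices(const std::vector<double>& energies, std::size_t n)
{
   n = std::min(n, energies.size());
   std::vector<std::size_t> idx(energies.size());
   std::iota(idx.begin(), idx.end(), std::size_t{0}); // 0, 1, 2, ...
   // Only the first n positions need to be ordered, so a partial sort suffices.
   std::partial_sort(idx.begin(), idx.begin() + n, idx.end(),
                     [&](std::size_t a, std::size_t b) { return energies[a] > energies[b]; });
   idx.resize(n);
   return idx;
}
```

The returned indices can then be used to pick the matching elements out of the other vectors extracted with Take.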

In any case you will need to do one pass over the dataset to extract these N events and then another pass over the N events to write them out. To write the entries of a vector using RDF you can use something like:

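std::vector<double> my_energies; // to be filled with the N selected energy values
// RDF with one (empty) entry per element; rdfentry_ provides the entry number
ROOT::RDataFrame(my_energies.size())
   .Define("energy", [&](ULong64_t entry) { return my_energies[entry]; }, {"rdfentry_"})
   .Snapshot("t", "f.root", {"energy"});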
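For the Foreach option, one way to keep the updated list is a min-heap bounded to N elements, so that memory stays proportional to N rather than to the dataset size. This is only a sketch: the class TopN and its methods are hypothetical names, it stores only the energy (keeping other columns would require a small struct with a custom comparison), and it is not thread-safe, so it assumes a single-threaded event loop.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical helper (not part of ROOT): keeps the n largest values seen so far.
class TopN {
public:
   explicit TopN(std::size_t n) : fN(n) {}

   // Meant to be called once per event, e.g. from inside a Foreach lambda.
   void Fill(double energy)
   {
      if (fN == 0)
         return;
      if (fHeap.size() < fN) {
         fHeap.push(energy);
      } else if (energy > fHeap.top()) {
         // The smallest kept value is at the top of the min-heap: replace it.
         fHeap.pop();
         fHeap.push(energy);
      }
   }

   // Return the kept values, highest first.
   std::vector<double> Sorted() const
   {
      auto heap = fHeap; // work on a copy so the object stays usable
      std::vector<double> out(heap.size());
      for (std::size_t i = out.size(); i-- > 0;) {
         out[i] = heap.top(); // smallest remaining value goes to the back
         heap.pop();
      }
      return out;
   }

private:
   std::size_t fN;
   std::priority_queue<double, std::vector<double>, std::greater<double>> fHeap;
};
```

With an RDataFrame df, usage would look something like df.Foreach([&](double e) { top.Fill(e); }, {"energy"}); followed by top.Sorted(), whose result can be written out as shown above.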

Cheers,
Enrico
