Dear ROOT experts,
I have a ROOT DataFrame containing events, each one tagged with a certain energy value.
Is it possible to select the N events with the highest energy value?
A similar topic can be found here: Sorting a DataFrame
Is it possible to use something similar for this case?
Thanks
Enrico
ROOT Version: 6.22
Platform: Linux (CentOS 7)
Compiler: gcc 7 via DevToolSet 7
Hi Enrico,
depending on the size and complexity of the dataset, you can:
- use Take to extract vectors with all entries, then get the indices of the N events with the highest energy values and inject those into a new RDF that writes them out
- book a custom action that implements a “max N elements” operation
- use a Foreach that keeps an updated list of the contents of the N events with highest energy
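For the first option, the "get the indices of the N events with the highest energy values" step can be sketched in plain C++ once the energies are in a vector. This is only a minimal sketch: `top_n_indices` is a hypothetical helper name, not something ROOT provides, and it assumes all energies fit in memory as a `std::vector<double>`.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not part of ROOT): returns the indices of the n largest
// values in `energies`, ordered from highest to lowest energy.
std::vector<std::size_t> top_n_indices(const std::vector<double>& energies, std::size_t n)
{
    // Start from the identity permutation 0, 1, 2, ...
    std::vector<std::size_t> idx(energies.size());
    std::iota(idx.begin(), idx.end(), std::size_t{0});

    // Never ask for more entries than are available.
    n = std::min(n, idx.size());

    // Only the first n positions need to be sorted, so partial_sort is enough.
    std::partial_sort(idx.begin(), idx.begin() + n, idx.end(),
                      [&](std::size_t a, std::size_t b) { return energies[a] > energies[b]; });

    idx.resize(n);
    return idx;
}
```

The returned indices can then be used to pick the matching elements out of every other vector extracted with Take, since all of them share the same entry ordering.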
In any case you will need to do one pass over the dataset to extract these N events and then another pass over the N events to write them out. To write the entries of a vector using RDF you can use something like:
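// my_energies is assumed to have been filled with the N selected energy values
std::vector<double> my_energies;
// Create an empty-source RDF with one entry per selected event
ROOT::RDataFrame(my_energies.size())
    .Define("energy", [&](ULong64_t entry) { return my_energies[entry]; }, {"rdfentry_"})
    .Snapshot("t", "f.root", {"energy"});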
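For the Foreach option, the "updated list of the N events with highest energy" could be kept in a bounded min-heap, so that the first pass never stores more than N events. The sketch below is plain C++ and makes a few assumptions: `TopNEvents` is a hypothetical class name (not a ROOT type), only the energy and the entry number are stored per event, and the event loop is single-threaded (the class is not thread-safe).

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical accumulator (not part of ROOT): keeps the n highest-energy
// events seen so far as (energy, entry number) pairs.
class TopNEvents {
public:
    using Event = std::pair<double, unsigned long long>;

    explicit TopNEvents(std::size_t n) : fN(n) {}

    // Call once per event.
    void Fill(double energy, unsigned long long entry)
    {
        if (fN == 0)
            return;
        if (fHeap.size() < fN) {
            fHeap.emplace(energy, entry);
        } else if (energy > fHeap.top().first) {
            // The new event beats the weakest one currently kept: replace it.
            fHeap.pop();
            fHeap.emplace(energy, entry);
        }
    }

    // Empties the heap; the result is ordered from lowest to highest energy.
    std::vector<Event> Extract()
    {
        std::vector<Event> out;
        while (!fHeap.empty()) {
            out.push_back(fHeap.top());
            fHeap.pop();
        }
        return out;
    }

private:
    std::size_t fN;
    // Min-heap: top() is the lowest-energy event among those kept.
    std::priority_queue<Event, std::vector<Event>, std::greater<Event>> fHeap;
};
```

With RDF, such an object could presumably be filled with something like `df.Foreach([&](double e, ULong64_t i) { top.Fill(e, i); }, {"energy", "rdfentry_"})`, assuming the column is a `double` called `energy` and implicit multi-threading is not enabled.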
Cheers,
Enrico