Selecting events from multiple time ranges


ROOT Version: 6.24.06
Platform: Not Provided
Compiler: Not Provided


I have a list of tuples that define chronologically ordered time ranges, like
time_ranges = [(t1,t2),(t3,t4),(t5,t6) … ]
I have a time stamp MET for each event.
I want to select events with the criterion
(t1 <= MET <= t2 ) || (t3 <= MET <= t4 ) || (t5 <= MET <= t6 ) and so on…
The number of tuples in the list can be arbitrary. How can I do this using dataframes? Can dataframes be merged?

Hi @Chinmay; I am sure @eguiraud can help you with this.

Cheers,
J.

Hi @Chinmay ,
are the time ranges available only in Python, or are they e.g. stored in a ROOT file, so that C++ code could load them too?

In Python, you can do the following:

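import ROOT
import numpy as np

# Time ranges as (start, end) pairs (example values).
time_ranges = np.array([(1,2),(3,4),(5,6)])

# Make the Python function callable from RDataFrame as Numba::check_timestamp.
@ROOT.Numba.Declare(['float'], 'bool')
def check_timestamp(MET):
    # Accept the event if MET falls inside any of the ranges.
    for (start, end) in time_ranges:
        if start <= MET <= end:
            return True
    return False

if __name__ == "__main__":
    # Toy dataset: 10 entries with MET = 0, 0.5, 1, ..., 4.5.
    c = ROOT.RDataFrame(10).Define("MET", "rdfentry_ / 2.").Filter("Numba::check_timestamp(MET)").Count()
    print(c.GetValue())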
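
If you want to cross-check the selection logic outside of ROOT, here is a minimal NumPy-only sketch. It assumes the ranges are sorted by start time and do not overlap, which is how I read "chronologically ordered"; the function name in_any_range and the example values are made up for illustration.

```python
import numpy as np

# Hypothetical example ranges; assumed sorted by start and non-overlapping.
time_ranges = np.array([(1.0, 2.0), (3.0, 4.0), (5.0, 6.0)])

def in_any_range(met, ranges):
    """Return a boolean mask: True where start <= met <= end for some range."""
    met = np.asarray(met, dtype=float)
    starts, ends = ranges[:, 0], ranges[:, 1]
    # Index of the last range whose start is <= met (-1 if there is none).
    idx = np.searchsorted(starts, met, side="right") - 1
    # Clip so that idx == -1 is still a valid index; the (idx >= 0) term
    # masks those entries out again.
    return (idx >= 0) & (met <= ends[np.clip(idx, 0, None)])

# Same MET values as "rdfentry_ / 2." gives for 10 entries.
met = np.arange(10) / 2.0
mask = in_any_range(met, time_ranges)
print(mask.sum())
```

This only checks the range logic on plain arrays; it is not a replacement for the Filter call above. With many ranges, the binary search avoids looping over every range for every event.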

As for "Can dataframes be merged?": I am not sure I understand this question. You can merge datasets vertically (with a TChain or by simply passing multiple input files to RDF) or horizontally (by adding trees as friends of other trees).

Cheers,
Enrico

Cool. On the C++ side, I have a ‘ranges’ class that can do the work of check_timestamp.

Ok, if you can code the equivalent of check_timestamp in C++, it might result in better performance (it might or might not make a difference in your use case).

Cheers,
Enrico