ROOT Version: 6.24.06
Platform: Not Provided
Compiler: Not Provided
I have a list of tuples that define chronologically ordered time ranges, like
time_ranges = [(t1,t2),(t3,t4),(t5,t6) … ]
I have a time stamp MET for each event.
I want to select events with the criterion
(t1 <= MET <= t2 ) || (t3 <= MET <= t4 ) || (t5 <= MET <= t6 ) and so on…
The number of tuples in the list can be arbitrary. How can I do this using dataframes? Can dataframes be merged?
Hi @Chinmay; I am sure @eguiraud can help you with this.
Cheers,
J.
Hi @Chinmay ,
Are the pairs of time ranges available only in Python, or are they e.g. stored in a ROOT file, so that C++ code could load them too?
In Python, you can do the following:
import ROOT
import numpy as np

# Chronologically ordered (start, end) pairs
time_ranges = np.array([(1,2),(3,4),(5,6)])

# Make the Python function callable from RDataFrame as Numba::check_timestamp
@ROOT.Numba.Declare(['float'], 'bool')
def check_timestamp(MET):
    for (start, end) in time_ranges:
        if start <= MET <= end:
            return True
    return False

if __name__ == "__main__":
    c = ROOT.RDataFrame(10).Define("MET", "rdfentry_ / 2.").Filter("Numba::check_timestamp(MET)").Count()
    print(c.GetValue())
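Since the ranges are chronologically ordered, the linear loop above could in principle be replaced by a binary search when the list is long. The following is only a plain-Python sketch of that idea, not something tested inside Numba or RDataFrame; it assumes the ranges are sorted by start time and do not overlap, and the names make_range_checker and in_any_range are made up for illustration.

```python
from bisect import bisect_right


def make_range_checker(time_ranges):
    """Return a function that tells whether a timestamp lies in any closed range.

    Assumption: time_ranges is sorted by start time and the ranges do not overlap.
    """
    starts = [start for start, _ in time_ranges]
    ends = [end for _, end in time_ranges]

    def in_any_range(met):
        # Index of the last range whose start is <= met (-1 if there is none).
        i = bisect_right(starts, met) - 1
        return i >= 0 and met <= ends[i]

    return in_any_range


time_ranges = [(1, 2), (3, 4), (5, 6)]
in_any_range = make_range_checker(time_ranges)

# Keep only the timestamps that fall inside one of the ranges.
selected = [met for met in (0.5, 1.0, 2.5, 4.0, 6.5) if in_any_range(met)]
print(selected)  # [1.0, 4.0]
```

For a handful of ranges the simple loop is probably just as good; the lookup only starts to pay off when there are many ranges.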
Regarding "Can dataframes be merged?": I am not sure I understand this question. You can merge datasets vertically (with a TChain or by simply passing multiple input files to RDF) or horizontally (adding trees as friends of other trees).
Cheers,
Enrico
Cool. On the C++ side, I have a ‘ranges’ class that can do the work of check_timestamp.
Ok, if you can code the equivalent of check_timestamp in C++, it might result in better performance (it might or might not make a difference in your use case).
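If writing a dedicated C++ class is not convenient, another possible route is to generate the filter expression from the Python list, since RDataFrame just-in-time compiles string expressions as C++. The sketch below only builds the string; the helper name build_range_filter is hypothetical, and the column name MET is taken from the question.

```python
def build_range_filter(time_ranges, column="MET"):
    """Build a C++ boolean expression that is true when `column` lies in any range."""
    if not time_ranges:
        # No ranges means no event can be selected.
        return "false"
    clauses = [
        f"({column} >= {float(start)!r} && {column} <= {float(end)!r})"
        for start, end in time_ranges
    ]
    return " || ".join(clauses)


time_ranges = [(1, 2), (3, 4), (5, 6)]
expr = build_range_filter(time_ranges)
print(expr)
# (MET >= 1.0 && MET <= 2.0) || (MET >= 3.0 && MET <= 4.0) || (MET >= 5.0 && MET <= 6.0)

# Hypothetical use with an existing RDataFrame `df`:
#   selected = df.Filter(expr)
```

With a very large number of ranges the generated expression becomes long, so in that case a loop or lookup over stored ranges is likely the better choice.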
Cheers,
Enrico