RDataFrame gives a flattened column list from a snapshot


ROOT Version: 6.22/00
Platform: CentOS 8
Compiler: gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC)


I have run into an issue when restarting an analysis from a snapshot of the dataframe.
The snapshot was taken with a selected column list, for example

df.Snapshot(myTree, myRootFile, ["ecal", "Cer14_5"])

where “ecal” and “Cer14_5” are two different data structures.

The ROOT file written by this snapshot looks fine when inspected as a TTree. However, when I read it back with RDataFrame, I get a flattened column list that prevents me from accessing the data structures’ components.

df = ROOT.RDataFrame(myTree, myRootFile)

print(df.GetColumnNames()) gives:
{ "ecal.height", "ecal.integral", "ecal.time", "ecal.pos", "ecal.left", "ecal.right", "ecal.overflow", "ecal", "Cer14_5.ped.ped.mean", "Cer14_5.ped.ped.err", "Cer14_5.ped", "Cer14_5.peaks.peaks.height", "Cer14_5.peaks.peaks.integral", "Cer14_5.peaks.peaks.time", "Cer14_5.peaks.peaks.pos", "Cer14_5.peaks.peaks.left", "Cer14_5.peaks.peaks.right", "Cer14_5.peaks.peaks.overflow", "Cer14_5.peaks", "Cer14_5.raw", "Cer14_5"}

So when I use ecal.time in the code (ecal is an fdec::Peak, while its component time is a double), the type of ecal.time is mistakenly determined to be fdec::Peak instead of double, and I get this error:
candidate function not viable: no known conversion from 'fdec::Peak' to 'double' for 2nd argument
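
For readers who want to see which flattened names belong to which structure, here is a minimal pure-Python sketch. It only post-processes a subset of the column names printed above and does not call ROOT; the helper name group_by_top_level is hypothetical and not part of the RDataFrame API.

```python
from collections import defaultdict

# Subset of the names printed by GetColumnNames() in this post.
columns = [
    "ecal.height", "ecal.time", "ecal",
    "Cer14_5.ped.ped.mean", "Cer14_5.ped", "Cer14_5.peaks.peaks.time",
    "Cer14_5.peaks", "Cer14_5.raw", "Cer14_5",
]


def group_by_top_level(names):
    """Map each top-level branch name to the flattened sub-columns under it.

    Hypothetical helper for illustration only.
    """
    groups = defaultdict(list)
    for name in names:
        # Split at the first dot: "ecal.time" -> ("ecal", "time").
        top, _, rest = name.partition(".")
        if rest:
            groups[top].append(rest)
        else:
            # A bare name such as "ecal" is the structure itself.
            groups.setdefault(top, [])
    return dict(groups)


grouped = group_by_top_level(columns)
for top, members in grouped.items():
    print(top, "->", members)
```

This makes it easier to spot that "ecal" and "ecal.time" are listed side by side, which is the situation in which the type of the sub-column was resolved incorrectly.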

Hi @chao2232 ,
welcome to the ROOT forum!

The fact that RDF thinks ecal.time is of type fdec::Peak is a bug, and it looks like one that was recently fixed. Could you please check whether the problem is still present in the ROOT nightlies or in v6.22/02 and, if yes, could you please share the offending file with me so I can debug/fix the problem?

Cheers,
Enrico

Thank you, v6.22/02 solved this issue!
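
For anyone landing here later, a small sketch of a guard that checks whether a ROOT version string is at least v6.22/02, the release reported to work in this thread. The function names are hypothetical; in a PyROOT session the version string would typically come from ROOT.gROOT.GetVersion(), which is assumed here rather than called so the snippet runs without ROOT installed.

```python
import re


def parse_root_version(version):
    """Turn a ROOT version string such as '6.22/02' into a comparable tuple.

    Hypothetical helper; accepts both '6.22/02' and '6.22.02' spellings.
    """
    match = re.fullmatch(r"(\d+)\.(\d+)[/.](\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognised ROOT version string: {version!r}")
    return tuple(int(part) for part in match.groups())


# Release reported in this thread to resolve the issue.
FIXED_IN = parse_root_version("6.22/02")


def has_fix(version):
    """Return True if the given version is at least v6.22/02."""
    return parse_root_version(version) >= FIXED_IN


# Assumption: in a real session, pass ROOT.gROOT.GetVersion() instead.
for candidate in ("6.22/00", "6.22/02"):
    print(candidate, "has fix:", has_fix(candidate))
```

Tuple comparison keeps the check simple: (6, 22, 0) sorts before (6, 22, 2), so the version from the original post is flagged as affected.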

