Hello!
When multi-threading is enabled (ROOT::EnableImplicitMT), ROOT files written with RDataFrame have no TStreamerInfo for RVec branches. This becomes a problem when reading those RVec branches with uproot in Python. In my experience, reading such branches with ROOT itself works fine.
Single-threaded
In single-threaded mode, everything works for me. The ROOT file contains a proper TStreamerInfo for each RVec branch, for example vector<float,ROOT::Detail::VecOps::RAdoptAllocator<float> >.
import ROOT
df = ROOT.RDataFrame(100)
element_type_list = [
"bool",
"char",
"unsigned char",
"short",
"unsigned short",
"int",
"unsigned int",
"long",
"unsigned long",
"long long",
"unsigned long long",
"float",
"double",
]
# Define one RVec column per element type, e.g. rvec_unsigned_char.
for element_type in element_type_list:
    name = "rvec_" + element_type.replace(" ", "_")
    expr = f"RVec<{element_type}>(5, 0)"
    df = df.Define(name, expr)
df.Snapshot("test", "/tmp/test_single.root")
# Print the name of every TStreamerInfo stored in the output file.
root_file = ROOT.TFile("/tmp/test_single.root")
for each in root_file.GetStreamerInfoList():
    print(each.GetName())
TNamed
TObject
TList
TSeqCollection
TCollection
vector<bool>
vector<char,ROOT::Detail::VecOps::RAdoptAllocator<char> >
vector<unsigned char,ROOT::Detail::VecOps::RAdoptAllocator<unsigned char> >
vector<short,ROOT::Detail::VecOps::RAdoptAllocator<short> >
vector<unsigned short,ROOT::Detail::VecOps::RAdoptAllocator<unsigned short> >
vector<int,ROOT::Detail::VecOps::RAdoptAllocator<int> >
vector<unsigned int,ROOT::Detail::VecOps::RAdoptAllocator<unsigned int> >
vector<long,ROOT::Detail::VecOps::RAdoptAllocator<long> >
vector<unsigned long,ROOT::Detail::VecOps::RAdoptAllocator<unsigned long> >
vector<Long64_t,ROOT::Detail::VecOps::RAdoptAllocator<Long64_t> >
vector<ULong64_t,ROOT::Detail::VecOps::RAdoptAllocator<ULong64_t> >
vector<float,ROOT::Detail::VecOps::RAdoptAllocator<float> >
vector<double,ROOT::Detail::VecOps::RAdoptAllocator<double> >
TTree
TAttLine
TAttFill
TAttMarker
ROOT::TIOFeatures
TBranchElement
TBranch
TLeafElement
TLeaf
TString
TBranchRef
TRefTable
TObjArray
listOfRules
Multi-threaded
When ROOT::EnableImplicitMT is called with any number of threads, the TStreamerInfo for RVec is missing.
import ROOT
ROOT.EnableImplicitMT(1)  # same result with 2, 3, or 0
df = ROOT.RDataFrame(100)
element_type_list = [
"bool",
"char",
"unsigned char",
"short",
"unsigned short",
"int",
"unsigned int",
"long",
"unsigned long",
"long long",
"unsigned long long",
"float",
"double",
]
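# Define one RVec column per element type, e.g. rvec_unsigned_char.
for element_type in element_type_list:
    name = "rvec_" + element_type.replace(" ", "_")
    expr = f"RVec<{element_type}>(5, 0)"
    df = df.Define(name, expr)
df.Snapshot("test", "/tmp/test_multi.root")
# Print the name of every TStreamerInfo stored in the output file.
root_file = ROOT.TFile("/tmp/test_multi.root")
for each in root_file.GetStreamerInfoList():
    print(each.GetName())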
TNamed
TObject
TList
TSeqCollection
TCollection
TTree
TAttLine
TAttFill
TAttMarker
ROOT::TIOFeatures
TBranchElement
TBranch
TLeafElement
TLeaf
TString
TBranchRef
TRefTable
TObjArray
TArrayD
TArray
TArrayI
listOfRules
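To make the difference between the two listings explicit, here is a minimal sketch in plain Python (no ROOT needed) that checks a list of streamer names against the names I would expect for each element type. The helper names `expected_streamer_name` and `missing_rvec_streamers` are hypothetical and only for illustration. The expected spellings (`Long64_t`, `ULong64_t`, and `vector<bool>` without an allocator) are simply copied from the single-threaded output above, so they are an assumption for any other ROOT version.

```python
# Element types used in the reproducer above.
ELEMENT_TYPES = [
    "bool", "char", "unsigned char", "short", "unsigned short",
    "int", "unsigned int", "long", "unsigned long",
    "long long", "unsigned long long", "float", "double",
]

# The single-threaded listing spells the 64-bit types with ROOT typedefs.
ALIASES = {"long long": "Long64_t", "unsigned long long": "ULong64_t"}


def expected_streamer_name(element_type):
    """Return the streamer name as it appears in the single-threaded listing."""
    name = ALIASES.get(element_type, element_type)
    if name == "bool":
        # vector<bool> is listed without the RAdoptAllocator argument.
        return "vector<bool>"
    return f"vector<{name},ROOT::Detail::VecOps::RAdoptAllocator<{name}> >"


def missing_rvec_streamers(streamer_names, element_types=ELEMENT_TYPES):
    """List the expected RVec streamer names that are absent from streamer_names."""
    present = set(streamer_names)
    return [
        expected_streamer_name(t)
        for t in element_types
        if expected_streamer_name(t) not in present
    ]


# Names printed by the multi-threaded run above.
multi_threaded = [
    "TNamed", "TObject", "TList", "TSeqCollection", "TCollection",
    "TTree", "TAttLine", "TAttFill", "TAttMarker", "ROOT::TIOFeatures",
    "TBranchElement", "TBranch", "TLeafElement", "TLeaf", "TString",
    "TBranchRef", "TRefTable", "TObjArray", "TArrayD", "TArray", "TArrayI",
    "listOfRules",
]

missing = missing_rvec_streamers(multi_threaded)
print(len(missing), "of", len(ELEMENT_TYPES), "RVec streamers missing")
```

For the multi-threaded listing this reports all 13 vector streamers as missing. With a real file, the list of names could be collected the same way as in the snippets above, by calling GetName() on each entry of root_file.GetStreamerInfoList().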
Cheers,
Seungjin
ROOT Version: v6.22.02
Platform: CentOS Linux release 7.8.2003 (Core)
Compiler: gcc (GCC) 4.8.5
Python: Python 3.6.8