I have two groups of TTrees that I would like to read together using RDataFrame. For each entry in the first tree there is a corresponding entry in the second, and the variable linking them is called “eventNumber”. I have tried the code below, but it doesn’t seem to work:
import ROOT

# inFiles, tree_reco and tree_truth are defined earlier in the script.
inFile = ROOT.TFile(inFiles[0], "READ")
if not tree_reco:
    tree_reco = inFile.Get("nominal")
if not tree_truth:
    tree_truth = inFile.Get("truth")
# Index the truth tree so that friend entries are looked up by
# (runNumber, eventNumber) rather than by entry position.
tree_truth.BuildIndex("runNumber", "eventNumber")
tree_reco.AddFriend(tree_truth)
rdf = ROOT.RDataFrame(tree_reco)
rdf = rdf.Define("jet_assocs", "associateObjectsToTruth(jet_pt,jet_eta,jet_phi,jet_e,t_wdecay1,t_wdecay2,tbar_wdecay1,tbar_wdecay2,b_from_t,b_from_tbar,eventNumber)")
rdf.Snapshot("nominal", "/lustre/fs22/group/atlas/alopezso/Spanet/output.root")
In the snapshot, the eventNumber branch of the friend tree is saved under the name “truth_eventNumber”. I checked whether a specific entry in the output tree is correctly paired with its entry from the friend tree, but that doesn’t seem to be the case (see below). Do you have any feedback on this issue? Note that the “truth” tree has more entries than the first tree.
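To make the expected pairing concrete, here is a minimal conceptual sketch in plain Python (no ROOT involved). It is not how ROOT implements indexed friends; the lists of dicts, the event numbers, and the helper names `pair_by_position`, `pair_by_index` and `mismatches` are all made up for illustration. It only shows why pairing by entry position breaks when the friend tree has more entries, while pairing by a (runNumber, eventNumber) key does not.

```python
# Conceptual sketch only: plain Python stand-ins for the two trees.
# Each "entry" is a dict; all values are invented for illustration.
reco = [
    {"runNumber": 1, "eventNumber": 11},
    {"runNumber": 1, "eventNumber": 13},
    {"runNumber": 1, "eventNumber": 14},
]
truth = [  # has more entries than reco, as in the question
    {"runNumber": 1, "eventNumber": 10},
    {"runNumber": 1, "eventNumber": 11},
    {"runNumber": 1, "eventNumber": 12},
    {"runNumber": 1, "eventNumber": 13},
    {"runNumber": 1, "eventNumber": 14},
]


def pair_by_position(main, friend):
    """Pair entry i of the main list with entry i of the friend list."""
    return [(m, friend[i]) for i, m in enumerate(main)]


def pair_by_index(main, friend):
    """Pair entries that share the same (runNumber, eventNumber) key."""
    lookup = {(f["runNumber"], f["eventNumber"]): f for f in friend}
    return [(m, lookup.get((m["runNumber"], m["eventNumber"]))) for m in main]


def mismatches(pairs):
    """Return main-list event numbers whose paired friend entry differs."""
    return [
        m["eventNumber"]
        for m, f in pairs
        if f is None or f["eventNumber"] != m["eventNumber"]
    ]


positional_bad = mismatches(pair_by_position(reco, truth))
indexed_bad = mismatches(pair_by_index(reco, truth))
print("mismatched with positional pairing:", positional_bad)
print("mismatched with indexed pairing:", indexed_bad)
```

The same kind of check can be applied to the snapshot output by comparing eventNumber with truth_eventNumber entry by entry: any entry where the two differ was paired incorrectly.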
I found out that it works fine when I don’t enable multi-threading (i.e. when I remove the line ROOT.ROOT.EnableImplicitMT()). Is this expected behaviour and, if so, is there a way to run on indexed friend trees with multi-threading?
Sorry for my late reply. Actually, I was seeing this problem with ROOT v6.28.00. I have just moved to ROOT v6.28.04 and the problem seems to be solved!
Actually, it works well when I use TTrees as input to the RDataFrame. However, I am now trying to use TChains and, although it seems to work, I am running into issues like the one below. Have you seen this error before?
#7 0x00007f813af2b314 in TFree::GetBestFree(TList*, int) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#8 0x00007f813af6a5d4 in TKey::Create(int, TFile*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#9 0x00007f80a0b41dc2 in TBasket::CopyTo(TFile*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libTree.so
#10 0x00007f80a0bb72a2 in TTreeCloner::WriteBaskets() () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libTree.so
#11 0x00007f80a0bb8bf0 in TTreeCloner::Exec() () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libTree.so
#12 0x00007f80a0bcc298 in TTree::CopyEntries(TTree*, long long, char const*, bool) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libTree.so
#13 0x00007f80a0bcbced in TTree::Merge(TCollection*, TFileMergeInfo*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libTree.so
#15 0x00007f813af2a923 in TFileMerger::MergeRecursive(TDirectory*, TList*, int) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#16 0x00007f813af29e7f in TFileMerger::PartialMerge(int) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#17 0x00007f813af143d6 in ROOT::TBufferMerger::MergeImpl() () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#18 0x00007f813af14a65 in ROOT::TBufferMerger::TryMerge(ROOT::TBufferMergerFile*) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
#19 0x00007f813af15220 in ROOT::TBufferMergerFile::Write(char const*, int, int) () from /cvmfs/sft.cern.ch/lcg/views/LCG_104/x86_64-centos7-gcc12-opt/lib/libRIO.so
That looks like a memory corruption issue. The most likely cause is a bug in the analysis code that, for example, writes to memory it should not write to (beyond the last element of an array, for instance), eventually causing problems in more or less random places.
This is just a guess, of course, based on the fact that a crash like that deep in the belly of ROOT I/O simply should not happen under normal conditions. It could also be that you are hitting a real bug we haven’t seen before; that is just less likely.
One thing you can try is to run the program (or a stripped-down version of it that still reproduces the crash) under valgrind, which will detect some classes of invalid memory accesses.
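As a rough illustration of the valgrind suggestion, the sketch below only assembles a command line; it does not run anything. The script name `analysis.py` and the helper `valgrind_command` are hypothetical. It assumes that the ROOT installation ships suppression files named `valgrind-root.supp` and `valgrind-root-python.supp` under `$ROOTSYS/etc`; that is an assumption about your installation, so each file is only added if it is actually found.

```python
import os
import shlex
import sys


def valgrind_command(script, rootsys=None):
    """Build a valgrind (memcheck) command line for a Python script.

    `script` is a placeholder for your own analysis script.
    """
    rootsys = rootsys or os.environ.get("ROOTSYS", "")
    cmd = [
        "valgrind",
        "--tool=memcheck",
        "--track-origins=yes",  # report where uninitialised values came from
        "--num-callers=30",     # deeper stack traces in the reports
    ]
    # Assumed suppression file names; skipped silently if they are absent.
    for name in ("valgrind-root.supp", "valgrind-root-python.supp"):
        supp = os.path.join(rootsys, "etc", name)
        if os.path.isfile(supp):
            cmd.append("--suppressions=" + supp)
    cmd += [sys.executable, script]
    return cmd


cmd = valgrind_command("analysis.py")  # hypothetical script name
print(shlex.join(cmd))
```

The printed command can be pasted into a shell. Running under valgrind is much slower than a normal run, so a small input sample that still reproduces the crash is preferable.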
If that fails we’ll need a self-contained reproducer, as stripped down as possible, so that we can take a look on our side.