Segmentation fault in case of TRef and TRefArray circular-kind-of references


ROOT Version: 6.22/00 Built for linuxx8664gcc
Platform: Ubuntu 20.04 5.4.0-42-generic #46-Ubuntu
Compiler: g++ (Ubuntu 9.3.0-10ubuntu2) 9.3.0


I think I have a problem with TRefArray but first of all, let me ask a couple of questions.

1. Is it OK to store an object containing TRefArray and TRef into a TRefArray? In other words, can an element of a TRefArray point to an object containing other TRefArray and TRef? What if the contained object TRef points to the container object?
2. Can those objects reside in different branches?
3. What if I delete an object from TClonesArray?

READ NEXT POST FIRST!

If you want to know why I am here asking these questions, below are the details of my problem. If you do not have time or find it boring you do not need to read it all.

Let me first describe our program structure. It is a Geant4 simulation whose output is a ROOT TTree containing TObjects with a structure very similar to the ROOT Event.cxx example. I mean there are hits, tracks, vertices, etc…

In particular, there are two classes describing a track and a cluster of hits. A track is a higher-level object describing a whole particle track. A cluster is just a bunch of hits (for example, the output of a track seeding algorithm). A track can be made of many clusters. Both tracks and clusters are made of many hits.

With this preamble maybe it is clear why I am using the following class structure.

Both the Track class and Cluster class inherit from a virtual class called HitsSet. The HitsSet class contains a TRefArray pointing to the hits objects.

The Track class contains a TRefArray pointing to the cluster objects. The Cluster class contains a TRef pointing to the parent Track (remember every track is made of many Clusters).

All is fine until I try to TTree::Fill the output TTree with my objects. Inside the TTree::Fill method, I randomly get a segmentation fault every 1000 events or so. The segfault error is at the bottom of the post.

I get the segmentation fault only when I delete a Cluster or a Track from its TClonesArray. The clusters and the tracks are stored in two separate TClonesArrays.

To make it clearer, this is the class structure that I have

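#include <TClonesArray.h>
#include <TObject.h>
#include <TRef.h>
#include <TRefArray.h>

// Common base: any object made of hits
class HitsSet : public TObject {
protected:
  TRefArray hits_;      // references to the hit objects
};

class Track : public HitsSet {
protected:
  TRefArray clusters_;  // references to the clusters forming this track
};

class Cluster : public HitsSet {
protected:
  TRef parent_track_;   // back-reference to the parent Track
};

class Event : public TObject {
  TClonesArray hits_;
  TClonesArray tracks_;
  TClonesArray clusters_;
};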

Things that I do:

  • I only construct the TClonesArray at the beginning of the run
  • I add the objects to the TClonesArray with placement new, like this: new (tracks_[num_tracks_++]) Track();
  • I clear the TRefArray, TRef and TClonesArray with the Clear("C") method every event.
  • I add the objects to the TRefArray like this: clusters_.Add((TObject *) p_cluster);
  • I add the objects to the TRef like this: parent_track_ = (TObject *) p_track;
  • I reset TProcessID::fgNumber every event (the crash occurs whether or not I do this)
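The TClonesArray line in the list above relies on placement new: the object is constructed inside storage that the container already owns, so the memory can be reused from one event to the next. Below is a minimal sketch of that idiom in standard C++. SketchTrack and SketchPool are hypothetical stand-ins invented for this illustration; they are not ROOT classes and do not reproduce how TClonesArray is implemented.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a track; not the ROOT-based Track from the post.
struct SketchTrack {
  int id;
  explicit SketchTrack(int i) : id(i) {}
};

// Hypothetical fixed-capacity pool that owns raw storage and constructs
// objects in place, loosely analogous to reusing slots between events.
class SketchPool {
public:
  SketchPool() = default;
  SketchPool(const SketchPool&) = delete;
  SketchPool& operator=(const SketchPool&) = delete;
  ~SketchPool() { Clear(); }

  // Construct a SketchTrack in the next free slot and return a pointer to it.
  SketchTrack* Add(int id) {
    if (size_ == kCapacity) return nullptr;  // pool is full
    void* slot = storage_ + size_ * sizeof(SketchTrack);
    // Placement new: "new (address) Type(args)", with no "=" and no temporary.
    items_[size_] = new (slot) SketchTrack(id);
    return items_[size_++];
  }

  // Destroy the objects but keep the storage for the next event.
  void Clear() {
    for (std::size_t i = 0; i < size_; ++i) items_[i]->~SketchTrack();
    size_ = 0;
  }

  std::size_t Size() const { return size_; }

private:
  static const std::size_t kCapacity = 8;
  alignas(SketchTrack) unsigned char storage_[kCapacity * sizeof(SketchTrack)];
  SketchTrack* items_[kCapacity];  // pointers returned by placement new
  std::size_t size_ = 0;
};
```

After Clear(), the next Add() constructs the new object at the same address as the first object of the previous event, which is the memory-reuse behaviour the per-event Clear relies on.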

I could write a sample program but I would like someone to answer my question first to avoid wasting too much time.

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f3a2a4e4c2a in __GI___wait4 (pid=26968, stat_loc=stat_loc@entry=0x7ffe590a7668, options=options@entry=0, usage=usage@entry=0x0) at ../sysdeps/unix/sysv/linux/wait4.c:27
#1  0x00007f3a2a4e4beb in __GI___waitpid (pid=<optimized out>, stat_loc=stat_loc@entry=0x7ffe590a7668, options=options@entry=0) at waitpid.c:38
#2  0x00007f3a2a4540e7 in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:172
#3  0x00007f3a3124a73e in TUnixSystem::StackTrace() () from /home/neo/Programs/ROOT/v6-22-00/lib/libCore.so
#4  0x00007f3a312475c5 in TUnixSystem::DispatchSignals(ESignals) () from /home/neo/Programs/ROOT/v6-22-00/lib/libCore.so
#5  <signal handler called>
#6  0x00007f3a30eada00 in int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#7  0x00007f3a30d3ed85 in TStreamerInfoActions::VectorPtrLooper::GenericWrite(TBuffer&, void*, void const*, TStreamerInfoActions::TConfiguration const*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#8  0x00007f3a30c351f8 in TBufferFile::ApplySequenceVecPtr(TStreamerInfoActions::TActionSequence const&, void*, void*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#9  0x00007f3a30a748ea in TBranch::FillImpl(ROOT::Internal::TBranchIMTHelper*) [clone .part.0] () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#10 0x00007f3a30a828ab in TBranchElement::FillImpl(ROOT::Internal::TBranchIMTHelper*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#11 0x00007f3a30a825be in TBranchElement::FillImpl(ROOT::Internal::TBranchIMTHelper*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#12 0x00007f3a30aefd46 in TTree::Fill() () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#13 0x0000561ddf2b3191 in B2EventAction::EndOfEventAction (this=0x561de4531b10, p_event=0x561de2ff1c60) at /home/neo/Code/WAGASCI/WagasciMC/src/B2EventAction.cc:435
#14 0x00007f3a2f783213 in G4EventManager::DoProcessing (this=0x561de1304060, anEvent=<optimized out>) at /home/neo/Programs/geant4/source/event/src/G4EventManager.cc:264
#15 0x00007f3a2f829aeb in G4RunManager::ProcessOneEvent (this=0x561de13038d0, i_event=1208) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:414
#16 0x00007f3a2f8280a4 in G4RunManager::DoEventLoop (this=0x561de13038d0, n_event=13817, macroFile=<optimized out>, n_select=<optimized out>) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:381
#17 0x00007f3a2f827de2 in G4RunManager::BeamOn (this=0x561de13038d0, n_event=13817, macroFile=0x0, n_select=-1) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:276
#18 0x0000561ddf2931d7 in main (argc=7, argv=0x7ffe590f3808) at /home/neo/Code/WAGASCI/WagasciMC/src/B2MC.cc:125
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00007f3a30eada00 in int TStreamerInfo::WriteBufferAux<char**>(TBuffer&, char** const&, TStreamerInfo::TCompInfo* const*, int, int, int, int, int) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#7  0x00007f3a30d3ed85 in TStreamerInfoActions::VectorPtrLooper::GenericWrite(TBuffer&, void*, void const*, TStreamerInfoActions::TConfiguration const*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#8  0x00007f3a30c351f8 in TBufferFile::ApplySequenceVecPtr(TStreamerInfoActions::TActionSequence const&, void*, void*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libRIO.so
#9  0x00007f3a30a748ea in TBranch::FillImpl(ROOT::Internal::TBranchIMTHelper*) [clone .part.0] () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#10 0x00007f3a30a828ab in TBranchElement::FillImpl(ROOT::Internal::TBranchIMTHelper*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#11 0x00007f3a30a825be in TBranchElement::FillImpl(ROOT::Internal::TBranchIMTHelper*) () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#12 0x00007f3a30aefd46 in TTree::Fill() () from /home/neo/Programs/ROOT/v6-22-00/lib/libTree.so
#13 0x0000561ddf2b3191 in B2EventAction::EndOfEventAction (this=0x561de4531b10, p_event=0x561de2ff1c60) at /home/neo/Code/WAGASCI/WagasciMC/src/B2EventAction.cc:435
#14 0x00007f3a2f783213 in G4EventManager::DoProcessing (this=0x561de1304060, anEvent=<optimized out>) at /home/neo/Programs/geant4/source/event/src/G4EventManager.cc:264
#15 0x00007f3a2f829aeb in G4RunManager::ProcessOneEvent (this=0x561de13038d0, i_event=1208) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:414
#16 0x00007f3a2f8280a4 in G4RunManager::DoEventLoop (this=0x561de13038d0, n_event=13817, macroFile=<optimized out>, n_select=<optimized out>) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:381
#17 0x00007f3a2f827de2 in G4RunManager::BeamOn (this=0x561de13038d0, n_event=13817, macroFile=0x0, n_select=-1) at /home/neo/Programs/geant4/source/run/src/G4RunManager.cc:276
#18 0x0000561ddf2931d7 in main (argc=7, argv=0x7ffe590f3808) at /home/neo/Code/WAGASCI/WagasciMC/src/B2MC.cc:125
===========================================================

I realized that what triggers the segfault is not adding a reference but deleting one.
Later in the code, I remove a track from the list of tracks, and that is what triggers the error during TTree::Fill.

So my question is: how to safely remove an object from a TClonesArray with all its references?

Now I am using the TClonesArray::Remove(TObject *) method.
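To make the bookkeeping behind this question concrete, here is a minimal sketch in standard C++ of removing a track while keeping every back-reference consistent and leaving no gap in the container. EventModel, TrackModel, ClusterModel and RemoveTrack are hypothetical names invented for this illustration, and references are modelled as plain indices rather than TRef. It is not ROOT code, and whether the equivalent steps in ROOT (for example, closing the gaps with TClonesArray::Compress() after a removal) avoid the crash is not established in this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins, not ROOT classes. A reference is modelled as an
// index into the track list; -1 means "no parent".
struct ClusterModel {
  int parent_track = -1;
};
struct TrackModel {
  int id = 0;
};
struct EventModel {
  std::vector<TrackModel> tracks;
  std::vector<ClusterModel> clusters;
};

// Remove one track, detach the clusters that referred to it, and renumber
// the remaining back-references so that none points at a missing slot.
inline void RemoveTrack(EventModel& ev, int track_index) {
  if (track_index < 0 ||
      static_cast<std::size_t>(track_index) >= ev.tracks.size()) {
    return;  // nothing to remove
  }
  for (ClusterModel& c : ev.clusters) {
    if (c.parent_track == track_index) {
      c.parent_track = -1;  // detach: the parent no longer exists
    } else if (c.parent_track > track_index) {
      --c.parent_track;     // later tracks move down by one slot
    }
  }
  // Erase the track itself; the vector stays contiguous, with no hole.
  ev.tracks.erase(ev.tracks.begin() + track_index);
}
```

The point of the sketch is the order of operations: references are fixed up first, and the container is left without holes before anything iterates over it.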

Hi @LastStarDust,

@pcanal will surely be able to help here on Monday. To save him some time, he should read

  1. the questions in first post
  2. the second post
  3. the rest of first post.

In the meantime, I am working around the problem by not removing any object from the TClonesArray and just flagging the unwanted ones as bad (via a new flag added to the object members). This is not ideal because disk space is wasted on recording tracks that I do not need.
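The flag-instead-of-remove workaround can be sketched as follows in standard C++. FlaggedTrack, FlagAsBad and GoodTrackIds are hypothetical names used only for this illustration, and std::vector stands in for the TClonesArray. The container layout never changes, so anything that refers to a slot stays valid, and the reader skips flagged entries.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a stored track; `is_bad` plays the role of the
// extra flag member mentioned in the post.
struct FlaggedTrack {
  int id;
  bool is_bad;
};

// Mark a track as rejected without changing the container layout, so every
// index (and anything that refers to it) remains valid.
inline void FlagAsBad(std::vector<FlaggedTrack>& tracks, std::size_t index) {
  tracks.at(index).is_bad = true;  // at() throws if the index is out of range
}

// Reader side: collect the ids of the tracks that were not flagged.
inline std::vector<int> GoodTrackIds(const std::vector<FlaggedTrack>& tracks) {
  std::vector<int> ids;
  for (const FlaggedTrack& t : tracks) {
    if (!t.is_bad) ids.push_back(t.id);
  }
  return ids;
}
```

The cost is the one already noted: flagged entries are still written out, so they take up space in the output.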

If you find the question too unclear let me know. I will try to clarify it.

1. Yes.

2. Yes, with some caveats about how to load them (see the documentation for TTree::BranchRef).

3. I would expect RemoveAt to work … Can you provide a reproducer? Thanks.

I have a minimal example of what I am trying to achieve but I was not able to reproduce the segmentation fault. I will keep trying to make the example more similar to the actual code.


Thank you for reopening this thread.
@pcanal I was able to reliably reproduce this issue. I am attaching a minimal reproducer. You just have to run the compile.sh script and then you should see the segfault message. test.zip (2.9 KB)

I suspect that this thread has become too confused, so I am opening a new thread to better frame the issue.
