Are TObject::Copy and/or TObject::Clone thread-safe?

Hello,
In a multi-threaded application I need to create copies of histograms (and TGraphs). Are the ::Copy(…) and/or ::Clone(…) methods thread-safe?

Regards,
Andrea Dotti

Hi Andrea,

Copy and Clone should be thread-safe.

Rene

Hello

I think there is one exception, which I have noticed.
Calling the Clone function for the first time is not thread-safe.
Here is a stack trace of the crash that happened in
my application when it called Clone from two
concurrent threads.

This is the crashed thread:
Program received signal SIGSEGV, Segmentation fault.
#0 0x01f4886d in G__memfunc_para_setup ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#1 0x01f493a1 in G__parse_parameter_link ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#2 0x01f4fa1f in G__memfunc_setup_imp ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#3 0x01f5035a in G__memfunc_setup ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#4 0x07a45fe4 in G__setup_memfuncTBufferFile ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libRIO.so
#5 0x01f4f703 in G__incsetup_memfunc ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#6 0x01f080fc in G__getfunction ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#7 0x01f372ee in G__new_operator ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#8 0x01f68492 in G__exec_statement ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#9 0x01eda52c in G__exec_tempfile_core ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#10 0x01edb867 in G__exec_tempfile_fp ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#11 0x01f7495c in G__process_cmd ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#12 0x04dc8563 in TCint::ProcessLine ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#13 0x04d0123e in TApplication::ProcessLine ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#14 0x04d42b4a in TROOT::ProcessLine ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#15 0x04d146eb in TDirectory::CloneObject ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#16 0x04d2750a in TObject::Clone ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#17 0x04d2692d in TNamed::Clone ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#18 0x018fa1c3 in dqm_algorithms::BinThreshold::execute ()

and this is another thread which was calling Clone at the same time:
(gdb) thread 5
[Switching to thread 5 (Thread -1240163424 (LWP 958))]#0 0x01ef9660 in G__fgetc ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
(gdb) bt
#0 0x01ef9660 in G__fgetc () from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#1 0x01f61699 in G__exec_statement ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#2 0x01f3063c in G__loadfile ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#3 0x01f76389 in G__process_cmd ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCint.so
#4 0x04dc8563 in TCint::ProcessLine ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#5 0x04dc86e4 in TCint::ProcessLineSynch ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#6 0x04d02f1e in TApplication::ExecuteFile ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#7 0x04dcbe05 in TCint::ExecuteMacro ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#8 0x04d45c13 in TROOT::Macro ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#9 0x04d2ad8a in TPluginManager::LoadHandlerMacros ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#10 0x04d2b16d in TPluginManager::LoadHandlersFromPluginDirs ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#11 0x04d2b9d5 in TPluginManager::FindHandler ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#12 0x04d5a85f in TSystem::FindHelper ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#13 0x04ddd180 in TUnixSystem::OpenDirectory ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#14 0x04dcc4db in TCint::LoadLibraryMap ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#15 0x04dc7a40 in TCint::EnableAutoLoading ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#16 0x04cffde1 in TApplication::TApplication ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#17 0x04d016d7 in TApplication::CreateApplication ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#18 0x04d42b28 in TROOT::ProcessLine ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#19 0x04d146eb in TDirectory::CloneObject ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#20 0x04d2750a in TObject::Clone ()
from /afs/cern.ch/atlas/project/tdaq/cmt/dqm-common/dqm-common-00-08-02/installed/…/external/i686-slc4-gcc34-dbg/lib/libCore.so
#21 0x04d2692d in TNamed::Clone ()

Sergey,

You are right. The dictionaries for all classes involved in the Clone operation must be available before cloning; otherwise the creation of the in-memory dictionaries is not thread-safe.

Rene
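
The point about first use can be illustrated outside ROOT. What follows is a minimal sketch in standard C++, not ROOT code: `Registry`, `tables` and `concurrent_first_use` are hypothetical names, and the `std::map` merely stands in for lazily built dictionary tables. It shows one generic way to make a lazy first-use setup safe, namely running it exactly once under `std::call_once`. The alternative described in this thread is to trigger the setup from the main thread before any worker thread starts.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for lazily built per-class metadata.
class Registry {
public:
    // Safe to call from any thread: the setup runs exactly once,
    // and every other caller waits until it has finished.
    const std::map<std::string, int>& tables() {
        std::call_once(once_, [this] { build(); });
        return tables_;
    }
    int build_count() const { return build_count_; }

private:
    void build() {
        ++build_count_;
        tables_["TH1F"] = 1;   // placeholder entries, not real dictionary data
        tables_["TAxis"] = 2;
    }
    std::once_flag once_;
    std::map<std::string, int> tables_;
    int build_count_ = 0;
};

// Starts n threads that all hit the lazy setup at the same time and
// returns how many times the setup actually ran.
int concurrent_first_use(int n) {
    assert(n > 0);
    Registry reg;
    std::vector<std::thread> workers;
    for (int i = 0; i < n; ++i)
        workers.emplace_back([&reg] { (void)reg.tables().size(); });
    for (auto& w : workers) w.join();
    return reg.build_count();
}
```

Calling `concurrent_first_use(8)` should report a single build regardless of how the threads interleave. Without the `std::call_once` guard, two threads could both enter `build()` and modify the map concurrently.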

Thank you Sergei and Rene for your investigations.
I would like to add here my contributions.
I made some tests and I think I can conclude the following:

1- TObject::Clone() is not thread safe
2- TObject::Copy() is thread safe only if TH1::AddDirectory(kFALSE) is used
3- A delete of a TH1 is not thread-safe

To demonstrate this I prepared a simple function that is called in a TThread (the code can be found
in the attached test.tar archive). Four programs are available:

  • testClone starts TThreads that attempt to clone a histogram. Please note that,
    to avoid the problem described by Sergei, an initial Clone is performed in the main() function. This
    program crashes with the attached stack trace: testClone.bt
  • testCopyBad starts TThreads that attempt to make a Copy of a histogram. The application crashes with
    the stack trace included in testCopyBad.bt
  • testCopy is a modified version that does not crash in my tests. The statement
    TH1::AddDirectory(kFALSE) has been added to the main() function. I suspect this addition avoids
    access to shared, unprotected resources (gDirectory?).
  • testDelete starts TThreads that attempt to new and then delete a TH1F. The situation here is a bit
    trickier, since I get crashes with different stack traces; two examples are shown in
    testDelete.bt. I suspect the problem lies in the handling of TString, which represents histogram
    names, axis names, etc. Also note that I get crashes only with a very large number of threads
    (a command-line option, e.g. testDelete 100).

For our concrete problem we will therefore use TObject::Copy; however, a solution for a
thread-safe delete would be highly appreciated. Can you suggest anything?

Thank you very much,
Andrea
testDelete…bt.txt (2.28 KB)
testCopyBad.bt.txt (899 Bytes)
testClone.bt.txt (2.54 KB)
tests.tar (10 KB)
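
The suspicion that TH1::AddDirectory(kFALSE) helps by avoiding a shared, unprotected resource can be modelled without ROOT. The sketch below is an assumption-laden stand-in, not ROOT's implementation: `GlobalList`, `Hist`, `g_auto_register` and `stress_copies` are hypothetical names. It shows the two generic remedies for an object that registers itself in a global list on construction: switch the registration off so no shared state is touched, or protect the list with a mutex.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <set>
#include <thread>
#include <vector>

// Hypothetical stand-in for a global object list such as a current directory.
class GlobalList {
public:
    void add(const void* p)    { std::lock_guard<std::mutex> lock(m_); items_.insert(p); }
    void remove(const void* p) { std::lock_guard<std::mutex> lock(m_); items_.erase(p); }
    std::size_t size()         { std::lock_guard<std::mutex> lock(m_); return items_.size(); }
private:
    std::mutex m_;                 // without this, concurrent add/remove would race
    std::set<const void*> items_;
};

GlobalList g_list;
std::atomic<bool> g_auto_register{true};  // models an AddDirectory-style switch

// Hypothetical histogram-like object that may register itself on construction.
struct Hist {
    Hist() : registered_(g_auto_register.load()) { if (registered_) g_list.add(this); }
    Hist(const Hist&) : Hist() {}          // a copy registers like a new object
    ~Hist() { if (registered_) g_list.remove(this); }
    bool registered_;
};

// Copies one source object from several threads at once and returns the
// total number of copies made.
long stress_copies(bool auto_register, int n_threads, int copies_per_thread) {
    assert(n_threads > 0 && copies_per_thread > 0);
    g_auto_register.store(auto_register);
    Hist source;  // registers itself only if the switch is on
    std::atomic<long> made{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&source, &made, copies_per_thread] {
            for (int i = 0; i < copies_per_thread; ++i) {
                Hist copy(source);  // touches g_list only when registration is on
                made.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) w.join();
    return made.load();
}
```

With `auto_register` set to false the worker threads never touch `g_list`, which mirrors the behaviour reported for testCopy. With it set to true the copies stay safe only because `GlobalList` locks internally.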

Andrea,

TObject::Clone should now be thread-safe in the SVN trunk.
I do not see any problem with TObject::Delete.
Making TH1::Copy thread-safe when TH1::AddDirectory(kTRUE) is in effect will take some more time.

Rene

Rene,

I would like to give an update on this issue. Unfortunately I have the impression that almost all ROOT classes are mutually non-thread-safe because of the gNullRef global variable used by the TString class. I would be happy to be wrong, but my current feeling is that the constructor of any ROOT class which has an attribute of TString type is potentially unsafe.
For example, any histogram object has at least one attribute of the TAxis class, which in turn has a TString attribute called fTimeFormat, which is always initialized to gNullRef in the constructor. This means that if several threads are creating new histograms concurrently, the reference count of the gNullRef variable can be corrupted, which sooner or later will lead to an attempt to delete this object, which crashes the application.
I think that is what is happening now with my application, which is dying with the stack traces I have put below. It’s interesting to note that there is a common pattern among all the stacks: they always originate from the destructor of a ROOT class which has a TString attribute, and the actual crash always happens in the TString destructor.

#2 0x00314691 in abort () from /lib/libc.so.6
#3 0x0034b24b in __libc_message () from /lib/libc.so.6
#4 0x003530f1 in _int_free () from /lib/libc.so.6
#5 0x00356bc0 in free () from /lib/libc.so.6
#6 0x00930871 in operator delete () from /usr/lib/libstdc++.so.6
#7 0x009308cd in operator delete[] () from /usr/lib/libstdc++.so.6
#8 0xf722b5ef in TString::~TString () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#9 0xf71fc19f in TEnvRec::~TEnvRec () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#10 0xf726a450 in TCollection::GarbageCollect () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#11 0xf726e845 in TList::Delete () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#12 0xf726c4bc in THashList::Delete () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#13 0xf71faadc in TEnv::~TEnv () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#14 0xf72aaaa9 in TCint::~TCint () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#15 0xf722325e in TROOT::~TROOT () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#16 0xf72232fb in __tcf_0 () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so

#2 0x00314691 in abort () from /lib/libc.so.6
#3 0x0034b24b in __libc_message () from /lib/libc.so.6
#4 0x003530f1 in _int_free () from /lib/libc.so.6
#5 0x00356bc0 in free () from /lib/libc.so.6
#6 0x00930871 in operator delete () from /usr/lib/libstdc++.so.6
#7 0x009308cd in operator delete[] () from /usr/lib/libstdc++.so.6
#8 0xf72825ef in TString::~TString () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#9 0xf724d9aa in TDirectory::~TDirectory () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#10 0xf7279f39 in TROOT::~TROOT () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so

#7 0x009308cd in operator delete[] () from /usr/lib/libstdc++.so.6
#8 0xf72935ef in TString::~TString () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#9 0xf5f0d02c in TAxis::~TAxis () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#10 0xf5f6c488 in TH1::~TH1 () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#11 0xf5f8c515 in TH2::~TH2 () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#12 0xf5f97bde in TH2D::~TH2D () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#13 0xf5fd8952 in TProfile2D::~TProfile2D () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so

#8 0xf72275ef in TString::~TString () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#9 0xf726a5e2 in TList::~TList () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#10 0xf72689fd in THashTable::Clear () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#11 0xf72684af in THashList::Delete () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libCore.so
#12 0xf5ea0fde in TAxis::~TAxis () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#13 0xf5f00488 in TH1::~TH1 () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so
#14 0xf5f158e6 in TH1F::~TH1F () from /sw/atlas/sw/lcg/app/releases/ROOT/5.22.00d/slc4_ia32_gcc34/root/lib/libHist.so

Dear Sergei,

Your analysis is likely correct. We are going to investigate and put a mutex on this variable if this is really the case. We will let you know.

Rene

Hi,

I am aware of this issue. The problem is that TRefCnt is not thread-safe, and I’ll fix this asap. I’ll implement it using a TAtomicCount, as on most machines that is much more efficient than a plain mutex.

Cheers, Fons.
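
The idea of an atomic reference count can be sketched in standard C++. This is only an illustration under stated assumptions: `SharedRep` and `hammer` are hypothetical names, and `std::atomic<int>` is used here as a stand-in for TAtomicCount, whose actual interface is not shown in this thread. The sketch models a shared representation (such as the null string every empty TString points to) whose count is incremented and decremented from many threads.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared string representation with a reference count.
class SharedRep {
public:
    void add_ref() { refs_.fetch_add(1, std::memory_order_relaxed); }
    // Returns true when the caller dropped the last reference and must free the object.
    bool release() { return refs_.fetch_sub(1, std::memory_order_acq_rel) == 1; }
    int count() const { return refs_.load(std::memory_order_acquire); }
private:
    std::atomic<int> refs_{1};  // the creator holds the first reference
};

// n_threads threads each take and drop `rounds` references. Returns how many
// times release() wrongly reported "last reference" while the creator still
// holds its own reference.
int hammer(SharedRep& rep, int n_threads, int rounds) {
    assert(n_threads > 0 && rounds > 0);
    std::atomic<int> premature{0};
    std::vector<std::thread> workers;
    for (int t = 0; t < n_threads; ++t) {
        workers.emplace_back([&rep, &premature, rounds] {
            for (int i = 0; i < rounds; ++i) {
                rep.add_ref();
                if (rep.release()) premature.fetch_add(1);
            }
        });
    }
    for (auto& w : workers) w.join();
    return premature.load();
}
```

Because each increment and decrement is a single atomic read-modify-write, no update is lost and the count returns to 1 after all threads finish. With a plain `int` counter, lost updates could drive the count to zero early, which matches the pattern of crashes in the TString destructor reported above.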