Thread safety and TString

Hello,

I have some issues (i.e. crashes) with TString in a multi-threaded environment.

Attached is a multi-threaded example using TStrings.

My work flow is the following:
I create a set of strings in the main thread.
I make copies of those strings in the worker threads and do something with them.
Executing this example always results in a crash…
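
For readers without the attachment, the work flow above can be sketched roughly as follows. This is a stand-in under stated assumptions, not CrashTest.cxx itself: it uses std::string and std::thread instead of TString and TThread, and the names CopyAndMeasure and RunWorkers are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for the attached CrashTest.cxx (not shown in this thread).
// A worker copies every shared string into its own vector and sums the lengths.
std::size_t CopyAndMeasure(const std::vector<std::string>& shared, int rounds) {
    std::size_t total = 0;
    for (int r = 0; r < rounds; ++r) {
        std::vector<std::string> local;
        for (const std::string& s : shared) {
            local.push_back(s);  // copy constructor runs in the worker thread
        }
        for (const std::string& s : local) {
            total += s.size();
        }
    }  // the local copies are destroyed here, also in the worker thread
    return total;
}

// Creates the strings in the calling thread, then lets nWorkers threads
// copy them concurrently. Returns the sum of all per-worker totals.
std::size_t RunWorkers(int nWorkers, int rounds) {
    const std::vector<std::string> shared = {
        "alpha", "beta", "a string long enough to live on the heap"};
    std::vector<std::size_t> results(nWorkers, 0);
    std::vector<std::thread> workers;
    for (int i = 0; i < nWorkers; ++i) {
        // Each worker writes only to its own slot in results.
        workers.emplace_back([&shared, &results, i, rounds] {
            results[i] = CopyAndMeasure(shared, rounds);
        });
    }
    for (std::thread& t : workers) {
        t.join();
    }
    std::size_t sum = 0;
    for (std::size_t r : results) {
        sum += r;
    }
    return sum;
}
```

Calling RunWorkers(4, 10) from main exercises the same pattern: the shared strings are only read, and every copy is created and destroyed inside one worker.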

However, this is only related to TString. The moment I replace TString with std::string everything is fine (uncomment the line #define TString string in the attached example).

I did some debugging with valgrind’s helgrind and the reason seems to be that TString doesn’t make a clean copy of the data, but keeps references/pointers to the original data. And this is of course fatal across thread boundaries…

Example helgrind output:
valgrind --tool=helgrind -v --error-limit=no CrashTest

==7240== Possible data race during read of size 4 at 0x90f51bc by thread #3
==7240== at 0x49F2DBC: TRefCnt::RemoveReference() (TRefCnt.h:43)
==7240== by 0x4A0D154: TStringRef::UnLink() (TString.h:441)
==7240== by 0x4A0874C: TString::~TString() (TString.cxx:396)
==7240== by 0x804B276: CrashTest::DoLoopB() (CrashTest.cxx:171)
==7240== by 0x804B318: StartThreadB(void*) (CrashTest.cxx:206)
==7240== by 0x6B0DC05: TThread::Function(void*) (TThread.cxx:696)
==7240== by 0x402961F: mythread_wrapper (hg_intercepts.c:202)
==7240== by 0x72BD96D: start_thread (pthread_create.c:300)
==7240== by 0x798DA4D: clone (clone.S:130)
==7240== This conflicts with a previous write of size 4 by thread #2
==7240== at 0x804B5EF: TRefCnt::AddReference() (TRefCnt.h:42)
==7240== by 0x804B640: TString::TString(TString const&) (TString.h:226)
==7240== by 0x804BA71: __gnu_cxx::new_allocator<TString>::construct(TString*, TString const&) (new_allocator.h:105)
==7240== by 0x804B8F7: std::vector<TString, std::allocator<TString> >::push_back(TString const&) (stl_vector.h:737)
==7240== by 0x804AFF9: CrashTest::DoLoopA() (CrashTest.cxx:144)
==7240== by 0x804B2F0: StartThreadA(void*) (CrashTest.cxx:198)
==7240== by 0x6B0DC05: TThread::Function(void*) (TThread.cxx:696)
==7240== by 0x402961F: mythread_wrapper (hg_intercepts.c:202)

So… is it intended that one cannot use TStrings across thread boundaries because TString is simply not thread safe? In that case I will probably have to come up with my own string class…

But I doubt that this is intended, because TString is so deeply integrated into ROOT that one will always come across a situation like the one in the example when using ROOT in a multi-threaded environment.

Any advice is appreciated.

Thanks,
Andreas
CrashTest.cxx (5.56 KB)

Just for reference, I use ROOT 5.28.00 on Ubuntu 10.04.

Hi,

the problem is that TString uses a copy on write algorithm. Fixing this is now high on the priority list. Stay tuned.
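
To illustrate why copy-on-write is delicate across threads, here is a toy sketch. CowString and its members are hypothetical and are not ROOT's TString or TStringRef. Every copy touches one shared reference counter, which is the kind of access the helgrind trace above flags in TRefCnt::AddReference() and TRefCnt::RemoveReference(). The sketch uses std::atomic so that the counting itself is well defined; with a plain int counter, concurrent copies and destructions would be a data race.

```cpp
#include <atomic>
#include <string>

// Toy copy-on-write string; hypothetical, not ROOT's implementation.
class CowString {
    struct Rep {
        std::atomic<int> refs{1};  // shared by every copy of the string
        std::string data;
        explicit Rep(const char* s) : data(s) {}
    };
    Rep* rep_;

    // Drops one reference and frees the buffer when the last owner leaves.
    void Release() {
        if (rep_->refs.fetch_sub(1) == 1) {
            delete rep_;
        }
    }

public:
    explicit CowString(const char* s) : rep_(new Rep(s)) {}

    // Copying does not duplicate the characters, it only bumps the counter.
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1);
    }

    CowString& operator=(const CowString& other) {
        if (rep_ != other.rep_) {
            other.rep_->refs.fetch_add(1);
            Release();
            rep_ = other.rep_;
        }
        return *this;
    }

    ~CowString() { Release(); }

    // Writing first detaches from the shared buffer if anyone else holds it.
    void Append(const char* s) {
        if (rep_->refs.load() > 1) {
            Rep* fresh = new Rep(rep_->data.c_str());
            Release();
            rep_ = fresh;
        }
        rep_->data += s;
    }

    const char* Data() const { return rep_->data.c_str(); }
    int RefCount() const { return rep_->refs.load(); }
};
```

After `CowString b(a);` both objects report the same Data() pointer and a RefCount() of 2; only b.Append(...) gives b its own buffer. A copy made in one thread is therefore still tied to the original in another thread through the counter.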

Cheers, Fons.

[quote=“rdm”]Hi,

the problem is that TString uses a copy on write algorithm. Fixing this is now high on the priority list. Stay tuned.

Cheers, Fons.[/quote]

That would be great. We have the same crash problem on Scientific Linux 5.6 x64.

Hi,

this has now been fixed in the trunk. The TString implementation no longer uses reference-counted copy-on-write but a more modern and efficient algorithm using short string optimization (SSO), which stores strings shorter than 14 characters on 64-bit and shorter than 11 characters on 32-bit in the internal TString data structure without malloc. Larger strings are still malloced. Overall this is slightly more efficient than the old implementation, but more importantly it is thread safe. Your CrashTest.cxx example works fine now.
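
As a rough illustration of the idea, here is a toy SSO string. SsoString, kInlineCap and the field layout are assumptions made for this sketch and do not reflect the actual TString data structure or its thresholds; a real implementation would typically overlap the inline buffer and the heap pointer in a union. The point it shows is that every copy owns its characters, either inline or in its own heap block, so no counter is shared between instances.

```cpp
#include <cstddef>
#include <cstring>

// Toy short-string-optimized string; hypothetical layout, not ROOT's TString.
class SsoString {
public:
    // Assumed inline capacity for this sketch only.
    static constexpr std::size_t kInlineCap = 15;

private:
    std::size_t size_;
    char* heap_;                   // nullptr while the string is stored inline
    char inline_[kInlineCap + 1];  // room for kInlineCap chars plus '\0'

    // Copies n characters from s into inline storage or a fresh heap block.
    void Assign(const char* s, std::size_t n) {
        char* heap = (n > kInlineCap) ? new char[n + 1] : nullptr;
        char* dst = heap ? heap : inline_;
        std::memcpy(dst, s, n);
        dst[n] = '\0';
        heap_ = heap;
        size_ = n;
    }

public:
    explicit SsoString(const char* s) { Assign(s, std::strlen(s)); }

    // A copy always duplicates the characters; nothing is shared.
    SsoString(const SsoString& other) { Assign(other.Data(), other.size_); }

    SsoString& operator=(const SsoString& other) {
        if (this != &other) {
            char* old = heap_;
            Assign(other.Data(), other.size_);
            delete[] old;
        }
        return *this;
    }

    ~SsoString() { delete[] heap_; }

    bool IsInline() const { return heap_ == nullptr; }
    const char* Data() const { return heap_ ? heap_ : inline_; }
    std::size_t Size() const { return size_; }
};
```

With this layout, `SsoString s("short");` reports IsInline() as true and never calls the allocator, while a longer string gets its own heap block, and copying it allocates a second, separate block.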

Cheers, Fons.

[quote=“rdm”]
but a more modern and efficient algorithm using short string optimization (SSO) which stores <14 on 64-bit and <11 on 32-bit in the internal TString data structure without malloc. Larger strings are still malloced. Overall this is slightly more efficient than the old implementation, but more importantly it is thread safe.
Cheers, Fons.[/quote]

Is it thread safe only for strings <14 on 64-bit and <11 on 32-bit machines?

Thank you.

[quote=“gertsen”][quote=“rdm”]
but a more modern and efficient algorithm using short string optimization (SSO) which stores <14 on 64-bit and <11 on 32-bit in the internal TString data structure without malloc. Larger strings are still malloced. Overall this is slightly more efficient than the old implementation, but more importantly it is thread safe.
Cheers, Fons.[/quote]

Is it thread safe only for strings <14 on 64-bit and <11 on 32-bit machines?

Thank you.[/quote]

They are optimized for such short strings (each string object has a data member, a built-in array, for short strings).
For longer strings it depends on malloc: if you are using a library with a thread-safe malloc, the string is thread safe.

Hi,

this is in all cases thread safe as string instances are completely independent.

Cheers, Fons.