ROOT Version: 6.16 and 6.18
Platform: Linux
Compiler: GCC 7.4.0
Hello, I am profiling an application that depends on ROOT TObjects and TRefs. We process events in parallel, so each thread typically creates, copies and destroys thousands of TObjects that contain TRefs.
When we use more than 8 threads we start to see performance degradation due to some global locking in the TRef assignment operator. Take this case where 60% of the wait time is spent in TRef.
Please note that this exact issue was previously reported for ROOT versions 6.12 and 6.14.
Although the penalty of using TRefs in our use case has decreased as of ROOT version 6.16, the problem still exists: whenever a thread is copying a TRef, all other threads block, waiting for it to finish.
A sample code that describes our use case was provided in the linked issue along with stack traces.
Is it possible to bypass the need for such locks (ROOT::TReentrantRWLock)? Or is it possible to reduce the penalty of using TRefs?
re: TRefArray: we had some very strange issues when writing TRefArrays to a file and accessing them later on. This is why we exchanged them for a vector. Would you expect better performance when using TRefArray?
And all that said … as far as I can tell, if the TObjects are really new, then the lock should not be taken often, as there is a shortcut caching the information … (i.e. I really need a reproducer … even if it is the full example).
In theory the TRefArray should be better, as they are designed to look up in the global table ‘less often’ (i.e. once per collection rather than once per object). [That is ‘of course’ minus any bugs and oversights.]
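To make the "once per collection rather than once per object" point concrete, here is a minimal sketch in plain C++ that only counts how often a global lock is taken in each layout. It does not use ROOT and is not ROOT's implementation: GlobalTable, Ref, RefCollection, fillPerObject and fillPerCollection are hypothetical names invented for this illustration, and the lookup itself is a dummy.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a process-wide table guarded by a single lock.
// This is NOT ROOT's implementation; it only counts lock acquisitions.
class GlobalTable {
public:
    // Look up a (dummy) process identifier while holding the global lock.
    int lookupProcessId() {
        std::lock_guard<std::mutex> guard(mutex_);
        ++acquisitions_;
        return 1;
    }

    std::size_t acquisitions() const { return acquisitions_.load(); }

private:
    std::mutex mutex_;
    std::atomic<std::size_t> acquisitions_{0};
};

// Models one standalone reference (hypothetical type).
struct Ref {
    int processId;
    std::size_t slot;
};

// Models a std::vector of individual references: every element resolves
// its process identifier itself, so the lock is taken once per object.
std::vector<Ref> fillPerObject(GlobalTable& table, std::size_t n) {
    std::vector<Ref> refs;
    refs.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        refs.push_back(Ref{table.lookupProcessId(), i});
    }
    return refs;
}

// Models a reference collection (hypothetical type): the process identifier
// is stored once for the whole collection, the elements only keep a slot.
struct RefCollection {
    int processId = 0;
    std::vector<std::size_t> slots;
};

// The lock is taken once per collection, regardless of n.
RefCollection fillPerCollection(GlobalTable& table, std::size_t n) {
    RefCollection collection;
    collection.processId = table.lookupProcessId();
    collection.slots.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        collection.slots.push_back(i);
    }
    return collection;
}
```

Filling 1000 references with fillPerObject takes the lock 1000 times, while fillPerCollection takes it once. With many threads doing this concurrently, each acquisition is a potential point of contention, which is why the per-collection layout would be expected to scale better, assuming the real lookup behaves roughly like this model.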