Global Lock when copying Trefs with multiple threads


ROOT Version: 6.16 and 6.18
Platform: Linux
Compiler: GCC 7.4.0


Hello, I am profiling an application that depends on ROOT TObjects and TRefs. We process events in parallel and in this sense each thread usually creates, copies and destroys thousands of TObjects that contain TRefs.

When we use more than 8 threads we start to see performance degradation due to some global locking in the TRef assignment operator. Take this case where 60% of the wait time is spent in TRef.


Please note that this exact issue was previously reported for ROOT version 6.12 and 6.14
Although the penalty of using TRefs in our use case has decreased by ROOT version 6.16, the problem still exists and whenever a thread is copying a TRef all other threads lock waiting for it to finish.

A sample code that describes our use case was provided in the linked issue along with stack traces.

Is it possible to bypass the need for such locks ROOT::TReentrantRWLock? or is it possible to reduce the penalty of using TRefs?

Regards,
Mohamed

1 Like

May be @pcanal can help you with this question.

Hi @pcanal any chance you could have a look at this?

@pcanal, could you take a look? Thanks!

Do you have a standalone reproducer?

Also since you have a lot of TRef objects, would a TRefArray be applicable?

Hi @pcanal,

re: standalone: we will try to come up with something, maybe @mmoanis has something already? Otherwise I think the standalone program to reproduce it in Copying TRefs and accessing TRef data from multiple threads should also still work

re: TRefArray: we had some very strange issues when writing TRefArrays to a file and accessing them later on. This is why we exchanged them for a vector. Would you expect a better performance when using TRefArray?

Thanks for looking into this!
Simon

Note the example in reported might be over-simplified. If it is not oversimplified, then upgrading the Derived class so that:

        Event(const Object* object) {
            TRef ref(object);
            for(int i = 0; i < 25; i++) {
                charges_.emplace_back(ref);
            }

works would solve the problem (i.e. adding a constructor in Derived that take a TRef as an argument.

Cheers,
Philippe.

PS. Or even further :slight_smile:

        threads.emplace_back([object = std::move(object)]() {
            TRef ref(object.get());
            for(int i = 0; i < 100; i++) {
                Event event{ref};
                event.run();
            }
        });

and all that said … as far as I can tell if the TObject are really new and then the lock should not be taken often as their is a short cut caching the information … (i.e. I really need a reproducer … even if it is the full example).

In theory the TRefArray should better as they are designed to lookup in the global table ‘less often’ (i.e. one per collection rather than once per object) [That is ‘of course’ minus any bugs and over-sights :slight_smile: ]

I’ll have a closer look at your suggestion and also see how I can prepare the full example for you to run. @mmoanis feel free to chime in…

Concerning TRefArrays I dug out our issue when we abandoned them: when they were written in a different process than read, the pointers were not correct anymore and the objects could not be loaded: https://gitlab.cern.ch/allpix-squared/allpix-squared/issues/114#note_891510

TRefArray read/write: obviously this should work :frowning: … If you can provide a reproducer, I’ll take a look at fixing it.