Multicore/multithreading

I’m looking at writing a multithreaded application. Support for this seems to be a new feature in ROOT 6 or so:
https://root.cern.ch/how/how-express-parallelism-many-cores

However, this isn’t specific enough. What does this mean:
“One file per thread is read/written”

Surely, that can’t mean that there is only one file open in each thread - that makes no sense, since it should be perfectly allowable to have files 1 and 2 open in thread A and files 3 and 4 open in thread B.

What is the real restriction here? Is it:

  • A TFile object cannot be written to/read from multiple concurrent threads
  • A TFile object cannot be written to/read from multiple concurrent threads without a mutex lock
  • A physical file cannot be written/read from multiple concurrent threads, even if there is a separate TFile object in each thread
  • A TFile object cannot be passed from one thread to another?

The examples are not too helpful - and in fact are slightly confusing. For example, there are histograms that are created concurrently… could the user call h1->Write() on these in the threads, or would that blow things up?

I currently have code that fork()s to get around the problem of multiprocessing, but this is fugly for more interactive work (requiring IPC and all manner of crud).

Thanks for any insight,
Nathaniel


ROOT Version: 6 or higher
Platform: Not Provided
Compiler: Not Provided



Hi Nathaniel,
what you can do is instantiate one or more TFiles per thread and have each thread read/write to its thread-local TFile(s).
Using the same TFile from different threads, even protecting concurrent access with a mutex, as well as creating a TFile in one thread and using it from another, can result in quirky behavior.

Multi-thread parallelism in ROOT can be very tricky, due to the large amount of global state that it relies on and that is implicitly modified by object constructors, destructors and other methods.
Please ask if you have any further doubts.

Also consider using implicit parallelism if that’s an option for your use case.

Hope this helps,
Enrico

Can you comment more?

  • What is the correctly-stated condition for TFiles? It sounds like:
    “A TFile object should never be accessed outside its owner’s thread, although objects stored in that TFile can be accessed from other threads provided there is no concurrent read/write operation”. Is that right?

  • What does EnableImplicitMT() do that isn’t done by EnableThreadSafety()? What conditions does it protect against?

  • Are the answers to these questions in any way dependent on whether I’m using the ROOT thread tools or others (e.g. Boost threads)?

Given that EnableThreadSafety has been called:

  • a TFile can only be accessed from the thread that constructed it. “Accessing” a TFile includes, of course, writing to it, even if it is done indirectly, via the Write methods of other objects.
  • gDirectory is a thread-local variable (each thread gets its own gDirectory).
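
To make "thread-local" concrete, here is a minimal sketch in plain C++ (no ROOT involved). The variable `currentDir` is a hypothetical stand-in for gDirectory, and `runWorkers` and `callerDir` are illustrative names, not ROOT functions. It only shows the language-level behaviour: each thread sees and modifies its own copy.

```cpp
#include <cassert>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-in for gDirectory: every thread gets an independent copy.
thread_local std::string currentDir = "none";

// Each worker sets its own "current directory" and records what it then sees.
std::vector<std::string> runWorkers(int nThreads) {
    assert(nThreads >= 0);
    std::vector<std::string> seen(nThreads);
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        workers.emplace_back([i, &seen] {
            currentDir = "file_" + std::to_string(i) + ".root"; // changes this thread's copy only
            seen[i] = currentDir;                                // each thread writes its own slot
        });
    }
    for (auto& w : workers) w.join();
    return seen;
}

// The calling thread's copy is unaffected by what the workers did.
std::string callerDir() { return currentDir; }
```

After `runWorkers(3)` returns, each slot holds the name set by its own thread, while `callerDir()` in the calling thread still returns the initial value. By analogy, a file opened in one thread does not become the current directory of another.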

In general, no method of any ROOT object can be safely called concurrently: no TObject, for instance, internally locks when a method is called on it. Also watch out for methods that are apparently read-only (and therefore look like they are safe to call concurrently), for example TChain::GetEntries.

As for EnableImplicitMT (see its documentation): it does not add protections. It tells ROOT that it can execute certain common operations in ROOT’s internal thread-pool (and it implies EnableThreadSafety).

The (very few) things you can do with ROOT objects concurrently do not depend on your threading model/technology. EnableImplicitMT tells ROOT it can internally parallelize certain common operations, and ROOT will schedule them to an internal Intel TBB thread pool.

Depending on your use case I would suggest:

  • do not write parallel code explicitly and take advantage of ROOT::EnableImplicitMT to get a speedup on the operations listed as supported
  • if that’s not going to cut it, design your program so that parallelism is trivial: have different threads work on different objects/data, merge at the end in a single thread (a la MapReduce, so to speak). EnableThreadSafety should guarantee synchronized access to all of ROOT’s global variables/resources under the hood
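
The "trivial parallelism, merge at the end" design from the second suggestion can be sketched in plain C++ as follows. This is only an illustration under stated assumptions: `processChunk` and `processAll` are hypothetical names, a plain sum stands in for the real per-chunk work, and no ROOT objects are involved.

```cpp
#include <cassert>
#include <future>
#include <numeric>
#include <vector>

// "Map" step: each task reads only its own input and returns a private result.
long processChunk(const std::vector<long>& chunk) {
    return std::accumulate(chunk.begin(), chunk.end(), 0L);
}

// Launch one task per chunk, then reduce the partial results in the calling thread.
long processAll(const std::vector<std::vector<long>>& chunks) {
    std::vector<std::future<long>> partials;
    partials.reserve(chunks.size());
    for (const auto& chunk : chunks) {
        partials.push_back(
            std::async(std::launch::async, [&chunk] { return processChunk(chunk); }));
    }
    assert(partials.size() == chunks.size());
    long total = 0;
    for (auto& p : partials) total += p.get(); // single-threaded merge, no locks needed
    return total;
}
```

No task ever touches another task's data, so no mutex appears anywhere; the only synchronization is waiting on each future before merging.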

N.B. TThread is to be considered deprecated in favour of std::thread.

Thanks for clarifying!

Is there any condition under which EnableImplicitMT() would be a BAD idea?
(except of course where cores are limited?)

Mixing/nesting your own threading model with ROOT’s implicit multi-threading might cause surprises. @dpiparo or @Axel might correct me here, but letting users mix implicit multi-threading with user-level explicit threading is a non-goal.

The recommended way to mix your own concurrent processing with ROOT’s implicit multi-threading is by scheduling your operations on the same thread-pool that ROOT uses, ROOT::TThreadExecutor.
But admittedly it’s probably not a battle-tested use case.

Also, mixing ROOT’s implicit multi-threading with your own explicit multi-processing is bad. Mixing forks with the spawning of new threads can cause deadlocks: fork clones only the thread it was called from, so any mutexes held by the other threads at that moment will never be released in the new process.

Yeah, it’s all a bit of a nightmare. But I’ve got applications that demand short wall-clock-time processing, and the data goes into and out of ROOT files.

For example, an object has got an array of long arrays, each of which needs processing. Multithreading on these is straightforward, but filling histograms with these is not (e.g. two threads both calling SetBinContent on the same histogram can cause rare but crash-level problems, even if they are not operating on the same bin).

I’ve not mixed threading and forking yet, although I intend to. However, I’ll make sure that there’s only one thread in the master process, and that TFiles are local to a single process.

I believe you can get 80% of the performance benefits of multi-threading with 20% of the headaches if you design your program to not have threads stepping on each other’s data. Also using a task scheduler (e.g. ROOT::TThreadExecutor, for simple stuff, or Intel TBB for more sophisticated scenarios) takes away the complexity of handling synchronization primitives yourself.

By the way, the implementation of TH1::SetBinContent is definitely not thread safe, but it’s a write operation on a ROOT object, so no surprise.
You can use a TThreadedObject<TH1> to transparently create one histogram per thread and merge at the end… if you know the axis limits beforehand.
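
The idea behind "one histogram per thread, merge at the end" can be sketched in plain C++. This is not how TThreadedObject is implemented; `Hist` and `fillInParallel` are hypothetical names, and `Hist` is a toy fixed-binning histogram standing in for a TH1. It also shows why the axis limits must be known up front: the per-thread copies can only be summed bin by bin if they share the same binning.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Toy stand-in for a 1D histogram: nBins equal-width bins on [lo, hi).
struct Hist {
    int nBins; double lo; double hi;
    std::vector<long> counts;
    Hist(int n, double l, double h) : nBins(n), lo(l), hi(h), counts(n, 0) {
        assert(n > 0 && h > l);
    }
    void fill(double x) {
        if (x < lo || x >= hi) return; // out-of-range values are dropped in this sketch
        auto bin = static_cast<std::size_t>((x - lo) / (hi - lo) * nBins);
        counts[std::min(bin, counts.size() - 1)]++; // guard against rounding at the upper edge
    }
    void add(const Hist& other) { // bin-by-bin sum; valid only for identical binning
        for (std::size_t i = 0; i < counts.size(); ++i) counts[i] += other.counts[i];
    }
};

// One private histogram per thread, all with the same binning, merged after joining.
Hist fillInParallel(const std::vector<std::vector<double>>& perThreadData,
                    int nBins, double lo, double hi) {
    std::vector<Hist> partial(perThreadData.size(), Hist(nBins, lo, hi));
    std::vector<std::thread> workers;
    for (std::size_t t = 0; t < perThreadData.size(); ++t) {
        workers.emplace_back([t, &partial, &perThreadData] {
            for (double x : perThreadData[t]) partial[t].fill(x); // touches only slot t
        });
    }
    for (auto& w : workers) w.join();
    Hist merged(nBins, lo, hi);
    for (const auto& h : partial) merged.add(h); // runs in the calling thread only
    return merged;
}
```

Because each thread fills only its own copy, no two threads ever write to the same histogram, which avoids the SetBinContent problem described above at the cost of one extra histogram per thread.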

Best of luck, and feel free to post here if you get stuck on something.
Cheers,
Enrico
