Enabling threads when using python multiprocessing?

Hi all,

I’ve been using RDataFrame in python/PyROOT to run over the CMS NanoAOD format with multi-threading enabled. From basic benchmarks I’ve done, the gain from adding more threads diminishes rapidly (not a huge surprise), so I’ve found it most efficient on a 16-thread CPU to run four separate “jobs” with four threads each.

It’s obviously a little annoying to watch four separate terminals while waiting to queue up the next set by hand, so I was hoping to use the python multiprocessing library to do this instead.

What I’m wondering about is how multiprocessing and ROOT multi-threading interact. My understanding is that each Process will use one thread of the CPU, and if I use Pool, I can tell it how many processes/threads I would like to use at a time.

So say I have the setup above of four jobs at a time with four threads each. Should I do ROOT.EnableImplicitMT(16) to make all 16 threads available to ROOT? Or ROOT.EnableImplicitMT(4) so that each process only uses four threads at a time?

From trying both and just watching htop, it seems that ROOT.EnableImplicitMT(4) only uses four threads in total even though I submit multiple Processes, so I think that when I do ROOT.EnableImplicitMT(16), the separate Processes are all sharing the 16 threads.

Does anyone have experience with this, or know enough about the back end of ROOT and multiprocessing to guess what’s going on? Is there a way to set the number of threads in ROOT per Process rather than per parent script?


Hi @lcorcodilos,

Do you call ROOT.EnableImplicitMT(4) before multiprocessing.Pool(4).map(worker, ...) or inside the worker?

Hi @berserker,

I call it outside of the worker. Maybe I should call it inside? And then each worker knows how many threads it can use?


Yes, you should try calling it inside.

Also beware of the pitfalls of mixing multi-threading and multi-processing, e.g.:

Certainly forking multiple processes before enabling implicit multi-threading (which creates a thread pool) is better than the opposite, but mixing processes and threads is a delicate matter.


Hi all,

Just wanted to report back that @berserker’s suggestion worked. If I enable multithreading inside the function given to the worker, I can control the number of threads available to each process/worker.
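
For anyone landing here later, a minimal sketch of that layout could look like the following. It only illustrates the idea and is not the exact script used above: the function names, the "Events" tree name and the entry count used as a stand-in analysis are all assumptions.

```python
import multiprocessing as mp


def threads_per_process(total_threads, n_processes):
    """Split the CPU's threads evenly across processes (at least one each)."""
    return max(1, total_threads // n_processes)


N_PROCESSES = 4                                         # jobs running at once
THREADS_PER_PROCESS = threads_per_process(16, N_PROCESSES)


def process_file(filename):
    # Import and configure ROOT inside the worker, so the thread pool is
    # set up in this child process rather than in the parent script.
    import ROOT
    ROOT.EnableImplicitMT(THREADS_PER_PROCESS)

    # Placeholder analysis (assumption): count entries in the "Events" tree.
    df = ROOT.RDataFrame("Events", filename)
    return filename, df.Count().GetValue()


def run_all(filenames):
    # The pool size limits how many files are processed at the same time.
    with mp.Pool(processes=N_PROCESSES) as pool:
        return pool.map(process_file, filenames)
```

To use it, call run_all() with a list of input files from under an `if __name__ == "__main__":` guard in the parent script, and do not call ROOT.EnableImplicitMT there.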


What do you mean by «creates a thread pool»? I’ve looked at the Threads count in /proc/<root_pid>/status, and merely calling EnableImplicitMT doesn’t seem to create any threads.
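
That check can be repeated from within python on Linux with something like the sketch below. The helper names are made up for illustration; it only reads the Threads field of the status file and makes no claim about what ROOT or TBB will show.

```python
import os


def parse_thread_count(status_text):
    """Return the value of the 'Threads:' field, or None if it is absent."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split()[1])
    return None


def thread_count(pid=None):
    """Read the thread count of a process (default: this one) from /proc."""
    pid = os.getpid() if pid is None else pid
    with open(f"/proc/{pid}/status") as status_file:
        return parse_thread_count(status_file.read())
```

Calling thread_count() once before and once after ROOT.EnableImplicitMT(4), and again after an RDataFrame event loop has run, would show at which point the extra threads appear.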

Uhm, that might depend on how Intel TBB does things (which ROOT uses internally as thread pool manager).

EnableImplicitMT calls ROOT::Internal::GetPoolManager which constructs a ROOT::Internal::TPoolManager, which calls tbb::task_scheduler_init::initialize. I am not sure what this last call does exactly in terms of thread spawning.
