Multiprocessing and ROOT C++ classes

Hi,

I am trying to parallelize some calculations steered by Python using the multiprocessing module. I have a Python class that contains a C++ ROOT class (it inherits from TObject, is compiled with ACLiC, all the usual stuff). This ROOT class is a Minuit2 wrapper and performs a minimization.

What I want to do is run a multiprocessing.Pool over the different parameters I want to minimize, and pass it the Python class (calibrator in the example) that contains the already-instantiated C++ class (this is because the setup of the C++ class is a little bit slow).

This _processFunction calls the minimize method of the calibrator, which in turn calls Minuit2->Minimize().

It seems very complicated, but this setup works nicely if I use plain Python (no multiprocessing). As soon as I use multiprocessing, the line above produces an error, which I assume arises because the os.fork done by the multiprocessing module is not working correctly.

>>> mgr.process(calib, [('2497','1598'),('3507','588'),('5369','6918')])
Error in <TClass::BuildRealData>: Cannot find any ShowMembers function for ROOT::Minuit2::Minuit2Minimizer!
Error in <TClass::BuildRealData>: Cannot find any ShowMembers function for ROOT::Math::Functor!
Error in <TClass::BuildRealData>: Cannot find any ShowMembers function for ROOT::Minuit2::Minuit2Minimizer!
Error in <TClass::BuildRealData>: Cannot find any ShowMembers function for ROOT::Math::Functor!
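For readers without the original code, the arrangement described in this post can be sketched in plain Python roughly as follows. This is only an illustration under stated assumptions: `Calibrator`, `_process` and `process` are hypothetical stand-ins (no ROOT or Minuit2 is involved), the object is assumed to reach the workers by being inherited through fork, and the "minimization" is a placeholder sum. It does not reproduce the error; it only shows the pattern of building the object once in the parent and reusing it in forked workers.

```python
import multiprocessing as mp
import os


class Calibrator:
    """Hypothetical stand-in for the Python class wrapping the C++ minimizer."""

    def __init__(self):
        # Stands in for the slow C++ setup; runs once, in the parent process.
        self.owner_pid = os.getpid()

    def minimize(self, pair):
        # Placeholder for the real Minuit2 call: just adds the two ids.
        a, b = pair
        return int(a) + int(b)


_calib = None  # module-level global, inherited by the workers through fork


def _process(pair):
    # Runs in a worker; _calib is the copy inherited from the parent.
    return pair, _calib.minimize(pair), _calib.owner_pid, os.getpid()


def process(calib, pairs, workers=2):
    global _calib
    _calib = calib                 # must be set before the pool forks
    ctx = mp.get_context("fork")   # assumption: POSIX system with fork available
    with ctx.Pool(workers) as pool:
        return pool.map(_process, pairs)


if __name__ == "__main__":
    pairs = [('2497', '1598'), ('3507', '588'), ('5369', '6918')]
    for row in process(Calibrator(), pairs):
        print(row)
```

Each returned row carries the pid that built the object and the pid that used it, which makes it easy to check that the workers really are operating on an inherited copy.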

Has anybody done something similar to what I am intending to do? I could go for the dirty solution of instantiating the C++ class after the fork, but that would slow down the processing a great deal. In principle there should be no reason why this isn’t working, so probably I am doing something wrong. Any ideas?
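The "instantiate after the fork" alternative mentioned above could look something like the sketch below, which uses the `initializer` argument of `multiprocessing.Pool` so that the setup cost is paid once per worker rather than once per task. As before, `Calibrator`, `_init_worker`, `_process` and `process` are hypothetical names, the class is a pure-Python placeholder, and nothing here is confirmed to avoid the ROOT error.

```python
import multiprocessing as mp
import os


class Calibrator:
    """Hypothetical stand-in for the slow-to-build C++ wrapper."""

    def __init__(self):
        # Records which process performed the (slow) setup.
        self.owner_pid = os.getpid()

    def minimize(self, pair):
        # Placeholder for the real minimization: just adds the two ids.
        a, b = pair
        return int(a) + int(b)


_calib = None  # one instance per worker process


def _init_worker():
    # Called once in each worker, after the fork.
    global _calib
    _calib = Calibrator()


def _process(pair):
    # Second element is True when the object was built in this worker.
    return _calib.minimize(pair), _calib.owner_pid == os.getpid()


def process(pairs, workers=2):
    ctx = mp.get_context("fork")   # assumption: POSIX system with fork available
    with ctx.Pool(workers, initializer=_init_worker) as pool:
        return pool.map(_process, pairs)


if __name__ == "__main__":
    print(process([('2497', '1598'), ('3507', '588'), ('5369', '6918')]))
```

Whether the per-worker setup is acceptable depends on how slow the real construction is compared with the number of tasks each worker handles.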

Cheers,
Albert

PS: Since the code is obviously a little bit complicated, I haven’t posted any “simplified” version, just in case things are obvious and the solution is straightforward. If necessary, I will try to produce a simplified example.

Hi,

What does Python use for multiprocessing? Does it fork the entire program, or does it use threads (and hence share some memory)? If it is the latter, you need to use the trunk, as it improves the ability of the metadata to be set up by multiple threads (alternatively, you need to ensure the metadata is set up properly and completely before starting any threads).

Cheers,
Philippe.

Hi Philippe,

multiprocessing uses fork if it can. Alternatively, I could use multithreading instead of multiprocessing, but I would like to avoid that because objects are “shared” in a multithreaded environment.
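Which start method is actually in use can be checked directly, which is worth doing because fork is not the default everywhere (Windows has no fork, and macOS has defaulted to spawn since Python 3.8). A minimal sketch, where `describe_start_methods` is a hypothetical helper name:

```python
import multiprocessing as mp


def describe_start_methods():
    """Return (default, available) start methods for this interpreter."""
    return mp.get_start_method(), mp.get_all_start_methods()


if __name__ == "__main__":
    default, available = describe_start_methods()
    print("default:", default)
    print("available:", available)
    # Request fork explicitly where the platform offers it.
    if "fork" in available:
        ctx = mp.get_context("fork")
        print("explicit context:", ctx.get_start_method())
```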

By the way, I am using the ROOT trunk already.

Cheers,
Albert

[quote]multiprocessing uses fork if it can. [/quote]
Fair enough, this is a good option. The symptoms indicate that something is not copied properly during the fork and/or that the (re)initialization is not done (i.e. it looks like the dictionaries are not properly loaded in the forked process). If you provide a complete running example, maybe the Python expert (Wim) might be able to help out.

Cheers,
Philippe.

Hi,

athena (the ATLAS framework) has a multiprocessing mode, and probably qualifies as a complicated application. Haven’t seen anything like this error message.

So yes, could you send a (simplified) working script that reproduces the problem?

Cheers,
Wim

OK, I will try to create a simplified example, but it might take some time :( I’ll update as soon as I have it.