Parallel PyROOT Recommendations?

I have a general question for experts. I sometimes want to make my code more parallel to exploit multithreaded or multicore systems. I have had some success blindly using the multiprocessing module in Python, but experts on IRC have told me to avoid multiprocessing. I have also had limited success with “handythread” (scipy.org/Cookbook/Multithreading).

Is there a generally recommended parallel computing module or framework for use with PyROOT? There are so many to choose from that I don’t want to invest time learning one only to find out that it does not play well with ROOT.

For context, most of my applications are in the form of loading a TTree and processing the entries one at a time, though often the results of previous entries change what I do to later entries.

Thanks.

Hi,

I’m not a fan of mixing C++ and Python in multiprocessing, because multiprocessing keeps its locks on the Python side. A segfault in C++ will then hang the application indefinitely, as the code never returns to Python to release the lock. For ATLAS, I rolled my own multiprocessing, specific to the application at hand (AthenaMP), which could safely segfault. LHCb has been happy with http://www.parallelpython.com/ in the past. I’m not familiar with multithreading from numpy.
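To illustrate the general idea of a worker that can "safely segfault" (this is not how AthenaMP is implemented, only a minimal sketch), one can fork one child per unit of work and have the parent watch the pipe and the exit code instead of waiting on a lock. The names run_isolated, square and crash are made up for the example, and the "fork" start method is an assumption that limits it to POSIX systems.

```python
import multiprocessing as mp
import os


def run_isolated(func, arg, timeout=30.0):
    """Run func(arg) in a forked child process.

    Returns (True, result) on success, or (False, reason) if the child
    crashed, raised an exception, or gave no answer within `timeout` seconds.
    """
    ctx = mp.get_context("fork")  # assumption: POSIX system
    recv_end, send_end = ctx.Pipe(duplex=False)

    def child():
        recv_end.close()
        send_end.send(func(arg))  # the result must be picklable
        send_end.close()

    proc = ctx.Process(target=child)
    proc.start()
    # The parent keeps only the read end, so a dead child shows up as EOF
    # on the pipe rather than as a read that blocks forever.
    send_end.close()

    try:
        if recv_end.poll(timeout):
            try:
                result = recv_end.recv()
            except EOFError:
                proc.join()
                return False, "child exited with code %s" % proc.exitcode
            proc.join()
            return True, result
        # No answer in time: kill the child and report it.
        proc.terminate()
        proc.join()
        return False, "timed out"
    finally:
        recv_end.close()


def square(x):
    """Hypothetical well-behaved unit of work."""
    return x * x


def crash(_):
    """Hypothetical unit of work that dies hard; stands in for a C++ segfault."""
    os.abort()


if __name__ == "__main__":
    print(run_isolated(square, 7))
    print(run_isolated(crash, None))
```

In a PyROOT job, func would typically open the file itself and loop over a range of entries, so the parent can log or retry a failed range rather than hang.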

If you are working with TTrees, though, I’d recommend PROOF with TPySelector (although I’m not sure how the dependencies between entries that you describe would work out).

Cheers,
Wim

P.S. On the path to the future (PyPy/cppyy), we expect to be able to offer a single-processor view of a multi-core CPU using software transactional memory (STM). Prototypes currently exist, but they are very experimental and still slower than single-threaded code due to the lack of JIT support.