Releasing GIL in TTree::Draw in PyROOT


I’d like to speed up filling some histograms from trees in a PyROOT script by using a thread pool. I know that PyROOT does not normally release the GIL, so the code would always run in a single thread. To overcome this, I tried setting TTree.Draw._threaded = True, but this crashes Python with a segmentation violation and some complaints from Cling. Please find below a minimal example reproducing the problem. Is this not supported, or am I doing something wrong? I’m on macOS 10.13.2 and use ROOT 6.12/04 and Python 3.6.4.


#!/usr/bin/env python

from array import array
import os
import queue
import threading
from uuid import uuid4

import ROOT

# Request release of GIL in TTree.Draw
ROOT.TTree.Draw._threaded = True

def create_inputs(n):
    """Create input files for testing."""
    rGen = ROOT.TRandom3(0)
    for iFile in range(n):
        fileName = 'test_{}.root'.format(iFile + 1)
        if os.path.exists(fileName):
            continue  # keep a file left over from a previous run
        f = ROOT.TFile(fileName, 'create')
        tree = ROOT.TTree('tree', '')
        x = array('f', [0])
        tree.Branch('x', x, 'x/F')
        for ev in range(10000):
            x[0] = rGen.Gaus()
            tree.Fill()
        tree.Write()
        f.Close()

def worker(fileQueue):
    """Fill histograms from files in the queue, one by one."""
    while True:
        try:
            fileName = fileQueue.get_nowait()
        except queue.Empty:
            break  # no files left to process
        hist = ROOT.TH1F(uuid4().hex, '', 10, -3., 3.)
        f = ROOT.TFile(fileName)
        tree = f.Get('tree')
        tree.Draw('x>>' + hist.GetName(), '', 'goff')

if __name__ == '__main__':
    nThreads = 2
    create_inputs(nThreads)
    fileQueue = queue.Queue()
    for i in range(nThreads):
        fileQueue.put('test_{}.root'.format(i + 1))
    threads = []
    for i in range(nThreads):
        t = threading.Thread(target=worker, args=(fileQueue,))
        threads.append(t)
        t.start()
    for t in threads:
        t.join()


That’s an interesting project!
Before diving into the more technical aspects, I’d like to suggest the solution we already provide for filling histograms in parallel: TDataFrame.
In your case, filling the histogram with the values of “x” would look like this:

ROOT.ROOT.EnableImplicitMT() # this activates the internal thread pool of ROOT
d = ROOT.Experimental.TDataFrame("tree", "test_*.root")
h = d.Histo1D("x") # this fills the histogram of x in parallel, leveraging the aforementioned pool



Thank you for pointing me to this nice new feature; I was not aware of it. From a quick look, it seems it will also do the job for my real-life problem.

Still, if somebody knows how to release the GIL in PyROOT without crashing Python, I would be curious.


Why would this:

        hist = ROOT.TH1F(uuid4().hex, '', 10, -3., 3.)
        f = ROOT.TFile(fileName)
        tree = f.Get('tree')

be something you expect to work properly in a worker queue?

As an aside, I tried to reproduce the crash (on Linux), but couldn’t, not even when setting the check interval to 1. Perhaps you can post the traceback of the segfault?

Indeed, the last three lines should be protected with a lock, so that tree.Draw is always executed with gROOT as the current directory. Thanks for spotting this. But this is not related to the crash since with a wrong current directory I would simply fill a temporary histogram instead of the intended one.
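A minimal sketch of that locking pattern, written without ROOT so it runs anywhere. The names setup_lock, setup, process and run are hypothetical: setup stands in for creating the histogram, opening the file and getting the tree, and process stands in for the tree.Draw call. Which of these steps must stay under the lock is an assumption here.

```python
import queue
import threading

# Hypothetical module-level lock serializing the set-up that touches shared state.
setup_lock = threading.Lock()

def worker(file_queue, setup, process, results):
    """Drain the queue; run `setup` under the lock and `process` outside it."""
    while True:
        try:
            file_name = file_queue.get_nowait()
        except queue.Empty:
            break
        with setup_lock:
            handle = setup(file_name)     # stand-in: TH1F, TFile, f.Get('tree')
        results.append(process(handle))   # stand-in: tree.Draw(...)

def run(file_names, setup, process, n_threads=2):
    """Process all file names with n_threads workers and collect the results."""
    file_queue = queue.Queue()
    for name in file_names:
        file_queue.put(name)
    results = []
    threads = [
        threading.Thread(target=worker,
                         args=(file_queue, setup, process, results))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Only the set-up is serialized in this sketch; the potentially long-running step still runs concurrently.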

I get the same crash (even when adding the lock) on SL 6.5 as well. Running with ROOT 6.10/04 compiled with GCC 5.3.0 and Python 3.5.3. I’m attaching the log (11.3 KB).

The cause is obviously that make_wrapper failure, which “should not happen.” ™ I don’t know what’s wrong with it, though. Could you call Draw() on the main thread once, and only then start the workers? Point being that the make_wrapper call will then not occur on any threads (it’s a one-off initialization).
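The suggested ordering can be sketched as follows, again without ROOT. The helper run_after_warm_up and its arguments are hypothetical: warm_up stands in for a first Draw() call on the main thread, and worker for the function each thread runs.

```python
import threading

def run_after_warm_up(warm_up, worker, n_threads=2):
    """Call warm_up() once on the calling thread, then run the workers."""
    # Any one-off lazy initialization triggered by warm_up() happens here,
    # before a second thread exists.
    warm_up()
    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```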

After that failure, though, the GetCallFunc call in Cppyy.cxx calls PyErr_Format without holding the GIL. Apparently (based on the stack trace) that does not cause a problem directly, but it corrupts the global state enough for things to break when the method holder tries to report its own error. I haven’t looked in detail at how that happens; regardless, that PyErr_Format call is certainly wrong as-is and worthy of a bug report.

(Aside, it’s not there in cppyy master, for unrelated reasons, so master won’t suffer from this problem.)

I confirm that if TTree.Draw is called from the main thread before starting the thread pool, subsequent multithreaded processing runs fine. This is not really a suitable solution, though. Is it possible to call make_wrapper without actually setting up reading of a real tree (for example, on the TTree.Draw function object)?

No, can’t: everything is done completely lazily.

It’s not just make_wrapper, though: GetCallFunc memoizes the result, and the way that is done isn’t thread safe either. The whole thing (and several other data structures in Cppyy.cxx besides) needs a shared_mutex.
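To illustrate the idea only (this is not the Cppyy.cxx code), here is a minimal Python sketch of a memoization cache guarded by a lock. The class LockedMemo is hypothetical. Python’s standard library has no reader-writer lock, so a plain threading.Lock stands in for the shared_mutex mentioned above; readers are therefore serialized too.

```python
import threading

class LockedMemo:
    """Memoize factory(key) so that each key is computed at most once."""

    def __init__(self, factory):
        self._factory = factory
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, key):
        # The lock covers both the lookup and the insertion, so two threads
        # can never both see a miss for the same key and both call the factory.
        with self._lock:
            if key not in self._cache:
                self._cache[key] = self._factory(key)
            return self._cache[key]
```

Without the lock, two threads could both miss the cache and construct the entry concurrently, which is the kind of race described above.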

Thanks for the confirmation. I ended up protecting the call of TTree.Draw with a context manager like this:

class OneTimeLock:
    def __init__(self):
        self.lock = threading.Lock()
        self.enabled = True
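    def __enter__(self):
        self.lock.acquire()
        if not self.enabled:
            # The first call is over: do not serialize any longer
            self.lock.release()
    def __exit__(self, exc_type, exc_value, traceback):
        if self.enabled:
            # End of the first call: allow parallel execution from now on
            self.enabled = False
            self.lock.release()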

With it, the first time TTree.Draw is executed it will run in a single thread, and when this first call is over, parallel execution will be allowed. This seems to solve the initial problem, but then things get messed up for TTreeFormula somehow: it complains about not being able to compile some gibberish, and later the program dies with a segfault. I do call


at the start of the script.

I guess TDataFrame it is then…
