Piping PyROOT object in multiprocessing causes segmentation fault

To create a large number of histograms in parallel and recombine them at the end, I use Python’s multiprocessing with a pipe. This worked fine with ROOT 6.14/09 and Python 2.7, but after switching to ROOT 6.24/07, which was compiled against Python 3.9.6 in CMSSW with slc7_amd64_gcc10, I get a segmentation violation. Adding ROOT.EnableImplicitMT() does not seem to solve the issue.

It seems to happen only after manually deleting the histograms to keep the memory clean. I managed to reduce my complex code to this minimal example that should reproduce the segmentation violation:

import ROOT
from multiprocessing import Process, Pipe
ROOT.EnableImplicitMT()

def gethist(name,endin):
  #hist = ROOT.TH1D(name,name,10,0,10)
  hist = ROOT.TNamed(name,name)
  print(">>> Created %r"%hist)
  endin.send(hist) # send histogram to endout

if __name__ == '__main__':
  hists = [ ]
  processes = [ ]
  for i in range(1,6):
    name = "process_%d"%i
    endout, endin = Pipe(False)
    process = Process(target=gethist,args=(name,endin),name=name)
    process.start()
    process.endout = endout # for receiving histogram
    processes.append(process)
  for process in processes:
    process.join() # wait for processes to end
    hist = process.endout.recv() # get hist from process
    print(">>> %s returns: %r"%(process.name,hist))
    hists.append(hist)
  print(">>> Deleting...")
  for hist in hists:
    hist.Delete() # causes segmentation fault at end ?
  print(">>> Done")

If you remove the for loop that deletes the objects, there is no segmentation violation. I am not an expert in memory management, but I think the segmentation fault happens when Python does its automatic garbage collection, or when PyROOT does something similar, judging from this crash output:

$ python3 test.py
>>> Created <cppyy.gbl.TNamed object at 0x78782e0>
>>> Created <cppyy.gbl.TNamed object at 0x1bb2ca0>
>>> Created <cppyy.gbl.TNamed object at 0x77dd1e0>
>>> Created <cppyy.gbl.TNamed object at 0x1bb2eb0>
>>> Created <cppyy.gbl.TNamed object at 0x77e25c0>
>>> process_1 returns: <cppyy.gbl.TNamed object at 0x7880dc0>
>>> process_2 returns: <cppyy.gbl.TNamed object at 0x787dad0>
>>> process_3 returns: <cppyy.gbl.TNamed object at 0x787db60>
>>> process_4 returns: <cppyy.gbl.TNamed object at 0x787dbf0>
>>> process_5 returns: <cppyy.gbl.TNamed object at 0x787dc80>
>>> Deleting...
>>> Done
 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007fb638d82659 in waitpid () from /lib64/libc.so.6
#1  0x00007fb638cfff62 in do_system () from /lib64/libc.so.6
#2  0x00007fb638d00311 in system () from /lib64/libc.so.6
#3  0x00007fb6321e6ceb in TUnixSystem::StackTrace() () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_8/external/slc7_amd64_gcc10/lib/libCore.so
#4  0x00007fb632392cd2 in (anonymous namespace)::TExceptionHandlerImp::HandleException(int) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_8/external/slc7_amd64_gcc10/lib/libcppyy_backend3_9.so
#5  0x00007fb6321e4111 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/cms/cmssw/CMSSW_12_4_8/external/slc7_amd64_gcc10/lib/libCore.so
#6  <signal handler called>
#7  0x000000000787db50 in ?? ()
#8  0x00007fb619b604a5 in PyROOT::TMemoryRegulator::ClearProxiedObjects() () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/lcg/root/6.24.07-bf41b0420bc269850b74e23486e2953a/lib/libROOTPythonizations3_9.so
#9  0x00007fb619b5bdc8 in PyROOT::ClearProxiedObjects(_object*, _object*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/lcg/root/6.24.07-bf41b0420bc269850b74e23486e2953a/lib/libROOTPythonizations3_9.so
#10 0x00007fb639cddbc7 in cfunction_vectorcall_NOARGS (func=0x7fb619bbf9f0, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/methodobject.c:485
#11 0x00007fb639c54591 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7fb617bf1380, callable=0x7fb619bbf9f0, tstate=0x1ad7430) at ./Include/cpython/abstract.h:118
#12 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7fb617bf1380, callable=0x7fb619bbf9f0) at ./Include/cpython/abstract.h:127
#13 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1ad7430) at Python/ceval.c:5072
#14 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3487
#15 0x00007fb639c4cbbb in _PyEval_EvalFrame (throwflag=0, f=0x7fb617bf11f0, tstate=0x1ad7430) at ./Include/internal/pycore_ceval.h:40
#16 function_code_fastcall (tstate=0x1ad7430, co=<optimized out>, args=<optimized out>, nargs=0, globals=<optimized out>) at Objects/call.c:330
#17 0x00007fb639e1caf7 in atexit_callfuncs (module=<optimized out>) at ./Modules/atexitmodule.c:93
#18 0x00007fb639db7ad0 in call_py_exitfuncs (tstate=0x1ad7430) at Python/pylifecycle.c:2374
#19 Py_FinalizeEx () at Python/pylifecycle.c:1373
#20 Py_FinalizeEx () at Python/pylifecycle.c:1345
#21 0x00007fb639dda365 in Py_RunMain () at Modules/main.c:679
#22 0x00007fb639ddaab3 in pymain_main (args=0x7ffd64a55710) at Modules/main.c:707
#23 Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:731
#24 0x00007fb638cdf555 in __libc_start_main () from /lib64/libc.so.6
#25 0x000000000040108e in _start ()
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum https://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at https://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#7  0x000000000787db50 in ?? ()
#8  0x00007fb619b604a5 in PyROOT::TMemoryRegulator::ClearProxiedObjects() () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/lcg/root/6.24.07-bf41b0420bc269850b74e23486e2953a/lib/libROOTPythonizations3_9.so
#9  0x00007fb619b5bdc8 in PyROOT::ClearProxiedObjects(_object*, _object*) () from /cvmfs/cms.cern.ch/slc7_amd64_gcc10/lcg/root/6.24.07-bf41b0420bc269850b74e23486e2953a/lib/libROOTPythonizations3_9.so
#10 0x00007fb639cddbc7 in cfunction_vectorcall_NOARGS (func=0x7fb619bbf9f0, args=<optimized out>, nargsf=<optimized out>, kwnames=<optimized out>) at Objects/methodobject.c:485
#11 0x00007fb639c54591 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x7fb617bf1380, callable=0x7fb619bbf9f0, tstate=0x1ad7430) at ./Include/cpython/abstract.h:118
#12 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x7fb617bf1380, callable=0x7fb619bbf9f0) at ./Include/cpython/abstract.h:127
#13 call_function (kwnames=0x0, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=0x1ad7430) at Python/ceval.c:5072
#14 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=<optimized out>, throwflag=<optimized out>) at Python/ceval.c:3487
#15 0x00007fb639c4cbbb in _PyEval_EvalFrame (throwflag=0, f=0x7fb617bf11f0, tstate=0x1ad7430) at ./Include/internal/pycore_ceval.h:40
#16 function_code_fastcall (tstate=0x1ad7430, co=<optimized out>, args=<optimized out>, nargs=0, globals=<optimized out>) at Objects/call.c:330
#17 0x00007fb639e1caf7 in atexit_callfuncs (module=<optimized out>) at ./Modules/atexitmodule.c:93
#18 0x00007fb639db7ad0 in call_py_exitfuncs (tstate=0x1ad7430) at Python/pylifecycle.c:2374
#19 Py_FinalizeEx () at Python/pylifecycle.c:1373
#20 Py_FinalizeEx () at Python/pylifecycle.c:1345
#21 0x00007fb639dda365 in Py_RunMain () at Modules/main.c:679
#22 0x00007fb639ddaab3 in pymain_main (args=0x7ffd64a55710) at Modules/main.c:707
#23 Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at Modules/main.c:731
#24 0x00007fb638cdf555 in __libc_start_main () from /lib64/libc.so.6
#25 0x000000000040108e in _start ()
===========================================================





Is there a simple way to work around this in the newer ROOT/python version?

Thanks!

Dear @IzaakWN ,

I am still not sure why you need to manually delete the histograms. If they survive longer than needed for some reason, then I guess that’s the problem you want to address. Maybe you could make use of a separate function that manages the hists list, doing whatever you need to do with the histograms, so that at the end of the function the list is garbage collected and proper destruction happens.
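The idea of a function that owns the list can be sketched in plain Python. `FakeHist` is a hypothetical stand-in for a histogram (with PyROOT it would be something like `ROOT.TH1D`, merged with `TH1::Add`), and `merge_parts` and `watchers` are illustrative names only. The point is just that the temporary list lives inside the function, so its contents are released when the function returns, without any call to `Delete`. The final check relies on CPython's reference counting.

```python
import weakref


class FakeHist:
    """Hypothetical stand-in for a histogram: a name plus bin contents."""

    def __init__(self, name, bins):
        self.name = name
        self.bins = list(bins)

    def add(self, other):
        # Bin-wise sum, playing the role of TH1::Add.
        self.bins = [a + b for a, b in zip(self.bins, other.bins)]


def merge_parts(nparts, nbins=3, watchers=None):
    """Create temporary histograms, merge them, and return only the sum."""
    parts = [FakeHist("part_%d" % i, [i] * nbins) for i in range(1, nparts + 1)]
    if watchers is not None:
        # Weak references let the caller check that the temporaries are gone.
        watchers.extend(weakref.ref(p) for p in parts)
    total = FakeHist("total", [0] * nbins)
    for part in parts:
        total.add(part)
    return total  # 'parts' goes out of scope here; no manual deletion


watchers = []
total = merge_parts(5, watchers=watchers)
print(total.bins)                              # [15, 15, 15]
print(all(ref() is None for ref in watchers))  # True on CPython
```

Only the merged result survives the call; the five temporaries are destroyed through the normal Python mechanism, so the proxy and the underlying object cannot get out of sync.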

As your code stands, you are doing something quite dangerous. Calling TObject::Delete leaves the Python object thinking that it still holds something, while you have in fact already destroyed the underlying C++ instance. Please avoid using TObject::Delete at all costs in any of your applications; I can’t really see any practical use for it.

One other thing that you might not be aware of is that many objects in ROOT (including histograms) are kept around until the end of the program in a global list of references, which prevents their destruction. For histograms, calling TH1::AddDirectory(false) prevents all histograms in your application from being added to that list.
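In PyROOT, the static call mentioned above is written `ROOT.TH1.AddDirectory(False)`. A minimal sketch, assuming ROOT is importable; the helper name `make_detached_hist` and its binning defaults are only illustrative:

```python
def make_detached_hist(name, nbins=10, xmin=0.0, xmax=10.0):
    """Create a TH1D that is not registered in any ROOT directory."""
    import ROOT  # imported here so this module loads even without ROOT

    # Static switch: histograms created after this call are no longer
    # added automatically to the current directory.
    ROOT.TH1.AddDirectory(False)
    return ROOT.TH1D(name, name, nbins, xmin, xmax)
```

In a real application the switch would normally be set once at start-up rather than in every call. For a single histogram, `hist.SetDirectory(0)` detaches just that object.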

Cheers,
Vincenzo

Hi @vpadulan,

Thank you for the feedback and information.

I added manual deletion of histograms at some point because my code was creating a large number of histograms (for various variables and selections from trees of various data & MC samples), which clogged the memory after a while. Deleting the unneeded histograms (after TH1::Add) helped. Besides hist.Delete(), I also use Python’s own deletion, del hist. Later I found out that opening and closing the ROOT files also helped to remove unneeded histograms.
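The effect of closing files could be used deliberately, along these lines. This is only a sketch, assuming ROOT is importable; `read_integral`, `path` and `histname` are hypothetical names. By default a histogram read from a file belongs to that file, so it is released when the file is closed. The function therefore returns a plain number and not the histogram itself.

```python
def read_integral(path, histname):
    """Open a ROOT file, read one number from a histogram, close the file."""
    import ROOT  # imported here so this module loads even without ROOT

    rootfile = ROOT.TFile.Open(path)
    if not rootfile or rootfile.IsZombie():
        raise IOError("could not open %s" % path)
    try:
        hist = rootfile.Get(histname)
        if not hist:  # a null pointer evaluates to False in PyROOT
            raise KeyError(histname)
        return hist.Integral()
    finally:
        # Closing the file also releases the histograms it owns.
        rootfile.Close()
```

Because nothing that refers to the file's objects escapes the function, no proxy is left pointing at an object that was destroyed when the file was closed.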

Cheers,
Izaak
