Memory corruption in TPad for 5.28

I have written a PyROOT class that crashes in ROOT 5.28 but ran without problems in 5.26 (the Python version is the same for both, 2.6.5). The code is somewhat long, so I’ll include pseudocode of the full version, followed by the exact code of a simplified version I wrote to try to reproduce the crash.

def savePictures():
    plots = othermodule.plot_hists_shared_axis(data,mc, 'stack title')
    plots['canvas'].cd()
    plots['canvas'].SaveAs(picPath+'stack.pdf')
    return

othermodule:
def plot_hists_shared_axis(data, mc, name, **kw):
    c1 = TCanvas(...)
    c1.cd()
    data.Draw()
    c2 = TCanvas(...)
    c2.cd()
    mc.Draw()
    outcanvas = plot_canvas_shared_axis(c1,c2,'other title')
    return {'canvas':outcanvas, 'other_stuff':otherstuff}

def plot_canvas_shared_axis(top_canvas, bottom_canvas, title):
    canvas = TCanvas()
    canvas.cd()
    top_pad = TPad(...)
    top_pad.Draw()
    bottom_pad = TPad(...)
    bottom_pad.Draw()
    top_pad.cd()
    top_canvas.DrawClonePad()
    bottom_pad.cd()
    bottom_canvas.DrawClonePad()
    return canvas

This “program” does not crash in 5.26 but crashes in 5.28, producing the following stack trace:

 *** Break *** segmentation violation
*** glibc detected *** python: corrupted double-linked list: 0x0b78ee28 ***
======= Backtrace: =========
/lib/libc.so.6[0xb59b18]
/lib/libc.so.6[0xb5be37]
/lib/libc.so.6(__libc_malloc+0x67)[0xb5dfb7]
/afs/cern.ch/atlas/software/releases/17.0.2/gcc-alt-435/x86_64-slc5-gcc43-opt/lib/libstdc++.so.6(_Znwj+0x27)[0xf5be7217]
/afs/cern.ch/atlas/software/releases/17.0.2/gcc-alt-435/x86_64-slc5-gcc43-opt/lib/libstdc++.so.6(_Znaj+0x1d)[0xf5be734d]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN7TRegexp10GenPatternEPKc+0x26)[0xf750dc96]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN7TRegexpC1EPKcb+0x3a)[0xf750e09a]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN7TSystem10FindHelperEPKcPv+0xe1)[0xf7530e21]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN11TUnixSystem14AccessPathNameEPKc11EAccessMode+0x34)[0xf75bbb04]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN11TUnixSystem10StackTraceEv+0x663)[0xf75c1573]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN11TUnixSystem15DispatchSignalsE8ESignals+0xd7)[0xf75c0de7]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so[0xf75c0efd]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so[0xf75b8252]
[0xffffe500]
[0xb78f32f]
/afs/cern.ch/atlas/software/releases/17.0.2/LCGCMT/LCGCMT_60c/InstallArea/i686-slc5-gcc43-opt/lib/libCore.so(_ZN6TClass10DestructorEPvb+0x62)[0xf7586a92]
/afs/cern.ch/atlas/software/releases/17.0.2/sw/lcg/app/releases/ROOT/5.28.00e/i686-slc5-gcc43-opt/root/lib/libPyROOT.so(_ZN6PyROOT17op_dealloc_nofreeEPNS_11ObjectProxyE+0x48)[0xf7b48058]
/afs/cern.ch/atlas/software/releases/17.0.2/sw/lcg/app/releases/ROOT/5.28.00e/i686-slc5-gcc43-opt/root/lib/libPyROOT.so[0xf7b480a2]
python[0x80a9c70]
python(PyDict_Clear+0x144)[0x808aff4]
python[0x808b04d]
python[0x810b78e]
python(PyGC_Collect+0x32)[0x810c092]
python(Py_Finalize+0x121)[0x80fc5f1]
python(Py_Main+0x526)[0x8058616]
python(main+0x32)[0x8057e42]
/lib/libc.so.6(__libc_start_main+0xdc)[0xb07e9c]
python[0x8057d51]

So something goes wrong when the TPad(?) is deallocated. I tried to reproduce the crash in a minimal Python script and came up with the following:

test.py

import ROOT

def func():
    can=ROOT.TCanvas()
    can.Draw()
    can.cd()

    pad = ROOT.TPad()
    pad.Draw()
    pad.cd()

func()

I noticed that you can do this in a CINT macro without any problems, whether you put the pad/canvas on the stack or the heap. However, this script crashes PyROOT with memory errors during deallocation on both 5.26 and 5.28, so now I am a bit confused and would appreciate some feedback. I’d be happy to post more details from my full program, or more stack traces (you seem to get a different kind of memory error each time), if that would be helpful.

Thanks,
Kevin

Hi,

My best guess (without looking too carefully at the pseudocode) is that the Python side is losing all references to one or more of the canvases, and thus PyROOT deletes them during garbage collection. Either keep a live reference to all canvases or disable the PyROOT ownership tracking for those objects (or do both).
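
The "keep a live reference" option could look roughly like the sketch below, which is a variation of test.py that returns the canvas and pad instead of letting them go out of scope. The function name make_canvas_with_pad and the constructor arguments are illustrative assumptions, not taken from the original program, and whether this alone avoids the crash at interpreter exit is not established in this thread.

```python
def make_canvas_with_pad():
    """Create a canvas with one pad and hand both back to the caller.

    Hypothetical helper (not from the original program): returning the
    objects keeps their Python reference counts above zero, so they are
    not deleted when this function returns.
    """
    # Imported here so the sketch can be loaded without ROOT installed;
    # a top-level "import ROOT" as in test.py behaves the same way.
    import ROOT

    can = ROOT.TCanvas("can", "can")  # name and title are placeholders
    can.Draw()
    can.cd()

    # Pad covering the whole canvas; the coordinates are placeholders.
    pad = ROOT.TPad("pad", "pad", 0.0, 0.0, 1.0, 1.0)
    pad.Draw()
    pad.cd()

    # Return both objects so the caller holds the references.
    return can, pad
```

A caller would hold on to the returned pair for as long as the canvas is needed, for example with can, pad = make_canvas_with_pad(). In the pseudocode above, top_pad and bottom_pad go out of scope when plot_canvas_shared_axis returns; returning them together with canvas would be the analogous change.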

Cheers,
Philippe.

Hi,

yes, what Philippe prescribed:

ROOT.SetOwnership(can, False)
ROOT.SetOwnership(pad, False)

Note that you don’t have to worry about leaks in this particular case: “pad” is handed to “can” underneath in the C++ code, and canvases are cleaned up on program exit or when you close them through their GUI.
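
Applied to test.py, that might look like the following sketch. The two ROOT.SetOwnership calls are the ones given above; the constructor arguments are placeholders added for illustration, and the import is moved inside the function only so the sketch can be loaded without ROOT installed.

```python
def func():
    """Variant of func() from test.py with Python ownership released."""
    # A top-level "import ROOT" as in test.py behaves the same way.
    import ROOT

    can = ROOT.TCanvas("can", "can")  # name and title are placeholders
    # Tell PyROOT not to delete the C++ canvas when the Python
    # reference "can" goes away at the end of this function.
    ROOT.SetOwnership(can, False)
    can.Draw()
    can.cd()

    # Pad covering the whole canvas; the coordinates are placeholders.
    pad = ROOT.TPad("pad", "pad", 0.0, 0.0, 1.0, 1.0)
    # Same for the pad, which the canvas manages on the C++ side.
    ROOT.SetOwnership(pad, False)
    pad.Draw()
    pad.cd()
```

Calling func() at the end of the script, as test.py does, then leaves cleanup of both objects to ROOT rather than to the Python garbage collector.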

Cheers,
Wim

Thanks for the reply!

This bug is still there in the latest Pro version of ROOT 6.

Rene,

not following … which bug? The problem described is where there is ambiguity about the object ownership and that will require user intervention. You could try rootpy.org, though, as they capture several common cases and may have this one handled as well.

Cheers,
Wim