Tracking PyROOT calls in Valgrind

Hi all,

I am debugging a rather “nasty” double free in a project that uses PyROOT.

I reached for Valgrind, which is helping, but not enough.

What I see is something like this:

==24168== Invalid read of size 1
==24168==    at 0x118734C9: THashList::RecursiveRemove(TObject*) (THashList.cxx:353)
==24168==    by 0x11742072: TROOT::RecursiveRemove(TObject*) (TROOT.cxx:2484)
==24168==    by 0x11804D7A: CallRecursiveRemoveIfNeeded (TROOT.h:404)
==24168==    by 0x11804D7A: TNamed::~TNamed() (TNamed.cxx:45)
==24168==    by 0x105AA758: TMemFile::~TMemFile() (TMemFile.cxx:226)
==24168==    by 0x1187DC7E: TList::Delete(char const*) (TList.cxx:505)
==24168==    by 0x11742294: TROOT::~TROOT() (TROOT.cxx:937)
==24168==    by 0x61D9C28: __run_exit_handlers (in /usr/lib64/libc-2.17.so)
==24168==    by 0x61D9C76: exit (in /usr/lib64/libc-2.17.so)
==24168==    by 0x11912F6C: TUnixSystem::Exit(int, bool) (TUnixSystem.cxx:2160)
==24168==    by 0x119185B1: TUnixSystem::DispatchSignals(ESignals) (TUnixSystem.cxx:3647)
==24168==    by 0x525B5CF: ??? (in /usr/lib64/libpthread-2.17.so)
==24168==    by 0x3C98F730: genfit::Track::Clear(char const*) (Track.cc:187)
==24168==    by 0x3C98F819: genfit::Track::~Track() (Track.cc:173)
==24168==    by 0x3C98F938: genfit::Track::~Track() (Track.cc:174)
==24168==    by 0x118B9560: TClass::Destructor(void*, bool) (TClass.cxx:5152)
==24168==    by 0x1186D72D: R__ReleaseMemory (TClonesArray.cxx:143)
==24168==    by 0x1186D72D: TClonesArray::~TClonesArray() (TClonesArray.cxx:256)
==24168==    by 0x1186D7C8: TClonesArray::~TClonesArray() (TClonesArray.cxx:264)
==24168==    by 0x118B9560: TClass::Destructor(void*, bool) (TClass.cxx:5152)
==24168==    by 0xD5213A6: PyROOT::op_dealloc_nofree(PyROOT::ObjectProxy*) (ObjectProxy.cxx:59)
==24168==    by 0xD521418: PyROOT::(anonymous namespace)::op_dealloc(PyROOT::ObjectProxy*) (ObjectProxy.cxx:212)
==24168==    by 0x4EF54A1: subtype_dealloc (typeobject.c:1030)
==24168==    by 0x4ED4D4A: dict_dealloc (dictobject.c:1010)
==24168==    by 0x4E9FB29: instance_dealloc (classobject.c:681)
==24168==    by 0x4ED5736: insertdict_by_entry (dictobject.c:519)
==24168==    by 0x4ED5736: insertdict (dictobject.c:556)
==24168==    by 0x4ED7116: dict_set_item_by_hash_or_entry (dictobject.c:765)
==24168==    by 0x4ED7116: PyDict_SetItem (dictobject.c:818)
==24168==    by 0x4EDB22B: _PyModule_Clear (moduleobject.c:139)
==24168==    by 0x4F5876A: PyImport_Cleanup (import.c:477)
==24168==    by 0x4F6AC6D: Py_Finalize (pythonrun.c:459)
==24168==    by 0x4F8102B: Py_Main (main.c:665)
==24168==    by 0x61C2494: (below main) (in /usr/lib64/libc-2.17.so)
==24168==  Address 0x40b9d75f is 15 bytes inside a block of size 768 free'd
==24168==    at 0x4C2B16D: operator delete(void*) (vg_replace_malloc.c:576)
==24168==    by 0x41927A75: FairRootManager::~FairRootManager() (FairRootManager.cxx:167)
==24168==    by 0x11CEC5A5: (anonymous namespace)::run(void*) (atexit_thread.cc:71)
==24168==    by 0x61D9C28: __run_exit_handlers (in /usr/lib64/libc-2.17.so)
==24168==    by 0x61D9C76: exit (in /usr/lib64/libc-2.17.so)
==24168==    by 0x11912F6C: TUnixSystem::Exit(int, bool) (TUnixSystem.cxx:2160)
==24168==    by 0x119185B1: TUnixSystem::DispatchSignals(ESignals) (TUnixSystem.cxx:3647)
==24168==    by 0x525B5CF: ??? (in /usr/lib64/libpthread-2.17.so)
==24168==    by 0x3C98F730: genfit::Track::Clear(char const*) (Track.cc:187)
==24168==    by 0x3C98F819: genfit::Track::~Track() (Track.cc:173)
==24168==    by 0x3C98F938: genfit::Track::~Track() (Track.cc:174)
==24168==    by 0x118B9560: TClass::Destructor(void*, bool) (TClass.cxx:5152)
==24168==    by 0x1186D72D: R__ReleaseMemory (TClonesArray.cxx:143)
==24168==    by 0x1186D72D: TClonesArray::~TClonesArray() (TClonesArray.cxx:256)
==24168==    by 0x1186D7C8: TClonesArray::~TClonesArray() (TClonesArray.cxx:264)
==24168==    by 0x118B9560: TClass::Destructor(void*, bool) (TClass.cxx:5152)
==24168==    by 0xD5213A6: PyROOT::op_dealloc_nofree(PyROOT::ObjectProxy*) (ObjectProxy.cxx:59)
==24168==    by 0xD521418: PyROOT::(anonymous namespace)::op_dealloc(PyROOT::ObjectProxy*) (ObjectProxy.cxx:212)
==24168==    by 0x4EF54A1: subtype_dealloc (typeobject.c:1030)
==24168==    by 0x4ED4D4A: dict_dealloc (dictobject.c:1010)
==24168==    by 0x4E9FB29: instance_dealloc (classobject.c:681)
==24168==    by 0x4ED5736: insertdict_by_entry (dictobject.c:519)
==24168==    by 0x4ED5736: insertdict (dictobject.c:556)
==24168==    by 0x4ED7116: dict_set_item_by_hash_or_entry (dictobject.c:765)
==24168==    by 0x4ED7116: PyDict_SetItem (dictobject.c:818)
==24168==    by 0x4EDB22B: _PyModule_Clear (moduleobject.c:139)
==24168==    by 0x4F5876A: PyImport_Cleanup (import.c:477)
==24168==    by 0x4F6AC6D: Py_Finalize (pythonrun.c:459)
==24168==    by 0x4F8102B: Py_Main (main.c:665)
==24168==    by 0x61C2494: (below main) (in /usr/lib64/libc-2.17.so)
==24168==  Block was alloc'd at
==24168==    at 0x4C2A1E3: operator new(unsigned long) (vg_replace_malloc.c:334)
==24168==    by 0x118202F8: TStorage::ObjectAlloc(unsigned long) (TStorage.cxx:330)
==24168==    by 0x4192718F: operator new (TObject.h:152)
==24168==    by 0x4192718F: FairRootManager::FairRootManager() (FairRootManager.cxx:121)
==24168==    by 0x4192797C: FairRootManager::Instance() (FairRootManager.cxx:87)
==24168==    by 0x4192CD47: FairRun::FairRun(bool) (FairRun.cxx:59)
==24168==    by 0x4193B159: FairRunSim::FairRunSim(bool) (FairRunSim.cxx:77)
==24168==    by 0x45C8C066: ???
==24168==    by 0xD513B60: FastCall(long, void*, void*, void*) (Cppyy.cxx:407)
==24168==    by 0xD51499B: Cppyy::CallConstructor(long, long, void*) (Cppyy.cxx:487)
==24168==    by 0xD516FC2: GILCallConstructor (Executors.cxx:85)
==24168==    by 0xD516FC2: PyROOT::TConstructorExecutor::Execute(long, void*, PyROOT::TCallContext*) (Executors.cxx:623)
==24168==    by 0xD53A101: CallFast (TMethodHolder.cxx:69)
==24168==    by 0xD53A101: PyROOT::TMethodHolder::CallSafe(void*, long, PyROOT::TCallContext*) (TMethodHolder.cxx:121)
==24168==    by 0xD539B88: PyROOT::TMethodHolder::Execute(void*, long, PyROOT::TCallContext*) (TMethodHolder.cxx:528)
==24168==    by 0xD53563F: PyROOT::TConstructorHolder::Call(PyROOT::ObjectProxy*&, _object*, _object*, PyROOT::TCallContext*) (TConstructorHolder.cxx:65)
==24168==    by 0xD51DC58: PyROOT::(anonymous namespace)::mp_call(PyROOT::MethodProxy*, _object*, _object*) (MethodProxy.cxx:597)
==24168==    by 0x4E8EB12: PyObject_Call (abstract.c:2529)
==24168==    by 0x4EFB9B1: slot_tp_init (typeobject.c:5709)
==24168==    by 0x4EFA33D: type_call (typeobject.c:745)
==24168==    by 0x4E8EB12: PyObject_Call (abstract.c:2529)
==24168==    by 0x4F426EE: do_call (ceval.c:4253)
==24168==    by 0x4F426EE: call_function (ceval.c:4058)
==24168==    by 0x4F426EE: PyEval_EvalFrameEx (ceval.c:2681)
==24168==    by 0x4F457AB: PyEval_EvalCodeEx (ceval.c:3267)
==24168==    by 0x4F458A8: PyEval_EvalCode (ceval.c:669)
==24168==    by 0x4F69EA9: run_mod (pythonrun.c:1371)
==24168==    by 0x4F69EA9: PyRun_FileExFlags (pythonrun.c:1357)
==24168==    by 0x4F6B284: PyRun_SimpleFileExFlags (pythonrun.c:949)
==24168==    by 0x4F816C0: Py_Main (main.c:640)
==24168==    by 0x61C2494: (below main) (in /usr/lib64/libc-2.17.so)

Where we can see, reading from the bottom up for each trace:

  1. A system part
  2. A Python part that starts with Py_Main
  3. A C++ part (longer addresses)

From this information alone I cannot tell which Python code is being invoked.

I can see the precise C++ files and methods, but I don’t see the same information for the Python code.

Is there anything we can do here?

I am afraid the answer is “No”, but it is still worth asking.

Many thanks,
Simone

Don’t know the status of gdb these days, but both Intel’s and Microsoft’s debuggers can trace natively through both Python and C/C++ code.
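
Short of a mixed-mode debugger, a lighter-weight option might be to have Python log its own calls to stderr, which is where Valgrind writes by default, so the two outputs interleave in order. The following is only a minimal sketch; `make_call_logger` and the `PYCALL` prefix are names made up for illustration, not part of PyROOT or Valgrind.

```python
import sys


def make_call_logger(stream=sys.stderr):
    """Return a trace function that logs every Python-level function call."""

    def tracer(frame, event, arg):
        if event == "call":
            code = frame.f_code
            # One line per call: file, line of the definition, function name.
            stream.write("PYCALL %s:%d %s\n"
                         % (code.co_filename, frame.f_lineno, code.co_name))
            stream.flush()  # keep ordering relative to Valgrind's output
        return None  # no per-line tracing inside the called frame

    return tracer
```

Calling `sys.settrace(make_call_logger())` near the top of the script enables it. Since the invalid read above happens inside Py_Finalize, no Python function is running at that point; the last `PYCALL` lines before the Valgrind report only show what ran just before shutdown.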

Anyway, you’re looking at a double delete during shutdown: cleanup of files during exit handling (b/c of __run_exit_handlers) colliding with normal Python object removal. Look for a TClonesArray of Tracks that lives as a variable at the global level of a Python module (b/c of _PyModule_Clear) and was handed to it by FairRootManager (which constructed it, per Valgrind).

It would surprise me, though, if Python took ownership of the return value of a function, so more likely there’s an inadvertent pointer copy of the Tracks.
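
To hunt for such a module-level variable, something along these lines could be run just before the script ends. It is a rough sketch: `find_module_level` and `is_named` are hypothetical helpers, and matching on `type(obj).__name__ == "TClonesArray"` is an assumption about how the proxy class is named. Because the trace shows instance_dealloc and dict_dealloc above the PyROOT frames, the array may well be an attribute of a module-level instance rather than a global itself, so the sketch also looks one level into instance attributes.

```python
import sys
import types


def is_named(type_name):
    """Build a predicate matching objects whose class has the given name."""
    return lambda obj: type(obj).__name__ == type_name


def _matches(predicate, obj):
    # Exotic proxy objects may raise on inspection; treat that as "no match".
    try:
        return bool(predicate(obj))
    except Exception:
        return False


def _instance_dict(obj):
    # Skip modules and classes; only plain instances are of interest here.
    if isinstance(obj, (type, types.ModuleType)):
        return {}
    try:
        attrs = getattr(obj, "__dict__", None)
    except Exception:
        return {}
    return attrs if isinstance(attrs, dict) else {}


def find_module_level(predicate):
    """Yield (module_name, path) for matching globals or their attributes."""
    for mod_name, module in list(sys.modules.items()):
        namespace = getattr(module, "__dict__", None)
        if not isinstance(namespace, dict):
            continue
        for var_name, obj in list(namespace.items()):
            if _matches(predicate, obj):
                yield mod_name, var_name
            for attr_name, attr in list(_instance_dict(obj).items()):
                if _matches(predicate, attr):
                    yield mod_name, var_name + "." + attr_name
```

For example, `for hit in find_module_level(is_named("TClonesArray")): print(hit)` lists candidate locations. Any hit is only a lead to inspect, not proof that it is the object being deleted twice.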


==24168==    by 0x105AA758: TMemFile::~TMemFile() (TMemFile.cxx:226)
==24168==    by 0x1187DC7E: TList::Delete(char const*) (TList.cxx:505)
==24168==    by 0x11742294: TROOT::~TROOT() (TROOT.cxx:937)

Somewhere FairRoot (or the end-user code) is creating a TMemFile that is never explicitly deleted. Deleting that TMemFile object explicitly at the appropriate place should get rid of this problem.
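
If the TMemFile turns out to be created from the Python side, the general pattern would be to release it at a well-defined point instead of leaving it to exit handlers. The sketch below only illustrates that ordering with a stand-in class; `FakeMemFile` and `run_job` are hypothetical and do not use ROOT at all. If the file is created in C++, the equivalent fix belongs there instead.

```python
from contextlib import ExitStack


class FakeMemFile:
    """Hypothetical stand-in for an in-memory file object such as a TMemFile."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def Close(self):
        # Record the release so the ordering can be checked.
        self.log.append("closed " + self.name)


def run_job(log):
    """Do some work with a file and release it before the function returns."""
    with ExitStack() as cleanup:
        mem_file = FakeMemFile("scratch", log)
        cleanup.callback(mem_file.Close)  # runs when the with-block exits
        log.append("working with " + mem_file.name)
    log.append("job finished")
    return log
```

Here the file is closed as soon as the `with` block ends, even if the work raises, so nothing is left for interpreter or library teardown to clean up in an unpredictable order.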