Intermittent shutdown memory crash in compiled ROOT app (bus error / segfault / malloc smallbin corruption) with minimal reproducer

Hello ROOT team,

I am debugging a compiled C++ ROOT application (not a macro). The analysis runs, finishes all processing, and produces the expected outputs; it crashes only at shutdown/exit.

The failure mode varies from run to run, but it is always one of these three errors:

  • bus error
  • segmentation fault
  • malloc(): smallbin double linked list corrupted

I made a reduced reproducer that keeps the same ownership/lifetime pattern and drawing structure, but removes the large loops and removes custom classes.
Reproducer file: ReproduceBug.cxx (5.4 KB)

What the reproducer does:

  • opens a ROOT file and gets objects
  • calls SetDirectory(nullptr) on histograms
  • reads a POT-like scaling object
  • closes/deletes the input TFile early
  • continues with projections/clones/canvas/pad/legend/latex drawing
  • no output files written
  • no heavy loops

The reproducer also runs without any input file, and it reproduces the segmentation fault in both cases (with and without inputs).

Could you help identify which ownership/cleanup pattern is most likely causing shutdown-time crashes?
Related to this, I also have some general good-practice questions:

  1. Is closing/deleting the TFile in the code safe with this object usage pattern? Or should one just let ROOT delete the objects when the executable exits?
  2. Are there known pitfalls with histogram clones + canvas/pad primitive ownership during ROOT teardown?
  3. Is there a recommended teardown order in compiled ROOT apps to avoid this type of late heap corruption?
  4. Which debugging path do you recommend first for these cases (ASan, valgrind, gdb with ROOT symbols), and what to inspect first?

If useful, I can also share:

  • exact stack trace from gdb
  • ASan or valgrind output from the reproducer.

Thanks in advance.


I am using ROOT from cvmfs:
source /cvmfs/sft.cern.ch/lcg/views/LCG_105/x86_64-el9-gcc11-opt/setup.sh

ROOT Version: 6.30/02
Platform: linuxx8664gcc
Compiler: g++ (GCC) 11.3.0


Either:
// delete p_inner;
or:
delete p_inner; delete c_main;

Dear @Lorenzo ,

Indeed, @Wile_E_Coyote is right: you simply got the destruction order wrong for the heap-allocated objects that you manage yourself.

As general advice, I would discourage such patterns: use stack-based variables as much as possible, or always use std::unique_ptr for heap-allocated objects.
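
To illustrate the point about destruction order, here is a minimal sketch in plain C++. It does not use ROOT: ToyCanvas and ToyPad are hypothetical stand-ins, and the model assumes only what the two suggested fixes imply, namely that the parent deletes any child still registered with it, and that a child unregisters itself when it is destroyed. Under that assumption, deleting the child first, or not deleting it at all, is safe, while deleting the parent first and the child afterwards would free the child twice.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct ToyCanvas;  // hypothetical parent object (NOT a ROOT class)

// Hypothetical child object that registers itself with its parent.
struct ToyPad {
    ToyCanvas* parent;
    explicit ToyPad(ToyCanvas* p);
    ~ToyPad();
};

struct ToyCanvas {
    std::vector<ToyPad*> children;   // pads still registered with this canvas
    std::vector<std::string>* log;   // records the destruction order
    explicit ToyCanvas(std::vector<std::string>* l) : log(l) {}
    ~ToyCanvas() {
        // The parent deletes every child that is still registered.
        // Each ~ToyPad removes itself from 'children', so the loop ends.
        while (!children.empty()) delete children.back();
        log->push_back("canvas destroyed");
    }
};

ToyPad::ToyPad(ToyCanvas* p) : parent(p) {
    assert(parent != nullptr);
    parent->children.push_back(this);
}

ToyPad::~ToyPad() {
    // Unregister from the parent so that it will not delete this pad again.
    auto& c = parent->children;
    c.erase(std::remove(c.begin(), c.end(), this), c.end());
    parent->log->push_back("pad destroyed");
}

// Fix 1: delete the child first, then the parent.
std::vector<std::string> manual_pad_then_canvas() {
    std::vector<std::string> log;
    auto* c_main = new ToyCanvas(&log);
    auto* p_inner = new ToyPad(c_main);
    delete p_inner;
    delete c_main;
    // The reverse order (delete c_main; delete p_inner;) would delete
    // p_inner twice in this model, which is undefined behavior.
    return log;
}

// Fix 2: never delete the child explicitly; the parent cleans it up.
std::vector<std::string> canvas_only() {
    std::vector<std::string> log;
    auto* c_main = new ToyCanvas(&log);
    new ToyPad(c_main);  // owned by c_main from here on
    delete c_main;
    return log;
}

// Scoped ownership: locals are destroyed in reverse declaration order,
// so the pad goes first and the canvas second, without any manual delete.
std::vector<std::string> scoped_unique_ptr() {
    std::vector<std::string> log;
    {
        auto c_main = std::make_unique<ToyCanvas>(&log);
        auto p_inner = std::make_unique<ToyPad>(c_main.get());
    }
    return log;
}
```

All three functions record the same order, pad first and canvas second. The scoped variant gets that order from the language rules rather than from hand-written delete statements, which is the practical benefit of the advice above. How real ROOT classes register and delete their primitives is not modelled here and should be checked against the ROOT documentation.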

Cheers,
Vincenzo