Segfault on application termination: TClass::SetUnloaded

marc1uk · July 8, 2021, 11:01pm

ROOT Version: 5.28
Platform: CentOS7
Compiler: gcc 4.8.5

I recently finished off some code that uses a TInterpreter to perform some runtime actions, and everything seemed to be fine. But after tidying up my code to remove all the debug chaff and moving it to a new class, the code still works, but the application now crashes on termination with:

Program received signal SIGSEGV, Segmentation fault.
0x00007fffe7c7ae91 in TClass::SetUnloaded() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/libCore.so
Missing separate debuginfos, use: debuginfo-install bzip2-libs-1.0.6-13.el7.x86_64 freetype-2.8-14.el7.x86_64 glibc-2.17-307.el7.1.x86_64 libX11-1.6.7-2.el7.x86_64 libXau-1.0.8-2.1.el7.x86_64 libgcc-4.8.5-39.el7.x86_64 libgfortran-4.8.5-39.el7.x86_64 libpng-1.5.13-7.el7_2.x86_64 libquadmath-4.8.5-39.el7.x86_64 libstdc++-4.8.5-39.el7.x86_64 libxcb-1.13-1.el7.x86_64 ncurses-libs-5.9-14.20130511.el7_4.x86_64 nss-softokn-freebl-3.44.0-8.el7_7.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-18.el7.x86_64
(gdb) bt
#0  0x00007fffe7c7ae91 in TClass::SetUnloaded() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/libCore.so
#1  0x00007fffe7c50aca in ROOT::RemoveClass(char const*) () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/libCore.so
#2  0x00007fffe7c51f30 in ROOT::TGenericClassInfo::~TGenericClassInfo() ()
   from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/libCore.so
#3  0x00007fffe08afce9 in __run_exit_handlers () from /lib64/libc.so.6
#4  0x00007fffe08afd37 in exit () from /lib64/libc.so.6
#5  0x00007fffe089855c in __libc_start_main () from /lib64/libc.so.6
#6  0x0000000000404d53 in _start ()

I can reproduce this with my development code if I delete my TInterpreter before the application closes (I had accidentally left it to leak before now). With the same code in a new (functionally identical) class I now get a crash even if I don’t delete my TInterpreter.

In a nutshell the application here is firing up a TInterpreter (TCint), loading a shared library that defines a class, and invoking some class methods.
Are there any obvious places to start? (Aside from recompiling ROOT in debug mode…)

pcanal · July 9, 2021, 12:23am

Why do you create your own TInterpreter? Isn’t the default one (via gInterpreter) sufficient?

marc1uk · July 9, 2021, 8:49am

My interpreted code invokes a templated method where the template class is only known at runtime. That means the specific template instantiation may not be present in the dictionary currently loaded by the interpreter. So, if necessary, I update the Linkdef file, rebuild the dictionary, and reload the library containing that dictionary, all at runtime. I tried using

gInterpreter->UnloadFile("libMyClass_RootDict.so");
gInterpreter->Load("libMyClass_RootDict.so");

but it didn’t seem to pick up the changes. I also tried

gInterpreter->Reset();
gInterpreter->Load("libMyClass_RootDict.so");

but that segfaulted when I tried to use the class. Only by using my own TInterpreter instance and doing

if(meInterpreter) delete meInterpreter;
meInterpreter = new TCint("myInterpreter","Good Interpreter");
meInterpreter->Load("libMyClass_RootDict.so");

did it work as expected.

pcanal · July 9, 2021, 4:22pm

humm odd …

> gInterpreter->Reset();

That is likely to be too harsh (it removes too much).

> Only by using my own TInterpreter instance and doing

I don’t recall that we have really tested this in a while (in v5.34) …

> So, if necessary, I update the Linkdef file, rebuild the dictionary,

Instead you could:

1. Write a ‘new’ linkdef file containing ‘only’ the pragmas for the new function template instances. (Note: don’t add a pragma for anything, including the class, that already has a loaded dictionary.)
2. Compile and link it under a ‘unique’ library name.
3. Load that library.

and it should work. (If you want to make the change ‘permanent’, I guess you could also update the original LinkDef so that it is picked up and still used the next time you run.)
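As a rough sketch of the first two steps, the helper below only builds the text of an incremental LinkDef and a fresh library name. The function names, the `_inc` naming scheme and the example signature `MyClass::Get<Foo>(const char*)` are illustrative assumptions, not anything from the code discussed here, and the exact spelling rootcint expects for a given template instance may differ. Running rootcint and the compiler on the generated file is not shown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Build the text of an incremental LinkDef that requests dictionaries ONLY
// for the listed function template instances. Anything that already has a
// loaded dictionary (including the class itself) must not be listed here.
std::string makeIncrementalLinkDef(const std::vector<std::string>& newInstances)
{
    assert(!newInstances.empty() && "nothing to generate");
    std::string text = "#ifdef __CINT__\n";
    for (const std::string& inst : newInstances) {
        text += "#pragma link C++ function " + inst + ";\n";
    }
    text += "#endif\n";
    return text;
}

// Derive a library name that differs for every generation, so each
// incremental dictionary ends up in its own, never-reloaded library.
// The "_inc<N>" suffix is a hypothetical naming scheme.
std::string uniqueLibraryName(const std::string& stem, unsigned generation)
{
    return "lib" + stem + "_inc" + std::to_string(generation) + ".so";
}
```

Once the generated LinkDef has been turned into a library with that name, the last step would be a single `gInterpreter->Load(...)` on it, leaving the original dictionary library loaded and untouched.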

marc1uk · July 11, 2021, 12:15pm

Even using the gInterpreter I’m getting the same behaviour.

pcanal · July 12, 2021, 4:57pm

Even when no longer unloading but using “incremental” dictionary libraries?

marc1uk · July 13, 2021, 2:58pm

Yep. Reloading is only needed when a new class is encountered at runtime - I just made sure no new classes were encountered and replaced the

meInterpreter = new TCint("meInterpreter", "title");

with

meInterpreter = (TCint*)gInterpreter;

So all it’s actually doing at this point is:

gInterpreter->Load("myLib.so");
gInterpreter->ProcessLine("some c stuff...");

I’m a bit busy at the moment and need to make some progress elsewhere, but I’ll try to investigate further into exactly what it’s doing now that triggers the segfault on termination.

jalopezg · July 13, 2021, 4:27pm

Hi @marc1uk,

Given that, and maybe I’m saying the obvious, just be sure to not issue a delete meInterpreter; if it is pointing to the global interpreter instance.
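One way to make that hard to get wrong is to record, next to the pointer, whether the code created the instance itself. The sketch below uses a stand-in type `FakeInterpreter` and a hypothetical wrapper `InterpreterHandle`; neither is a ROOT class, and this is only an illustration of the ownership idea, not a recommendation from the ROOT team.

```cpp
#include <cassert>

// Hypothetical stand-in for an interpreter type; NOT the ROOT class.
struct FakeInterpreter {
    static int liveCount;                 // number of instances currently alive
    FakeInterpreter()  { ++liveCount; }
    ~FakeInterpreter() { --liveCount; }
};
int FakeInterpreter::liveCount = 0;

// Keeps a pointer together with a flag saying whether this handle created
// the instance and is therefore responsible for deleting it.
class InterpreterHandle {
public:
    // Borrow an existing (e.g. global) instance: it is never deleted here.
    static InterpreterHandle borrow(FakeInterpreter* global)
    {
        assert(global != nullptr);
        return InterpreterHandle(global, false);
    }
    // Create a private instance: it is deleted together with the handle.
    static InterpreterHandle create()
    {
        return InterpreterHandle(new FakeInterpreter, true);
    }

    InterpreterHandle(InterpreterHandle&& other)
        : ptr_(other.ptr_), owned_(other.owned_)
    {
        other.ptr_ = nullptr;
        other.owned_ = false;
    }
    InterpreterHandle(const InterpreterHandle&) = delete;
    InterpreterHandle& operator=(const InterpreterHandle&) = delete;

    ~InterpreterHandle() { if (owned_) delete ptr_; }

    FakeInterpreter* get() const { return ptr_; }

private:
    InterpreterHandle(FakeInterpreter* p, bool owned) : ptr_(p), owned_(owned) {}
    FakeInterpreter* ptr_;
    bool owned_;
};
```

With something like this, switching between a private instance and the global one changes only which factory function is called, and the cleanup code no longer needs to know which case it is in.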

Cheers,
J.

marc1uk · July 13, 2021, 5:17pm

Yes, I’m not deleting the gInterpreter.

marc1uk · July 24, 2021, 2:09pm

OK, I came back to this today to try to narrow down what might have been causing the problems. Having made no changes (I wasn’t even working on this code repository), I’m no longer getting this error: not with the gInterpreter, not with my own instance of TCint, and neither with nor without one or more rounds of dictionary rebuilding & reloading. Everything works like a charm.
So I’ll happily call this closed.
Thanks again to all who offered suggestions and help.

pcanal · July 24, 2021, 7:53pm

This may just be another incarnation of the random/arbitrary behavior (i.e. it might just be hidden by luck and might reappear later). You can run your (previously failing) example under valgrind (valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp --leak-check=no your_executable your_arguments) to see if it detects any undefined behavior (like use after deletion).
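To illustrate the kind of “use after deletion” meant here, the sketch below models an object that unregisters itself from a table in its destructor, possibly after the table is already gone. `ClassTable` and `Registration` are hypothetical stand-ins, not ROOT’s actual data structures, and this is not a diagnosis of the crash above. With a raw pointer to the table the destructor would be undefined behavior that may or may not crash on a given run; the `std::weak_ptr` version checks first.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical stand-in for a table of loaded classes.
struct ClassTable {
    std::set<std::string> names;
};

// Hypothetical stand-in for an object that registers a class name on
// construction and unregisters it on destruction (possibly very late,
// for example while the program is exiting).
class Registration {
public:
    Registration(const std::shared_ptr<ClassTable>& table, const std::string& name)
        : table_(table), name_(name)
    {
        assert(table && "need a live table to register with");
        table->names.insert(name_);
    }
    Registration(const Registration&) = delete;
    Registration& operator=(const Registration&) = delete;

    ~Registration()
    {
        // lock() returns an empty pointer if the table was already
        // destroyed, so a dangling table is never dereferenced.
        if (std::shared_ptr<ClassTable> table = table_.lock()) {
            table->names.erase(name_);
        }
    }

    // True while the table this registration refers to still exists.
    bool tableAlive() const { return !table_.expired(); }

private:
    std::weak_ptr<ClassTable> table_;
    std::string name_;
};
```

If the table is destroyed first, the registration’s destructor simply skips the unregistration instead of touching freed memory, which is the access valgrind would otherwise be expected to flag as an invalid read.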

marc1uk · August 2, 2021, 12:44pm

Good idea, I’ll give that a go shortly. For the time being, I have yet another thread open on a new topic…!
