TMVA::Reader segfault when on heap

ROOT Version: 5.28
Platform: CentOS7
Compiler: gcc 4.8.5

Hi ROOTers.
I’m trying to pull some code that uses a TMVA::Reader into our framework, but for some reason code that’s working externally seems to be segmentation faulting in our framework.
After a bunch of fiddling it seems that if the TMVA::Reader class is on the heap, it segfaults, either on construction (if no Option argument is given) or on invoking the BookMVA method if one is given. If the TMVA::Reader is on the stack, it behaves just fine.
Unfortunately in our framework it’s not really practical for the reader to be on the stack.

Any ideas on why this might be, or how to work around it?
The segfault on construction is:

There was a crash.
This is the entire stack trace of all threads:
#0  0x00007f11390bc46c in waitpid () from /lib64/
#1  0x00007f1139039f62 in do_system () from /lib64/
#2  0x00007f113eafb216 in TUnixSystem::StackTrace() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#3  0x00007f113eafcc4c in TUnixSystem::DispatchSignals(ESignals) () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#4  <signal handler called>
#5  0x00007f113914d346 in __memcpy_ssse3_back () from /lib64/
#6  0x00007f113eaa457b in TString::Clone() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#7  0x00007f113eaa460d in TString::ToLower() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#8  0x00007f1139f674bd in TMVA::Tools::CheckForSilentOption (this=<optimized out>, cs=...) at src/Tools.cxx:662
#9  0x00007f1139fc7d0d in TMVA::Reader::DeclareOptions (this=this@entry=0x20b1ce8) at src/Reader.cxx:264
#10 0x00007f1139fc9afc in TMVA::Reader::Reader (this=0x20b1ce8, theOption=..., verbose=<optimized out>) at src/Reader.cxx:141
#11 0x0000000000409e39 in myclass::myclass (this=0x20b1ca0) at main.cpp:8
#12 0x0000000000409d0f in main () at main.cpp:92

and the segfault on BookMVA is:

There was a crash.
This is the entire stack trace of all threads:
#0  0x00007ffa5b47946c in waitpid () from /lib64/
#1  0x00007ffa5b3f6f62 in do_system () from /lib64/
#2  0x00007ffa60eb8216 in TUnixSystem::StackTrace() () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#3  0x00007ffa60eb9c4c in TUnixSystem::DispatchSignals(ESignals) () from /home/skofl/sklib_gcc4.8.5/root_v5.28.00h/lib/
#4  <signal handler called>
#5  0x00007ffa5bf4bfe9 in std::ostream::sentry::sentry(std::ostream&) () from /lib64/
#6  0x00007ffa5bf4c719 in std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long) () from /lib64/
#7  0x00007ffa5c3866c0 in operator<< <std::char_traits<char> > (__s=0x7ffa5c5c42e3 "Booking \"", __out=...) at /usr/include/c++/4.8.2/ostream:535
#8  operator<< <char const*> (arg=0x7ffa5c5c42e3 "Booking \"", this=0xc41adf) at ./TMVA/MsgLogger.h:96
#9  TMVA::Reader::BookMVA (this=0xc40cd0, methodTag=..., weightfile=...) at src/Reader.cxx:369
#10 0x000000000040a2b8 in myclass::Init (this=0xc3d9c0) at main.cpp:36
#11 0x00000000004098df in main () at main.cpp:93

I’ve attached some minimal reproducer code. Things to try:

  1. construct myclass on the stack with a stack-allocated member TMVA::Reader (current). It works.
  2. construct myclass on the heap (comment out the first myclass line, uncomment the second one). It now segfaults.
  3. and/or comment out the TMVA::Reader member and construct tmvaReader in the Init method. It segfaults.
    demo.tgz (2.6 MB)

Note: the model file puts the upload juuust over the 3MB limit, unless I compress it with lzma2 instead of gzip. So the file extension uploaded is .tgz but the archive is actually a .tar.xz :shushing_face:

I have no problem running your code using a more recent ROOT version (e.g. 6.24).
I see you are using a very old version of ROOT, 5.28. I would suggest upgrading if you can.



Hi Lorenzo. That would be the easy solution to many of my ROOT troubles, I agree. :pensive:

Any other suggestions? After trying to implement some workarounds where it actually is on the stack, it seems even that isn’t always reliable and will segfault again with the same error. Is there some sort of memory address limitation?

I did a little digging into the source of the error. The segfault was coming from the TMVA::Reader::BookMVA method - specifically a call to fLogger << Endl (where fLogger is a MsgLogger member object of the Reader).
Internally fLogger << Endl just calls MsgLogger::Send(), which in turn calls MsgLogger::GetFormattedSource(), which was segmentation faulting at the following lines:

   std::string source_name;
   if (fObjSource) source_name = fObjSource->GetName();
   else            source_name = fStrSource;

It seems as though fObjSource was neither null nor valid, somehow :man_shrugging:.
I tried to track the instantiation of fLogger by printing out its memory address on construction within the TMVA::Reader constructor, and then again after each subsequent line within the constructor. Somehow, after calling SetConfigName(GetName()), the address changes.
That seems really weird to me as GetName just returns a constant string (“Reader”), and SetConfigName just sets a string member of the parent Configurable class to the given string. None of it involves logging or any other class members at all, so it seems very fishy that the fLogger address should be changed by that line…
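For anyone who wants to repeat this kind of check, here is a minimal sketch of the before/after comparison. It assumes the logger is held through a pointer member, which is my assumption and not something confirmed in this thread. MockLogger, MockReader and LoggerPointerStable are hypothetical stand-ins, not TMVA classes. In a healthy build the two printed values should be identical, since setting an unrelated string member has no business touching the logger pointer.

```cpp
#include <iostream>
#include <string>

// Hypothetical stand-ins for illustration only; these are NOT the TMVA classes.
struct MockLogger {
    std::string source;
};

class MockReader {
public:
    MockReader() : fLogger(new MockLogger{"Reader"}) {}
    ~MockReader() { delete fLogger; }
    MockReader(const MockReader&) = delete;
    MockReader& operator=(const MockReader&) = delete;

    // Only touches the string member, like the setter described above.
    void SetConfigName(const std::string& name) { fConfigName = name; }

    // Expose the stored pointer value so it can be compared over time.
    const MockLogger* LoggerAddress() const { return fLogger; }

private:
    std::string fConfigName;
    MockLogger* fLogger;  // assumption: logger held by pointer
};

// Records the logger pointer before and after the call and reports
// whether it stayed the same.
bool LoggerPointerStable(MockReader& reader) {
    const MockLogger* before = reader.LoggerAddress();
    reader.SetConfigName("Reader");
    const MockLogger* after = reader.LoggerAddress();
    std::cout << "before=" << before << " after=" << after << '\n';
    return before == after;
}
```

Calling LoggerPointerStable on both a stack-allocated and a heap-allocated MockReader should return true in each case. If the equivalent check against the real class returns false, something is overwriting or misreading the object's memory, and the storage location is unlikely to be the real cause.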
In the end, I was able to work around the segfault by grabbing a copy of fLogger before calling SetConfigName(GetName()) and then restoring it afterwards.
This feels pretty hacky to me, and it still segfaults on construction (i.e. even before the call to BookMVA) if I construct the TMVA::Reader on the stack within my application (although not within the minimal reproducer).
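For reference, the save-and-restore idea can be sketched generically as a small scope guard. This is not the actual change made to Reader.cxx (that patch is not shown here). PointerRestoreGuard, ClobberingCall and RunWithGuard are hypothetical names, and an int pointer stands in for the logger pointer.

```cpp
// Saves the current value of a pointer and writes it back on scope exit.
template <typename T>
class PointerRestoreGuard {
public:
    explicit PointerRestoreGuard(T*& slot) : fSlot(slot), fSaved(slot) {}
    ~PointerRestoreGuard() { fSlot = fSaved; }
    PointerRestoreGuard(const PointerRestoreGuard&) = delete;
    PointerRestoreGuard& operator=(const PointerRestoreGuard&) = delete;

private:
    T*& fSlot;   // the pointer variable being protected
    T*  fSaved;  // the value it held when the guard was created
};

// Hypothetical stand-in for a call that unexpectedly changes the pointer.
void ClobberingCall(int*& p) { p = nullptr; }

// Runs the clobbering call under the guard and returns the pointer
// as it stands once the guard has gone out of scope.
int* RunWithGuard(int* original) {
    int* logger = original;
    {
        PointerRestoreGuard<int> guard(logger);
        ClobberingCall(logger);  // logger is nullptr at this point
    }                            // guard destructor restores the saved value
    return logger;
}
```

A guard like this only hides the symptom. Whatever overwrote the pointer may have damaged neighbouring members as well, so it does not make the object's other contents trustworthy.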

It seems to me as though something is corrupting the memory within the TMVA::Reader class, which leaves me pretty wary of its outputs…

Unfortunately no, I’ve just used the workaround in my last post. So far the results seem to be as expected, but it hasn’t really been used in earnest.

Are you also using 5.34? This issue is not present in recent ROOT versions, and upgrading to a newer version is the recommended way of solving this problem.

Did you find a fix for this issue? I ran into the same problem recently, but got no response from anyone and couldn’t find any troubleshooting for it on Google.