Getting a useful crash dump when an exception is thrown

gwatts · March 3, 2018, 6:15pm

When my code throws an exception, I’d like to get a useful stack dump, specifically the line where the exception was thrown. For example, I use the vector::at method to access my vector arrays to make sure I don’t go past the end. In the middle of a very long script I get an exception because I violate this, but I get no useful information other than that the exception was thrown from ROOT.

I’ve attached crash.cxx (196 Bytes) as an example of code that generates this kind of exception. The code is very simple:

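#include <iostream>
#include <vector>

using namespace std;

int crash()
{
  vector<int> dude;
  dude.push_back(10);
  cout << "hi" << endl;
  cout << dude.at(10) << endl;  // index 10 is out of range: throws std::out_of_range
  return dude.size();
}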

I then run root and get the error:

gwatts@FourByFour:~/root-binaries/testcrash$ root -b
   ------------------------------------------------------------
  | Welcome to ROOT 6.10/02                http://root.cern.ch |
  |                               (c) 1995-2017, The ROOT Team |
  | Built for linuxx8664gcc                                    |
  | From tag v6-10-02, 6 July 2017                             |
  | Try '.help', '.demo', '.license', '.credits', '.quit'/'.q' |
   ------------------------------------------------------------

root [0] .L crash.cxx++
Info in <TUnixSystem::ACLiC>: creating shared library /home/gwatts/root-binaries/testcrash/./crash_cxx.so
root [1] crash()
hi
Error in <TRint::HandleTermInput()>: std::out_of_range caught: vector::_M_range_check: __n (which is 10) >= this->size() (which is 1)
root [2]

When I get that error, I’d like to know what line number it came from, or something else useful about the site of the error. A stack dump would be amazing, since this is all compiled code.

I’m running on Ubuntu, 6.10/02. Many thanks in advance!!

Wile_E_Coyote · March 3, 2018, 6:22pm

gwatts · March 3, 2018, 9:08pm

I run this automatically on hundreds of files, so I don’t want to run valgrind or the debugger on everything. Normally, when there is an exception, there is a way to get a crash dump. Is that not the case here? Am I really stuck with valgrind or the debugger?

wlav · March 3, 2018, 11:04pm

Running under gdb will be faster than valgrind, but if that’s also too inconvenient because of the number of files, you can try the following code: trace_except2.cxx (4.1 KB)

Build it into a shared library, then preload it. Run it, and use addr2line to get the line from the address + file name that is printed.

gwatts · March 4, 2018, 9:04am

Thanks. Ok, the problem is I have about 100 different processes running. And this is a generic problem - so I was hoping for a generic solution. Something I could look at in a log file. For example, a crash dump from root when the exception is thrown. My ATLAS code gives a crash dump… It seems like ROOT must be doing some extra work to suppress this sort of information.

For this I’ll use one of these techniques to solve my current problem. Hopefully there is something that can be done to solve the generic problem.

wlav · March 4, 2018, 3:36pm

No conspiracy, just how C++ exceptions work: the exception in your case is uncaught until it reaches the ROOT interpreter. At that point, the stack has been unwound and no further information about it is available.
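To make the unwinding point concrete, here is a minimal sketch, not ROOT code: `FrameMarker`, `g_active_frames`, `inner`, `outer` and `frames_visible_in_handler` are hypothetical names introduced only for illustration. The markers stand in for stack frames, and the handler plays the role of the interpreter catching far away from the throw site.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical bookkeeping: names of the functions currently "on the stack".
static std::vector<std::string> g_active_frames;

// RAII marker: registers a frame on entry and removes it when the frame is
// left, including when it is left because an exception propagates through it.
struct FrameMarker {
    explicit FrameMarker(const char* name) { g_active_frames.emplace_back(name); }
    ~FrameMarker() { g_active_frames.pop_back(); }
};

int inner() {
    FrameMarker m("inner");
    std::vector<int> dude{10};
    return dude.at(10);  // throws std::out_of_range
}

int outer() {
    FrameMarker m("outer");
    return inner();
}

// Stand-in for the interpreter: catches far away from the throw site and
// reports how many frames are still alive when the handler body runs.
std::size_t frames_visible_in_handler() {
    try {
        outer();
    } catch (const std::out_of_range&) {
        // Unwinding has already destroyed the markers of inner() and outer().
        return g_active_frames.size();
    }
    return g_active_frames.size();
}
```

Calling `frames_visible_in_handler()` returns 0: by the time the handler runs, nothing is left of the frames between the throw and the catch, which is the situation ROOT is in when it reports the exception.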

All the tools proposed print the trace at the point of throwing. An alternative, and I think a more general one in the sense you want, is to store the stack trace at the point of throwing instead of printing it, and then print it only if the exception reaches the interpreter uncaught. Of course, code that uses lots of exceptions will be heavily penalized that way.
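A rough sketch of the "store at the throw, print only at the top" idea follows. It is not the attached trace_except2.cxx and it does not hook the C++ ABI, so it only covers exceptions of its own type. It assumes glibc's `<execinfo.h>` is available; `TracedError`, `checked_get` and `run_and_report` are hypothetical names.

```cpp
#include <execinfo.h>  // backtrace(), backtrace_symbols(): glibc-specific

#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type that snapshots the call stack when it is
// constructed, i.e. at the throw site, before any unwinding happens.
class TracedError : public std::runtime_error {
public:
    explicit TracedError(const std::string& what)
        : std::runtime_error(what), frames_(64) {
        int n = backtrace(frames_.data(), static_cast<int>(frames_.size()));
        frames_.resize(n > 0 ? static_cast<std::size_t>(n) : 0);
    }

    const std::vector<void*>& frames() const { return frames_; }

    // Print the message and the stored frames (module name plus address).
    void print(std::FILE* out) const {
        char** symbols =
            backtrace_symbols(frames_.data(), static_cast<int>(frames_.size()));
        std::fprintf(out, "%s\n", what());
        for (std::size_t i = 0; symbols && i < frames_.size(); ++i)
            std::fprintf(out, "  #%zu %s\n", i, symbols[i]);
        std::free(symbols);
    }

private:
    std::vector<void*> frames_;
};

// Bounds-checked access that throws the tracing exception.
int checked_get(const std::vector<int>& v, std::size_t i) {
    if (i >= v.size())
        throw TracedError("index " + std::to_string(i) +
                          " out of range (size " + std::to_string(v.size()) + ")");
    return v[i];
}

// Stand-in for a top-level handler: the trace is printed only if the
// exception got this far. Returns the number of stored frames.
std::size_t run_and_report() {
    try {
        std::vector<int> dude{10};
        checked_get(dude, 10);
    } catch (const TracedError& e) {
        e.print(stderr);
        return e.frames().size();
    }
    return 0;
}
```

The cost mentioned above is visible here: every construction of `TracedError` walks the stack, whether or not anyone ever prints the result. The printed addresses can be resolved to source lines with addr2line, as with the preloaded library.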

Wile_E_Coyote · March 4, 2018, 3:38pm

@axel Maybe you could incorporate Wim’s solution into CLING?

Axel · March 6, 2018, 10:31am

Yes that’s a known issue; e.g. C# (?) passes the stack trace with exceptions.

Forced pre-loading of ABI-specific exception handling code is fairly evil… but maybe I can inject this only if the user sets a certain flag. I have created https://sft.its.cern.ch/jira/browse/ROOT-9296 to keep track of this. Thanks for the code, Wim!

gwatts · March 6, 2018, 1:01pm

Thanks!!

C# does add the stack trace to the exception, indeed. But the C# behavior hadn’t entered my mind at all (I swear! I swear! ;-)) Most C++ code I run on Linux generates a stack trace on an uncaught exception. I guess the deal here is that the exception is actually caught by ROOT, and thus isn’t really a “crash” - and so the stack trace is lost (the catching was what I meant by ROOT doing extra work).

I have since fixed the source of the bug, but it took some time to isolate this crash from my framework. And, indeed, once I’d isolated it, had I seen the line of code that triggered the exception I probably would have had the bug in a few minutes.
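Short of a real stack trace, one low-tech way to get the offending line into the log is to put the call site into the exception message, since ROOT prints `what()` when it catches the exception. The sketch below swaps `vector::at` for a hypothetical helper `checked_at` and macro `CHECKED_AT`; neither exists in ROOT or the standard library, and `crash_message` is only a demonstration wrapper.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: behaves like vector::at, but the error message
// carries the file and line of the call site.
template <typename T>
const T& checked_at(const std::vector<T>& v, std::size_t i,
                    const char* file, int line) {
    if (i >= v.size()) {
        throw std::out_of_range(std::string(file) + ":" + std::to_string(line) +
                                ": index " + std::to_string(i) +
                                " >= size " + std::to_string(v.size()));
    }
    return v[i];
}

// The macro expands __FILE__ and __LINE__ where the access is written.
#define CHECKED_AT(vec, idx) checked_at((vec), (idx), __FILE__, __LINE__)

// Same access pattern as crash(), returning the message instead of crashing.
std::string crash_message() {
    std::vector<int> dude;
    dude.push_back(10);
    try {
        return std::to_string(CHECKED_AT(dude, 10));
    } catch (const std::out_of_range& e) {
        return e.what();
    }
}
```

The returned message starts with the file name and line of the `CHECKED_AT` call, followed by `index 10 >= size 1`. This only identifies the innermost access, not the callers above it, and it requires touching every access site.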

Axel · March 6, 2018, 1:24pm

Can you point me to a C++ program that does it? ATHENA I guess, thanks to Wim’s code - but otherwise they emit a mere

terminate called after throwing an instance of '...'
Aborted

But I totally understand how useful it is; we’ll see how to add it!
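For a standalone program, where the exception really is uncaught, a terminate handler can add a trace to that bare message. This is a sketch assuming glibc's `<execinfo.h>`; `trace_and_abort` and `install_trace_on_terminate` are hypothetical names. Whether the throwing frames are still on the stack when the handler runs is implementation-defined, and it does not help at the ROOT prompt, where the exception is caught before `std::terminate` is ever reached.

```cpp
#include <execinfo.h>  // backtrace(), backtrace_symbols_fd(): glibc-specific

#include <cstdlib>
#include <exception>

// Terminate handler: dump the raw stack to stderr, then abort as usual.
void trace_and_abort() {
    void* frames[64];
    int n = backtrace(frames, 64);
    backtrace_symbols_fd(frames, n, 2);  // file descriptor 2 is stderr
    std::abort();
}

// Install the handler; returns the previously installed one so that the
// caller can restore it later if needed.
std::terminate_handler install_trace_on_terminate() {
    return std::set_terminate(trace_and_abort);
}
```

Calling `install_trace_on_terminate()` once at the start of `main` is enough; the addresses it prints can then be passed to addr2line.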

sbinet · March 6, 2018, 1:53pm

actually, IIRC (and if this hasn’t changed since I left Atlas), the stack trace happy Atlassians used to get was provided by good ol’ Seal from LCG:
