I have been working on a data acquisition program built on ROOT, in which a tight loop reads data and updates histograms on several canvases. To keep ROOT responsive during this time, I added a call to TSystem::ProcessEvents() in the loop. This seems to work fine, but I am now experiencing intermittent segfaults related to the handling of mouse events in TCanvas/TPad.
I don’t have a minimal reproducing example since the program I’m running is somewhat extensive, but I have done some debugging and will describe what I think is going on.
The following crash happens at random when I move the mouse over the canvases. It is reproducible in the sense that if I move the mouse continuously, the crash happens within half a minute or so; if the mouse is not touched, the program finishes without problems:
==30404== Invalid read of size 8
==30404== at 0x1AEB78D3: THistPainter::ExecuteEvent(int, int, int) (in /usr/local/share/root/6.02.08/lib/libHistPainter.so)
==30404== by 0x6F44FD7: TCanvas::EnterLeave(TPad*, TObject*) (in /usr/local/share/root/6.02.08/lib/libGpad.so)
==30404== by 0x6F4549F: TCanvas::HandleInput(EEventType, int, int) (in /usr/local/share/root/6.02.08/lib/libGpad.so)
==30404== by 0x526B32A: TRootCanvas::HandleContainerMotion(Event_t*) (in /usr/local/share/root/6.02.08/lib/libGui.so)
==30404== by 0x527B610: TGFrame::HandleEvent(Event_t*) (in /usr/local/share/root/6.02.08/lib/libGui.so)
==30404== by 0x5322F87: TGClient::HandleEvent(Event_t*) (in /usr/local/share/root/6.02.08/lib/libGui.so)
==30404== by 0x532323C: TGClient::ProcessOneEvent() (in /usr/local/share/root/6.02.08/lib/libGui.so)
==30404== by 0x532329C: TGClient::HandleInput() (in /usr/local/share/root/6.02.08/lib/libGui.so)
==30404== by 0x5921E67: TUnixSystem::DispatchOneEvent(bool) (in /usr/local/share/root/6.02.08/lib/libCore.so)
==30404== by 0x58AC07B: TSystem::ProcessEvents() (in /usr/local/share/root/6.02.08/lib/libCore.so)
==30404== by 0x428AAB: TEventStream::Go(long, char const*) (TEventStream.cpp:30)
==30404== by 0x4060F7B: ???
==30404== Address 0x0 is not stack'd, malloc'd or (recently) free'd
(Running ROOT 6.02/08 on openSUSE 13.2 (linux 3.16.7-21-desktop))
This is the offending line:
THistPainter.cxx:3240 if (!gPad->IsEditable()) return;
The segfault occurs because gPad is zero in THistPainter::ExecuteEvent(). It is set to zero in TCanvas::EnterLeave(TPad *prevSelPad, TObject *prevSelObj), which sets gPad to prevSelPad before sending a mouse event to the object prevSelObj. EnterLeave() is called from TCanvas::HandleInput() with the arguments TCanvas::fSelectedPad and TCanvas::fSelected. However, there seems to be no guarantee that fSelectedPad is nonzero when fSelected is set: TCanvas::SetSelected() sets fSelected without setting fSelectedPad.
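The suspected sequence can be modelled without ROOT. The following is only a minimal sketch in plain C++: MockPad, MockObject, MockCanvas and gMockPad are hypothetical stand-ins that I made up for illustration, not ROOT classes, and the logic reflects my reading of the behaviour described above rather than the actual ROOT sources.

```cpp
#include <cassert>

// Hypothetical stand-ins for TPad and TObject; not ROOT code.
struct MockPad {};
struct MockObject {};

// Plays the role of the global gPad pointer.
MockPad* gMockPad = nullptr;

// Hypothetical stand-in for TCanvas.
struct MockCanvas {
    MockObject* fSelected = nullptr;
    MockPad* fSelectedPad = nullptr;

    // Models the observation that SetSelected() updates only fSelected.
    void SetSelected(MockObject* obj) { fSelected = obj; }

    // Models EnterLeave(): the global pad is switched to prevSelPad before
    // the event is delivered to prevSelObj. Returns false when delivery
    // would dereference a null pad (the real code would crash there).
    bool EnterLeave(MockPad* prevSelPad, MockObject* prevSelObj) {
        gMockPad = prevSelPad;
        if (prevSelObj == nullptr) return true;  // nothing to notify
        return gMockPad != nullptr;
    }
};

// Selects an object without ever selecting a pad, then delivers an event.
bool ReproduceInconsistentState() {
    MockCanvas canvas;
    MockObject hist;
    canvas.SetSelected(&hist);  // fSelectedPad stays null
    return canvas.EnterLeave(canvas.fSelectedPad, canvas.fSelected);
}
```

In this model, ReproduceInconsistentState() returns false, which corresponds to the point where the real painter would dereference a null gPad.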
Here is another stack trace showing that the prevSelPad argument of TCanvas::EnterLeave is zero (note: I used a debug build of a somewhat older ROOT 5.34/02 here, since I happened to have that installed).
#0 0x00007fffebd51ad0 in THistPainter::ExecuteEvent (this=0x1302680, event=53, px=893, py=146)
#1 0x00007ffff4d3bc98 in TH1::ExecuteEvent (this=0x121c2e0, event=53, px=893, py=146)
#2 0x00007ffff4101a3c in TCanvas::EnterLeave (this=0x10c5ef0, prevSelPad=0x0, prevSelObj=0x121c2e0) at /usr/local/share/root/5.34.02-debug/graf2d/gpad/src/TCanvas.cxx:980
#3 0x00007ffff41022b7 in TCanvas::HandleInput (this=0x10c5ef0, event=kMouseMotion, px=893, py=146)
#4 0x00007ffff740a095 in TRootCanvas::HandleContainerMotion (this=0x10c6290, event=0x7ffffffdc260)
For histograms, the segfault can occur at least with the kMouseMotion and kMouseLeave events, but TCanvas::EnterLeave() handles more mouse events in a similar way (e.g. kButton1ShiftMotion). The condition at TCanvas.cxx:1161 seems to prevent the segfault in many cases, but not always.
I have only seen the crash while running the data handling loop, so it could be related to my code, or the loop may just trigger an otherwise rare code path. It would appear that adding checks in HandleInput(), so that ExecuteEvent() is not called when gPad would be set to zero, resolves the issue. Alternatively, could TCanvas::SetSelected() be modified to make sure fSelectedPad is set whenever fSelected is?
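To make the two options concrete, here is a rough sketch of what I have in mind, again as a plain C++ model rather than a patch. All names (MockPad, MockObject, MockCanvas, DeliverGuarded, SetSelectedWithPad) are hypothetical and do not exist in ROOT; the real fix would have to be made in TCanvas itself.

```cpp
#include <cassert>

// Hypothetical stand-ins for TPad and TObject; not ROOT code.
struct MockPad { bool editable = true; };
struct MockObject { int eventsReceived = 0; };

// Hypothetical stand-in for TCanvas.
struct MockCanvas {
    MockObject* fSelected = nullptr;
    MockPad* fSelectedPad = nullptr;

    // Option 1: guard the delivery. The event is dropped when either the
    // pad or the object is missing, so the pad is never dereferenced
    // while null.
    bool DeliverGuarded(MockPad* prevSelPad, MockObject* prevSelObj) {
        if (prevSelPad == nullptr || prevSelObj == nullptr) return false;
        if (!prevSelPad->editable) return false;  // models the IsEditable() check
        ++prevSelObj->eventsReceived;
        return true;
    }

    // Option 2: keep the invariant in the setter. The object and its pad
    // are either both set or both cleared.
    void SetSelectedWithPad(MockObject* obj, MockPad* currentPad) {
        const bool valid = (obj != nullptr && currentPad != nullptr);
        fSelected = valid ? obj : nullptr;
        fSelectedPad = valid ? currentPad : nullptr;
    }
};
```

Option 1 is purely defensive and silently drops the event, while option 2 removes the inconsistent state at its source. I do not know which of the two fits better with the rest of the TCanvas code.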