Crash in TGButton::HandleButton

ROOT Version: 6.24/02
Platform: MacOS
Compiler: clang 12.0.0

I’m writing a GUI using a class that inherits from TGMainFrame and creates different window layouts depending on the flow of the program. The transition from the first window to the second one works without any problems, but when the transition from the second window to the third ends I get the following crash:

[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)

[<unknown binary>] (no debug info)

[/opt/local/libexec/root6/lib/root/] TGButton::HandleButton(Event_t*) (no debug info)

[/opt/local/libexec/root6/lib/root/] TGFrame::HandleEvent(Event_t*) (no debug info)

[/opt/local/libexec/root6/lib/root/] TGClient::HandleEvent(Event_t*) (no debug info)

[/opt/local/libexec/root6/lib/root/] TGClient::ProcessOneEvent() (no debug info)

[/opt/local/libexec/root6/lib/root/] TGInputHandler::Notify() (no debug info)

[/opt/local/libexec/root6/lib/root/] TUnixSystem::DispatchOneEvent(bool) (no debug info)

[/opt/local/libexec/root6/lib/root/] TSystem::InnerLoop() (no debug info)

[/opt/local/libexec/root6/lib/root/] TSystem::Run() (no debug info)

[/opt/local/libexec/root6/lib/root/] TApplication::Run(bool) (no debug info)

[/opt/local/libexec/root6/lib/root/] TRint::Run(bool) (no debug info)

I can’t easily create a simple example of this, but maybe someone has an idea what could cause this? I had a careful look through the code to make sure that all elements of the second window are deleted and I do call the TGMainFrame::RemoveAll function to remove all elements from the old window, but maybe there is some other mistake that I’m overlooking? I’ve checked and removing the one explicit call to DispatchOneEvent doesn’t change anything.

I do see some hints of elements from the second screen on the third screen (the white text entry box from a TGNumberEntry shows at the edge of the window), even though these have been deleted, but I’m not sure whether this is a consequence of the crash rather than its cause …

I would suggest that you build ROOT in debug mode. It will output more information on where it crashes.
Then run a debugger like gdb or even better, an interactive debugger like QtCreator.

I tried that on a Linux machine (where I get the same error when running it normally), and running it through gdb gives the error “double free or corruption (!prev)” before I even hit that spot. Since that error comes from a line of code that just deletes previously created tabs, I simply commented it out and re-ran it in gdb. This makes the crash go away, so I can’t debug it with gdb the normal way.

Following a tip I found I instead started the program and then attached gdb to the running process. This gives me the following stack trace:

#0  0x00007f23f63afeb1 in TGButton::EmitSignals (this=0x9fc8e70, was=false) at /opt/cern/root/root-6.24.02/gui/gui/src/TGButton.cxx:351
#1  0x00007f23f63af82d in TGButton::SetState (this=0x9fc8e70, state=kButtonUp, emit=true) at /opt/cern/root/root-6.24.02/gui/gui/src/TGButton.cxx:217
#2  0x00007f23f63afc76 in TGButton::HandleButton (this=0x9fc8e70, event=0x7fffa73c3020) at /opt/cern/root/root-6.24.02/gui/gui/src/TGButton.cxx:316
#3  0x00007f23f641e402 in TGFrame::HandleEvent (this=0x9fc8e70, event=0x7fffa73c3020) at /opt/cern/root/root-6.24.02/gui/gui/src/TGFrame.cxx:521
#4  0x00007f23f63d3421 in TGClient::HandleEvent (this=0x358d250, event=0x7fffa73c3020) at /opt/cern/root/root-6.24.02/gui/gui/src/TGClient.cxx:845
#5  0x00007f23f63d2d2d in TGClient::ProcessOneEvent (this=0x358d250) at /opt/cern/root/root-6.24.02/gui/gui/src/TGClient.cxx:655
#6  0x00007f23f63d2ec0 in TGClient::HandleInput (this=0x358d250) at /opt/cern/root/root-6.24.02/gui/gui/src/TGClient.cxx:702
#7  0x00007f23f63d143a in TGInputHandler::Notify (this=0x35a85a0) at /opt/cern/root/root-6.24.02/gui/gui/src/TGClient.cxx:116
#8  0x00007f23f5cd272f in TUnixSystem::DispatchOneEvent (this=0x1473df0, pendingOnly=false) at /opt/cern/root/root-6.24.02/core/unix/src/TUnixSystem.cxx:1067
#9  0x00007f23f5bc1459 in TSystem::InnerLoop (this=0x1473df0) at /opt/cern/root/root-6.24.02/core/base/src/TSystem.cxx:404
#10 0x00007f23f5bc1207 in TSystem::Run (this=0x1473df0) at /opt/cern/root/root-6.24.02/core/base/src/TSystem.cxx:354
#11 0x00007f23f5b4ec78 in TApplication::Run (this=0x254c100, retrn=true) at /opt/cern/root/root-6.24.02/core/base/src/TApplication.cxx:1623
#12 0x00007f23f2da75fa in TRint::Run (this=0x254c100, retrn=true) at /opt/cern/root/root-6.24.02/core/rint/src/TRint.cxx:463
#13 0x00000000004021ea in main (argc=3, argv=0x7fffa73c54f8) at src/grsisort.cxx:81

Not sure what about the line
if ((was != now) && IsToggleButton()) Toggled(!now); // emit Toggled = was != now
triggers a crash, but at least I got a starting point for debugging this.

Could it also be a multithreading problem? Or maybe the button you click emits a signal that is connected to a slot which removes the button itself?

valgrind --suppressions=your_root_src_path/etc/valgrind-root.supp root.exe yourscript.cpp

You can also try with
--tool=helgrind --suppressions=/opt/root_src/etc/helgrind-root.supp

to see if you find something suspicious.

Otherwise, try to create a minimal reproducer, or explain clearly how you remove the current frame elements. Do you call SetDeepCleanup in the constructor? Sometimes closing a window deletes the elements recursively, so you do not need to call the destructor explicitly with delete.
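To illustrate the ownership point above, here is a minimal sketch in plain C++ (it does not use ROOT). `Frame`, `Child`, `cleanup` and `demo_single_owner` are hypothetical names invented for this example. The sketch only models the general rule that a container which owns its children deletes each of them exactly once, so an additional explicit `delete` on the same child would be a double free.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Counts destructor calls so the demo can check that each child dies once.
static int g_destroyed = 0;

// Hypothetical child widget; NOT a ROOT class.
struct Child {
    ~Child() { ++g_destroyed; }
};

// Hypothetical container that owns its children (a model of "deep cleanup").
class Frame {
public:
    // Takes ownership and hands back a non-owning pointer for later use.
    Child* add(std::unique_ptr<Child> c) {
        children_.push_back(std::move(c));
        return children_.back().get();
    }

    // Deletes every child exactly once.
    void cleanup() { children_.clear(); }

private:
    std::vector<std::unique_ptr<Child>> children_;
};

// Returns how many children were destroyed by the container alone.
int demo_single_owner() {
    g_destroyed = 0;
    Frame f;
    Child* a = f.add(std::make_unique<Child>());
    Child* b = f.add(std::make_unique<Child>());
    assert(a != b);
    f.cleanup();
    // Calling "delete a;" here would be a double free: the frame already did it.
    return g_destroyed;
}
```

Calling `demo_single_owner()` returns 2 because the two children are destroyed by the container and by nothing else. Whether a given ROOT frame owns its children in this way depends on how its cleanup mode is configured, which the sketch does not model.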

Another option is to try out the script tutorials/gui/guitest.C from the ROOT tutorials.
This one has a button to create and destroy new menus, which you might get inspiration from.

I’m not running multiple threads, but I might be doing something wrong when I try to delete the elements.

The flow of the program is:

  1. Create first window using myClass::AddFrame (inherited from TGMainFrame) with an array of TGLabel and TGComboBox and one TGButton (all created on heap here)
  2. When the button is clicked call myClass::RemoveAll (also inherited from TGMainFrame), and delete all elements created on heap.
  3. Create second window in the same way the first one was created, using different elements, among them a TGTab holding a TGTab.
  4. When this window is dismissed do the same thing as before, i.e. call myClass::RemoveAll, and then delete all elements created on heap (except for the tabs that were created, as deleting them causes a double-free for some reason)
  5. The third window is created with tabs and buttons; this triggers the crash

I tried checking which Button might have caused the crash using gdb, but I can’t really make sense of the output (I thought the name would be helpful, but it seems to be nonsense):

(gdb) p this->GetName()
Cannot access memory at address 0x646172676a05
(gdb) p this
$2 = (TGButton * const) 0x937cf90
(gdb) p this->fName
$3 = {_vptr.TString = 0xab112a0, fRep = {{fLong = {fCap = 112361616, fSize = 0, fData = 0xaaea540 "@Z\261\n"}, fShort = {fSize = 144 '\220', fData = "\200\262\006\000\000\000\000@\245\256\n\000\000\000"}, fRaw = {fWords = {112361616, 0, 
          179217728, 0}}}}, static kNPOS = -1, static fgIsA = {_M_b = {_M_p = 0x1890030}}}
(gdb) p this->fName.Data()
$4 = 0x937cfc9 "\200\262\006"
(gdb) p *this
$5 = {<TGFrame> = {<TGWindow> = {<TGObject> = {<TObject> = {_vptr.TObject = 0x64617267694d, fUniqueID = 0, fBits = 0, static fgDtorOnly = 0, static fgObjectStat = false, static fgIsA = {_M_b = {_M_p = 0x1737b40}}}, fId = 0, 
        fClient = 0x0, static fgIsA = {_M_b = {_M_p = 0x69dcad0}}}, fParent = 0x0, fNeedRedraw = 16, fName = {_vptr.TString = 0xab112a0, fRep = {{fLong = {fCap = 112361616, fSize = 0, fData = 0xaaea540 "@Z\261\n"}, fShort = {
              fSize = 144 '\220', fData = "\200\262\006\000\000\000\000@\245\256\n\000\000\000"}, fRaw = {fWords = {112361616, 0, 179217728, 0}}}}, static kNPOS = -1, static fgIsA = {_M_b = {_M_p = 0x1890030}}}, static fgCounter = 6648, 
      fEditDisabled = 179416336, static fgIsA = {_M_b = {_M_p = 0x69dc110}}}, <TQObject> = {_vptr.TQObject = 0xab15a40, fListOfSignals = 0x937d010, fListOfConnections = 0xab16bb0, fSignalsBlocked = 96, 
      static fgAllSignalsBlocked = false, static fgIsA = {_M_b = {_M_p = 0x24e1ec0}}}, fX = 0, fY = 179312048, fWidth = 0, fHeight = 385, fMinWidth = 0, fMinHeight = 0, fMaxWidth = 0, fMaxHeight = 8384528, fBorderWidth = 0, 
    fOptions = 1, fBackground = 179129680, fEventMask = 179129704, fDNDState = 0, fFE = 0xaad4d68, static fgInit = true, static fgDefaultFrameBackground = 12632256, static fgDefaultSelectedBackground = 128, 
    static fgWhitePixel = 16777215, static fgBlackPixel = 0, static fgBlackGC = 0x24c1440, static fgWhiteGC = 0x24c11a0, static fgHilightGC = 0x29a9500, static fgShadowGC = 0x29f75b0, static fgBckgndGC = 0x29f7910, 
    static fgLastClick = 408662520, static fgLastButton = 1, static fgDbx = 549, static fgDby = 597, static fgDbw = 12588770, static fgUserColor = 0, static fgIsA = {_M_b = {_M_p = 0x3218000}}}, <TGWidget> = {
    _vptr.TGWidget = 0x1300000013, fWidgetId = 1, fWidgetFlags = 0, fMsgWindow = 0xab15ba0, fCommand = {_vptr.TString = 0xab15bb8, fRep = {{fLong = {fCap = 179395512, fSize = 0, fData = 0xaaf6ab0 "0\a\257\n"}, fShort = {
            fSize = 184 '\270', fData = "[\261\n\000\000\000\000\260j\257\n\000\000\000"}, fRaw = {fWords = {179395512, 0, 179268272, 0}}}}, static kNPOS = -1, static fgIsA = {_M_b = {_M_p = 0x1890030}}}, static fgIsA = {_M_b = {
        _M_p = 0x6a2bb90}}}, fTWidth = 179268280, fTHeight = 0, fState = 179268280, fStayDown = false, fNormGC = 179395456, fUserData = 0x100000002, fTip = 0x0, fGroup = 0x40ce1042d3680bde, fBgndColor = 4741691291151532530, 
  fHighColor = 179415600, fStyle = 179415752, static fgDefaultGC = 0x29f6010, static fgHibckgndGC = 0x0, static fgReleaseBtn = 0, static fgIsA = {_M_b = {_M_p = 0x6afa620}}}

I also tried putting a cout statement in the TGButton::EmitSignals function to print the name of the button, but that simply causes the crash to happen when accessing the name:

#10 0x00007f235fc96b7e in tobuf (buf=<error reading variable>, x=10 '\n') at /opt/cern/root/root-6.24.02/core/base/inc/Bytes.h:63

It also gives some new error messages

input_line_99:2:2: error: source file is not valid UTF-8

I found one stupid error with valgrind (using bin indices to fill a vector without correcting for counting from 1 vs. counting from 0), but fixing that didn’t solve the issue itself.
I fell back on using cout statements to print the address of every button created and also put one in the TGButton::EmitSignals function. This showed that the button that causes the crash is one that was deleted after the second layout was done. The button is part of a TGHButtonGroup and gets deleted after it was clicked. I don’t know what function tries to call this button after it was deleted. I tried simply commenting out the part of the code where the buttons get deleted, but that leads to a new error:
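The address-printing approach described above can be made a bit more systematic. The following is a minimal sketch in plain C++ (no ROOT involved); `LiveRegistry`, `TrackedWidget` and `demo_detects_stale_pointer` are hypothetical names made up for this illustration. Each widget records its address when constructed and removes it when destroyed, so a suspicious pointer can be checked against the set of live addresses without dereferencing it.

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical debugging aid: remembers which widget addresses are alive.
class LiveRegistry {
public:
    void add(const void* p) { live_.insert(p); }
    void remove(const void* p) { live_.erase(p); }
    bool is_alive(const void* p) const { return live_.count(p) != 0; }

private:
    std::unordered_set<const void*> live_;
};

// Hypothetical widget that registers itself on construction and
// unregisters on destruction; NOT a ROOT class.
struct TrackedWidget {
    explicit TrackedWidget(LiveRegistry& r) : reg(r) { reg.add(this); }
    ~TrackedWidget() { reg.remove(this); }
    TrackedWidget(const TrackedWidget&) = delete;
    TrackedWidget& operator=(const TrackedWidget&) = delete;

    LiveRegistry& reg;
};

// Returns true if the address of a deleted widget is reported as dead.
bool demo_detects_stale_pointer() {
    LiveRegistry reg;
    TrackedWidget* w = new TrackedWidget(reg);
    const void* addr = w;          // the value one would otherwise print with cout
    assert(reg.is_alive(addr));
    delete w;
    return !reg.is_alive(addr);    // only the address is compared, never dereferenced
}
```

One limitation: if a new widget happens to be allocated at the same address as a deleted one, the registry reports that address as alive again, so a check like this can miss a stale pointer in that case.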

Error in <RootX11ErrorHandler>: BadWindow (invalid Window parameter) (XID: 194163456, XREQ: 25)

I think I will try a different route and instead of having a class that inherits from TGMainFrame, make one that has multiple members that are TGMainFrames.

You cannot delete the signal emitter while handling the slot, since the slot method will return to the signal emitter, and if it has been deleted in the meanwhile you get a crash…
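The rule above can be sketched in plain C++ without ROOT. Everything in this example (`Button`, `Window`, `process_click`, `switch_to`, the "graveyard" list) is hypothetical and is not taken from the attached scripts. The idea shown is deferred deletion: the slot only retires the old widget, and the actual delete happens once the emitter's handler has returned. In ROOT itself a comparable deferral is sometimes done with a zero-delay timer such as `TTimer::SingleShot`, but that is an assumption about one possible approach, not a description of the solution posted in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GUI button; this is NOT ROOT's TGButton.
struct Button {
    std::string name;
    std::function<void()> on_click;   // the "slot" connected to the click "signal"
    int clicks = 0;

    explicit Button(std::string n) : name(std::move(n)) {}

    // Emitter: calls the slot, then keeps using *this after the slot returns.
    void click() {
        if (on_click) on_click();
        ++clicks;                     // would touch freed memory if the slot had deleted us
    }
};

// Hypothetical window that changes layout without deleting widgets inside a slot.
class Window {
public:
    Window() {
        current_ = std::make_unique<Button>("start");
        current_->on_click = [this] { switch_to("next"); };
    }

    // One event-loop iteration: deliver a click, then run the deferred cleanup.
    void process_click() {
        if (current_) current_->click();
        graveyard_.clear();           // safe: no widget code is on the call stack any more
    }

    const std::string& current_name() const { return current_->name; }
    std::size_t pending_deletes() const { return graveyard_.size(); }

private:
    // Called from inside a slot: retire the old widget instead of deleting it.
    void switch_to(const std::string& name) {
        graveyard_.push_back(std::move(current_));   // stays alive until process_click() finishes
        current_ = std::make_unique<Button>(name);
    }

    std::unique_ptr<Button> current_;
    std::vector<std::unique_ptr<Button>> graveyard_;
};

// Drives one click and returns the name of the widget shown afterwards.
std::string demo_deferred_delete() {
    Window w;
    w.process_click();
    assert(w.pending_deletes() == 0);  // the old button was freed after its handler returned
    return w.current_name();
}
```

Here `demo_deferred_delete()` returns "next": the "start" button survives until its own `click()` has finished and is destroyed only afterwards, in `process_click()`.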

That makes sense. But since not deleting them also leads to an error, when and how is the appropriate way of deleting the signal emitter? I’ve looked in the guitest.C example and the WritingGUI.html page, but I couldn’t find anything related to that.

The way I’ve been handling this so far is that I have a TGHButtonGroup that gets connected to a function. That function handles everything that should happen when a button is clicked. To me that did include removing all elements of the window and creating a new layout, but since I shouldn’t do that, how do I trigger deleting all the old elements?

It depends on whether (and how) you use the TGFrame::SetCleanup() method, and how you delete the widgets if you do the cleanup yourself…

I’m not using the TGFrame::SetCleanup() method at all; what does it do? It isn’t mentioned anywhere in the WritingGUI.html page, and the documentation says it’s dangerous to use? Also, the way it’s described, it only deletes things in the destructor, so if I wanted to change the layout of the window without destroying it, calling it wouldn’t affect anything, or do I misunderstand how this works?

Let’s say I have a class that inherits from TGMainFrame, and when you create an instance of this class it just shows a window with the button “start”. When you click the button it changes the window to instead show a histogram (or whatever else you would want). How would I delete that button? I can’t delete it in the function that gets called by clicking the button, as you said, nor can I delete it in any of the functions I would call to create the new layout, as those would be called from the connected function. And since I’m not destroying the main class, the SetCleanup wouldn’t affect anything, right?

Right, in this case SetCleanup() would be useless. Could you provide a simple script reproducing the problem, so we have a basis to start from?

I’ve tried reproducing the problem in a simple script but it just doesn’t happen in that case. I can delete the button without issues, but that might be because I have only one single button per layout, so the buttons always get created at the exact same memory address. I tried adding a TGLabel in between to force the button to be created at a different memory address, but that doesn’t seem to work.

Okay, I can change that by deleting the old button only after creating the new one (duh!). So now I have a simple script to recreate the problem.

guitest.C (1.9 KB)

Here is a possible solution
guitest.C (2.3 KB)

That’s an interesting solution. I implemented it in my code and the issue went away. Thanks for the help!

