Weird TChain bug

Hi,

I attached code that compiles but crashes when I execute it. If you look at the code, you will see it contains 2 identical blocks. However, when I run the code, the first block works while the second block crashes.

I think the problem comes from the fact that I rename my TChain objects. Why?

I’m using root 6.12.06

bug2.cc (865 Bytes)

Hi @Adrian,

What exactly is the error that you see? Can you share your file so I can reproduce it?

Cheers,

Enric

Hi,

The ROOT file is too big to be posted on the forum. Here is a link where you can download it: https://owncloud.lal.in2p3.fr/index.php/s/yIfRyDqpuf3VaoH

Here is the error message I get:

start first block
end first block

start second block

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f1d34ab0dbc in waitpid () from /lib64/libc.so.6
#1  0x00007f1d34a33cc2 in do_system () from /lib64/libc.so.6
#2  0x00007f1d38de9dff in TUnixSystem::StackTrace (this=0x12cb980) at ../core/unix/src/TUnixSystem.cxx:2412
#3  0x00007f1d38dec52c in TUnixSystem::DispatchSignals (this=0x12cb980, sig=kSigSegmentationViolation) at ../core/unix/src/TUnixSystem.cxx:3643
#4  <signal handler called>
#5  0xfffffffffffffff8 in ?? ()
#6  0x00007f1d38d6078d in TList::FindObject (this=<optimized out>, obj=0x2fcb9a0) at ../core/cont/src/TList.cxx:614
#7  0x00007f1d38d5e07d in THashList::RecursiveRemove (this=0x12e8ce0, obj=0x2fcb9a0) at ../core/cont/src/THashList.cxx:328
#8  0x00007f1d38c54343 in TROOT::RecursiveRemove (this=0x7f1d390eee00 <ROOT::Internal::GetROOT1()::alloc>, obj=<optimized out>) at ../core/base/src/TROOT.cxx:2440
#9  0x00007f1d38cfd2a2 in CallRecursiveRemoveIfNeeded (obj=...) at include/TROOT.h:387
#10 TNamed::~TNamed (this=0x2fcb9a0, __in_chrg=<optimized out>) at ../core/base/src/TNamed.cxx:45
#11 0x00007f1d370e98a9 in TTree::~TTree (this=0x2fcb9a0, __in_chrg=<optimized out>) at ../tree/tree/src/TTree.cxx:958
#12 0x00007f1d38d6530f in TList::Delete (this=this@entry=0x2d03030, option=<optimized out>, option@entry=0x7f1d386742d9 "") at ../core/cont/src/TList.cxx:534
#13 0x00007f1d38d5e68c in THashList::Delete (this=0x2d03030, option=<optimized out>) at ../core/cont/src/THashList.cxx:215
#14 0x00007f1d38504649 in TDirectoryFile::Close (this=0x2d65af0, option=<optimized out>) at ../io/io/src/TDirectoryFile.cxx:577
#15 0x00007f1d3851e50c in TFile::Close (this=this@entry=0x2d65af0, option=option@entry=0x7f1d386742d9 "") at ../io/io/src/TFile.cxx:953
#16 0x00007f1d3851e8f1 in TFile::~TFile (this=0x2d65af0, __in_chrg=<optimized out>) at ../io/io/src/TFile.cxx:547
#17 0x00007f1d3851eb59 in TFile::~TFile (this=0x2d65af0, __in_chrg=<optimized out>) at ../io/io/src/TFile.cxx:584
#18 0x00007f1d370b960c in TChain::~TChain (this=0x2fbc270, __in_chrg=<optimized out>) at ../tree/tree/src/TChain.cxx:199
#19 0x00007f1d370b9799 in TChain::~TChain (this=0x2fbc270, __in_chrg=<optimized out>) at ../tree/tree/src/TChain.cxx:215
#20 0x000000000040125c in main ()
===========================================================

Your “link” shows hundreds of files but no “.root”.

oops! sorry, wrong link. Here is the correct link: https://owncloud.lal.in2p3.fr/index.php/s/NsckgmgsRLfNdsQ

Hi @Adrian,

It looks like the two deletes are trying to remove the same object from an internal list, and the second time it fails. This looks like a bug, but I will let @pcanal comment on it.

If you allocate both TChain on the stack instead of on the heap, the program finishes with no error.

Cheers,

Enric

The problem does NOT appear in ROOT 5.34 (I tried a fairly new “v5-34-00-patches” branch).
The problem is visible in ROOT 6, however.
I tried 6.12/06 and 6.13/02:

`root-config --cxx --cflags` -g bug2.cc `root-config --libs`
valgrind --tool=memcheck --suppressions=`root-config --etcdir`/valgrind-root.supp ./a.out

but the valgrind step produces a 170 kB long list of ROOT-related errors / warnings (a usual problem with recent ROOT 6, about which I have complained many times), so it’s not really possible to “debug” it.

But I may have found a “brutal fix”:

#include "TApplication.h"
#include "TChain.h"
#include <iostream>

int main (int /*argc*/, char** /*argv*/) {
  TApplication a("a", 0, 0); // just to make sure that the autoloading of ROOT libraries works

  TChain *TC;
  // ...

Right!

I forgot to mention that this problem appeared with recent versions of ROOT.

Thank you for your help

I tried the “brutal fix” of Wile_E_Coyote. It does not help.

May I expect a bug fix for future versions of ROOT? Or should I try to find some work-around? I think I can make it work if I don’t rename the TChain objects.

Thank you

Try with:

TApplication *a = new TApplication("a", 0, 0);

Unfortunately it does not help. My code crashes, just like before.

Just to make it clear … do you mean that the “bug2.cc”, modified and compiled as shown in one of my previous posts, still crashes?

Yes,

when I execute ./a.out, it crashes. I used the following compilation command:
`root-config --cxx --cflags` -g bug2.cc `root-config --libs`

I guess @pcanal will need exact versions of your operating system and compiler.

$ uname -a
Linux olserver141.virgo.infn.it 3.10.0-693.21.1.el7.x86_64 #1 SMP Wed Mar 7 19:03:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16)

Thank you

I can confirm that my “brutal fix” (utilizing TApplication) seems to work with ROOT 6.13/02 but it still crashes with ROOT 6.12/06.

While I am attempting to reproduce this problem, a random question: why (what is the advantage of) renaming the TChain?

Indeed the problem is linked to changing the name.

TChain objects are recorded in 3 hash lists, including the ListOfCleanups. Those hash lists base their lookup on the name. However, TChain is missing the overload of SetName that would inform those lists of the change.
When the TChain then attempts to deregister itself from the lists, the lookup fails (the hash is different), and hence subsequent use of the lists (in particular the ListOfCleanups) leads to random behavior. This will be fixed shortly.

If the ‘easy’ workaround (don’t call SetName on the chain) is not practical you can use:

gROOT->GetListOfCleanups()->Remove(TC);
TC->SetName( newname );
gROOT->GetListOfCleanups()->Add(TC);
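
To illustrate why the order remove / rename / add matters, here is a minimal toy model in plain C++. It does not use ROOT and is not how THashList is implemented; the names Named, NameIndexedList, RemoveAfterNaiveRename and RemoveAfterGuardedRename are hypothetical and exist only for this sketch. The one assumption is the point made above: the container finds an object through its current name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy stand-in for a named object (an assumption for illustration, not ROOT's TNamed).
struct Named {
    std::string name;
};

// Toy registry that indexes objects by the name they had when they were added.
class NameIndexedList {
public:
    void Add(Named* obj) { index_.emplace(obj->name, obj); }

    // Looks the object up through its *current* name, then erases that entry.
    bool Remove(Named* obj) {
        auto range = index_.equal_range(obj->name);
        for (auto it = range.first; it != range.second; ++it) {
            if (it->second == obj) {
                index_.erase(it);
                return true;
            }
        }
        return false;  // not found under the current name
    }

    std::size_t Size() const { return index_.size(); }

private:
    std::unordered_multimap<std::string, Named*> index_;
};

// Rename behind the registry's back, then try to deregister.
bool RemoveAfterNaiveRename() {
    NameIndexedList list;
    Named chain{"events"};
    list.Add(&chain);
    chain.name = "renamed";              // the registry still files it under "events"
    bool removed = list.Remove(&chain);  // lookup under "renamed" finds nothing
    assert(list.Size() == (removed ? 0u : 1u));
    return removed;
}

// Same rename, but wrapped in Remove/Add as in the workaround.
bool RemoveAfterGuardedRename() {
    NameIndexedList list;
    Named chain{"events"};
    list.Add(&chain);
    list.Remove(&chain);        // deregister under the old name
    chain.name = "renamed";
    list.Add(&chain);           // register again under the new name
    return list.Remove(&chain); // lookup under "renamed" succeeds
}
```

In this toy model RemoveAfterNaiveRename() returns false and leaves a stale entry behind, while RemoveAfterGuardedRename() returns true. In the real case the stale entry would presumably outlive the deleted TChain, which is consistent with the crash inside TList::FindObject in the stack trace.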

Cheers,
Philippe.

See https://github.com/root-project/root/pull/1983

Thank you Philippe. This is very helpful.

I tested your workaround and it fixed the problem. I will also install the latest root and test your bug fix.

Cheers