Segfault when processing large files with a TChain

Hi everyone,

I have one tool that produces ntuples that vary wildly in size (~10 MB to 10 GB). These are then processed with another tool, which reads the input files into a TChain. If I pass a single file of 2.8 GB, it crashes right at the beginning with this error message:

added output_RunRandS_3499912_37.root
processing all 38967226 events.
Error in <TFile::ReadBuffer>: error reading all requested bytes from file output_RunRandS_3499912_37.root, got 253276 of 304624
Error in <TBranchElement::GetBasket>: File: output_RunRandS_3499912_37.root at byte:258, branch:JetPt, entry:761, badread=1, nerrors=1, basketnumber=1

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
    import os
#0  0x0000003e7e4ac89e in waitpid () from /lib64/libc.so.6
#1  0x0000003e7e43e4e9 in do_system () from /lib64/libc.so.6
#2  0x00007fa1de96febd in TUnixSystem::StackTrace() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so
#3  0x00007fa1de972624 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so
#4  <signal handler called>
#5  0x0000000000431cef in TTbarPrediction::ReadJetsSimplified() ()
#6  0x00000000004372bd in TTbarPrediction::Process(long long) ()
#7  0x00007fa1dc46d360 in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#8  0x000000000041336b in main ()
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x0000000000431cef in TTbarPrediction::ReadJetsSimplified() ()
#6  0x00000000004372bd in TTbarPrediction::Process(long long) ()
#7  0x00007fa1dc46d360 in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#8  0x000000000041336b in main ()
===========================================================

The next smaller file I have is 2.3 GB; with that one the tool runs fine.

However, I can open the 2.8 GB file and draw JetPt without any trouble in the ROOT interpreter:

% root output_RunRandS_3499912_37.root
root [0] 
Attaching file output_RunRandS_3499912_37.root as _file0...
(TFile *) 0x1ba8ea0
root [1] PredictionTree->Draw("JetPt")
Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1

The files in question are

-rw-r--r-- 1 jndrf af-atlas 2289399998 18. Dez 17:03 output_RunRandS_3499912_36.root #working
-rw-r--r-- 1 jndrf af-atlas 2910035906 18. Dez 17:07 output_RunRandS_3499912_37.root #segfaulting

Why does this happen, and how can I fix it? One workaround I can think of is to rearrange everything into files of about 2 GB, but is there a tool for that? As far as I know, hadd cannot split its output into several files.
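For the splitting idea, the entry ranges per output file can be worked out before touching ROOT at all. The following is only a minimal sketch: planChunks is a hypothetical helper, not part of ROOT, and it assumes every entry takes about the same number of bytes on disk, which is only approximately true for compressed trees.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not a ROOT function).
// Splits [0, nEntries) into consecutive half-open ranges [first, last) so that
// each range holds roughly targetBytes of data, ASSUMING uniform entry size.
std::vector<std::pair<std::int64_t, std::int64_t>>
planChunks(std::int64_t fileBytes, std::int64_t nEntries, std::int64_t targetBytes)
{
   std::vector<std::pair<std::int64_t, std::int64_t>> chunks;
   if (fileBytes <= 0 || nEntries <= 0 || targetBytes <= 0) {
      return chunks; // nothing sensible to plan
   }
   // Number of output files, rounded up.
   const std::int64_t nChunks = (fileBytes + targetBytes - 1) / targetBytes;
   // Entries per output file, rounded up so that every entry is covered.
   const std::int64_t perChunk = (nEntries + nChunks - 1) / nChunks;
   for (std::int64_t first = 0; first < nEntries; first += perChunk) {
      const std::int64_t last = std::min(first + perChunk, nEntries);
      chunks.emplace_back(first, last);
   }
   return chunks;
}
```

With the numbers from the post (2910035906 bytes, 38967226 entries) and a 2000000000-byte target, this yields two ranges of 19483613 entries each. Each range could then be copied into its own file, for example with TTree::CopyTree, which takes an entry count and a first entry. Whether such a copy succeeds on a file that already shows read errors is a separate question.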

I have also opened all files in PyROOT, and IsZombie() returned false for every one of them.
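Note that IsZombie() only reports whether opening the file failed; it does not read the baskets. The first error above is a short read (253276 of 304624 requested bytes), so a raw read test outside ROOT, run on the machine where the job crashes, might help separate a storage problem from a ROOT problem. The sketch below is an assumption about how such a check could look: bytesReadable is a hypothetical helper, and the offset and length to probe have to be chosen by hand.

```cpp
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper (not a ROOT function).
// Tries to read `length` bytes starting at byte `offset` of `path`.
// Returns the number of bytes actually obtained, or -1 if the file
// cannot be opened at all.
std::int64_t bytesReadable(const std::string &path, std::int64_t offset, std::int64_t length)
{
   std::ifstream in(path, std::ios::binary);
   if (!in) {
      return -1;
   }
   if (offset < 0 || length <= 0) {
      return 0; // nothing to read
   }
   in.seekg(offset, std::ios::beg);
   if (!in) {
      return 0; // seek failed
   }
   std::vector<char> buffer(static_cast<std::size_t>(length));
   in.read(buffer.data(), length);
   // gcount() is the number of bytes the last read really delivered.
   return static_cast<std::int64_t>(in.gcount());
}
```

If the return value is smaller than the requested length for a region that lies well inside the file size reported by ls, the problem is probably below ROOT, for example in the file system. If the raw read is complete, the file contents themselves are the more likely suspect.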

In case it is relevant, the code below shows where I create the TChain. Later, I call TChain::Process to run my TSelector-derived analyzer over the chain.

TChain creation
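Runner::Runner(std::vector<std::string> inlist, bool verbose /*true*/, std::string tn /*"EventTree"*/)
{
   // Refuse to build an empty chain.
   if (inlist.empty()) {
      throw std::invalid_argument("no input files provided");
   }

   // tn is the name of the tree inside each input file.
   chain = std::make_unique<TChain>(tn.c_str());

   // Add every input file to the chain; bind by const reference to avoid copying each string.
   for (const auto &file : inlist) {
      chain->Add(file.c_str());
      std::cout << "added " << file << std::endl;
   }

   // In quiet mode, suppress ROOT messages below warning level.
   if (!verbose) {
      gErrorIgnoreLevel = kWarning;
   }
}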

With an even larger file of 9 GB I get a different backtrace, but opening the file in the ROOT interpreter still works.

backtrace
added output_RunRandS_3499912_48.root
processing all 114710146 events.

 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
    import os
#0  0x0000003e7e4ac89e in waitpid () from /lib64/libc.so.6
#1  0x0000003e7e43e4e9 in do_system () from /lib64/libc.so.6
#2  0x00007f19b9355ebd in TUnixSystem::StackTrace() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so
#3  0x00007f19b9358624 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so
#4  <signal handler called>
#5  0x0000003e7e489a40 in memcpy () from /lib64/libc.so.6
#6  0x00007f19b895bcad in TFile::ReadBuffers(char*, long long*, int*, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#7  0x00007f19b896e55f in TFileCacheRead::ReadBufferExtNormal(char*, long long, int, int&) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#8  0x00007f19b896df0a in TFileCacheRead::ReadBuffer(char*, long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#9  0x00007f19b71f7fba in TTreeCache::ReadBufferNormal(char*, long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#10 0x00007f19b71766c3 in TBasket::ReadBasketBuffers(long long, int, TFile*) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#11 0x00007f19b717c6ad in TBranch::GetBasket(int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#12 0x00007f19b717cd82 in TBranch::GetEntry(long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#13 0x00007f19b7192803 in TBranchElement::GetEntry(long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#14 0x00007f19b6e90d26 in ROOT::Internal::TTreeReaderValueBase::ProxyRead() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#15 0x00007f19b6e91c19 in ROOT::Internal::TTreeReaderValueBase::GetAddress() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#16 0x0000000000431cdf in TTbarPrediction::ReadJetsSimplified() ()
#17 0x00000000004372bd in TTbarPrediction::Process(long long) ()
#18 0x00007f19b6e53360 in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#19 0x000000000041336b in main ()
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x0000003e7e489a40 in memcpy () from /lib64/libc.so.6
#6  0x00007f19b895bcad in TFile::ReadBuffers(char*, long long*, int*, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#7  0x00007f19b896e55f in TFileCacheRead::ReadBufferExtNormal(char*, long long, int, int&) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#8  0x00007f19b896df0a in TFileCacheRead::ReadBuffer(char*, long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so
#9  0x00007f19b71f7fba in TTreeCache::ReadBufferNormal(char*, long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#10 0x00007f19b71766c3 in TBasket::ReadBasketBuffers(long long, int, TFile*) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#11 0x00007f19b717c6ad in TBranch::GetBasket(int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#12 0x00007f19b717cd82 in TBranch::GetEntry(long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#13 0x00007f19b7192803 in TBranchElement::GetEntry(long long, int) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so
#14 0x00007f19b6e90d26 in ROOT::Internal::TTreeReaderValueBase::ProxyRead() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#15 0x00007f19b6e91c19 in ROOT::Internal::TTreeReaderValueBase::GetAddress() () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#16 0x0000000000431cdf in TTbarPrediction::ReadJetsSimplified() ()
#17 0x00000000004372bd in TTbarPrediction::Process(long long) ()
#18 0x00007f19b6e53360 in TTreePlayer::Process(TSelector*, char const*, long long, long long) () from /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so
#19 0x000000000041336b in main ()
===========================================================


*** glibc detected *** ../RunTTbarPrediction: munmap_chunk(): invalid pointer: 0x00007f1985aa0010 ***
======= Backtrace: =========
/lib64/libc.so.6[0x3e7e475e5e]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so(_ZN7TBufferD1Ev+0x26)[0x7f19b9214ea6]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so(_ZN11TBufferFileD0Ev+0x9)[0x7f19b890eaa9]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN7TBranchD1Ev+0x270)[0x7f19b717f6a0]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN14TBranchElementD0Ev+0x9)[0x7f19b71900d9]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so(_ZN9TObjArray6DeleteEPKc+0x64)[0x7f19b92c42b4]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN5TTreeD1Ev+0x138)[0x7f19b71e3db8]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN5TTreeD0Ev+0x9)[0x7f19b71e4159]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so(_ZN9THashList6DeleteEPKc+0x295)[0x7f19b92b3895]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so(_ZN5TROOT20EndOfProcessCleanupsEv+0x99)[0x7f19b91814e9]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libCore.so(_ZN11TUnixSystem15DispatchSignalsE8ESignals+0x222)[0x7f19b9358742]
/lib64/libpthread.so.0[0x3e7ec0f7e0]
/lib64/libc.so.6(memcpy+0x320)[0x3e7e489a40]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so(_ZN5TFile11ReadBuffersEPcPxPii+0x36d)[0x7f19b895bcad]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so(_ZN14TFileCacheRead19ReadBufferExtNormalEPcxiRi+0x1ff)[0x7f19b896e55f]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libRIO.so(_ZN14TFileCacheRead10ReadBufferEPcxi+0xaa)[0x7f19b896df0a]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN10TTreeCache16ReadBufferNormalEPcxi+0x4a)[0x7f19b71f7fba]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN7TBasket17ReadBasketBuffersExiP5TFile+0x233)[0x7f19b71766c3]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN7TBranch9GetBasketEi+0x2ad)[0x7f19b717c6ad]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN7TBranch8GetEntryExi+0x342)[0x7f19b717cd82]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTree.so(_ZN14TBranchElement8GetEntryExi+0x1f3)[0x7f19b7192803]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so(_ZN4ROOT8Internal20TTreeReaderValueBase9ProxyReadEv+0x4e6)[0x7f19b6e90d26]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so(_ZN4ROOT8Internal20TTreeReaderValueBase10GetAddressEv+0x9)[0x7f19b6e91c19]
../RunTTbarPrediction(_ZN15TTbarPrediction18ReadJetsSimplifiedEv+0x4f)[0x431cdf]
../RunTTbarPrediction(_ZN15TTbarPrediction7ProcessEx+0x9d)[0x4372bd]
/cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/lib/libTreePlayer.so(_ZN11TTreePlayer7ProcessEP9TSelectorPKcxx+0x360)[0x7f19b6e53360]
../RunTTbarPrediction(main+0xc9b)[0x41336b]
/lib64/libc.so.6(__libc_start_main+0x100)[0x3e7e41ed20]
../RunTTbarPrediction[0x413c71]
======= Memory map: ========
00400000-00445000 r-xp 00000000 00:14 4868557556                         /nfs/dust/atlas/user/neundorf/RnS/improvements/RunTTbarPrediction
00645000-00646000 r--p 00045000 00:14 4868557556                         /nfs/dust/atlas/user/neundorf/RnS/improvements/RunTTbarPrediction
00646000-00647000 rw-p 00046000 00:14 4868557556                         /nfs/dust/atlas/user/neundorf/RnS/improvements/RunTTbarPrediction
00919000-054de000 rw-p 00000000 00:00 0                                  [heap]
34f9e00000-34f9e03000 r-xp 00000000 08:05 914288                         /lib64/libcom_err.so.2.1
34f9e03000-34fa002000 ---p 00003000 08:05 914288                         /lib64/libcom_err.so.2.1
34fa002000-34fa003000 r--p 00002000 08:05 914288                         /lib64/libcom_err.so.2.1
34fa003000-34fa004000 rw-p 00003000 08:05 914288                         /lib64/libcom_err.so.2.1
34fa200000-34fa2dc000 r-xp 00000000 08:05 914342                         /lib64/libkrb5.so.3.3
34fa2dc000-34fa4db000 ---p 000dc000 08:05 914342                         /lib64/libkrb5.so.3.3
34fa4db000-34fa4e5000 r--p 000db000 08:05 914342                         /lib64/libkrb5.so.3.3
34fa4e5000-34fa4e7000 rw-p 000e5000 08:05 914342                         /lib64/libkrb5.so.3.3
34faa00000-34faa41000 r-xp 00000000 08:05 914351                         /lib64/libgssapi_krb5.so.2.2
34faa41000-34fac41000 ---p 00041000 08:05 914351                         /lib64/libgssapi_krb5.so.2.2
34fac41000-34fac42000 r--p 00041000 08:05 914351                         /lib64/libgssapi_krb5.so.2.2
34fac42000-34fac44000 rw-p 00042000 08:05 914351                         /lib64/libgssapi_krb5.so.2.2
34fae00000-34fae62000 r-xp 00000000 08:05 1050139                        /usr/lib64/libssl.so.1.0.1e
34fae62000-34fb062000 ---p 00062000 08:05 1050139                        /usr/lib64/libssl.so.1.0.1e
34fb062000-34fb066000 r--p 00062000 08:05 1050139                        /usr/lib64/libssl.so.1.0.1e
34fb066000-34fb06c000 rw-p 00066000 08:05 1050139                        /usr/lib64/libssl.so.1.0.1e
3e7e000000-3e7e020000 r-xp 00000000 08:05 932601                         /lib64/ld-2.12.so
3e7e220000-3e7e221000 r--p 00020000 08:05 932601                         /lib64/ld-2.12.so
3e7e221000-3e7e222000 rw-p 00021000 08:05 932601                         /lib64/ld-2.12.so
3e7e222000-3e7e223000 rw-p 00000000 00:00 0 
3e7e400000-3e7e58b000 r-xp 00000000 08:05 932602                         /lib64/libc-2.12.so
3e7e58b000-3e7e78a000 ---p 0018b000 08:05 932602                         /lib64/libc-2.12.so
3e7e78a000-3e7e78e000 r--p 0018a000 08:05 932602                         /lib64/libc-2.12.so
3e7e78e000-3e7e790000 rw-p 0018e000 08:05 932602                         /lib64/libc-2.12.so
3e7e790000-3e7e794000 rw-p 00000000 00:00 0 
3e7e800000-3e7e883000 r-xp 00000000 08:05 932603                         /lib64/libm-2.12.so
3e7e883000-3e7ea82000 ---p 00083000 08:05 932603                         /lib64/libm-2.12.so
3e7ea82000-3e7ea83000 r--p 00082000 08:05 932603                         /lib64/libm-2.12.so
3e7ea83000-3e7ea84000 rw-p 00083000 08:05 932603                         /lib64/libm-2.12.so
3e7ec00000-3e7ec17000 r-xp 00000000 08:05 932607                         /lib64/libpthread-2.12.so
3e7ec17000-3e7ee17000 ---p 00017000 08:05 932607                         /lib64/libpthread-2.12.so
3e7ee17000-3e7ee18000 r--p 00017000 08:05 932607                         /lib64/libpthread-2.12.so
3e7ee18000-3e7ee19000 rw-p 00018000 08:05 932607                         /lib64/libpthread-2.12.so
3e7ee19000-3e7ee1d000 rw-p 00000000 00:00 0 
3e7f000000-3e7f002000 r-xp 00000000 08:05 914390                         /lib64/libdl-2.12.so
3e7f002000-3e7f202000 ---p 00002000 08:05 914390                         /lib64/libdl-2.12.so
3e7f202000-3e7f203000 r--p 00002000 08:05 914390                         /lib64/libdl-2.12.so
3e7f203000-3e7f204000 rw-p 00003000 08:05 914390                         /lib64/libdl-2.12.so
3e7f400000-3e7f407000 r-xp 00000000 08:05 914048                         /lib64/librt-2.12.so
3e7f407000-3e7f606000 ---p 00007000 08:05 914048                         /lib64/librt-2.12.so
3e7f606000-3e7f607000 r--p 00006000 08:05 914048                         /lib64/librt-2.12.so
3e7f607000-3e7f608000 rw-p 00007000 08:05 914048                         /lib64/librt-2.12.so
zsh: abort      ../RunTTbarPrediction -c ttbar_met.cfg -o ../moep

ROOT Version: /cvmfs/sft.cern.ch/lcg/releases/LCG_94/ROOT/6.14.04/x86_64-slc6-gcc62-opt/bin/root
Platform: Linux 2.6.32-754.6.3.el6.x86_64
Compiler: gcc 6.2


It sounds like that file is ‘corrupted’. It might still work if you skip entry 761 from that file.
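A minimal sketch of the skipping idea follows, using hypothetical names (EntryRange, shouldSkip) that are not part of ROOT or of the poster's code. It uses ranges rather than single entries because the error is reported for a whole basket (basketnumber=1), so neighbouring entries stored in the same basket may be unreadable as well. The range 761 to 762 is only the entry named in the error message, not a verified list.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical type: a half-open range [first, last) of entries to skip.
struct EntryRange {
   std::int64_t first;
   std::int64_t last;
};

// Returns true if `entry` falls inside any of the known-bad ranges.
bool shouldSkip(const std::vector<EntryRange> &bad, std::int64_t entry)
{
   return std::any_of(bad.begin(), bad.end(), [entry](const EntryRange &r) {
      return entry >= r.first && entry < r.last;
   });
}
```

In a TSelector-derived class this check would go at the very top of Process(Long64_t entry), before any branch is read, returning early when shouldSkip is true. With a single input file the entry number is unambiguous; with several files in the chain it would need checking whether the number passed to Process is local to the current file.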
