Corrupted root file check

Dear ROOTers,
I have a lot of files in a directory, and some of them are corrupted. I’m trying to detect the bad ones with checks such as IsZombie, and I also check whether each file contains a tree.
However, my macro crashes. I also cannot open such a file directly with ROOT (root filename), because “pure ROOT” crashes as well. Is there another way to skip such bad files?



ROOT Version: Not Provided
Platform: Not Provided
Compiler: Not Provided


Can you give an example of the stack trace where just opening the file crashes?

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f0a7aecc41c in waitpid () from /usr/lib64/libc.so.6
#1  0x00007f0a7ae49f12 in do_system () from /usr/lib64/libc.so.6
#2  0x00007f0a7bf57974 in TUnixSystem::StackTrace() () from /opt/fairsoft/bmn/jun19p1/lib/root/libCore.so.6.16
#3  0x00007f0a7bf5a0ac in TUnixSystem::DispatchSignals(ESignals) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCore.so.6.16
#4  <signal handler called>
#5  0x00007f0a7be72efa in TString::ReadBuffer(char*&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCore.so.6.16
#6  0x00007f0a79cc1047 in TDirectoryFile::ReadKeys(bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#7  0x00007f0a79cdb69c in TFile::Init(bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#8  0x00007f0a79cdbf61 in TFile::TFile(char const*, char const*, char const*, int) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#9  0x00007f0a79cdd635 in TFile::Open(char const*, char const*, char const*, int, int) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#10 0x00007f0a7c5e1048 in ?? ()
#11 0x00007f0a79cdc2b0 in ?? () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#12 0x0000006576f91ca7 in ?? ()
#13 0x00007ffec023afc0 in ?? ()
#14 0x00007f0a7c5e1070 in ?? ()
#15 0x00000000011706b0 in ?? ()
#16 0x00007f0a76fe8bd2 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction const&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#17 0x00007f0a76f956e3 in cling::Interpreter::executeTransaction(cling::Transaction&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#18 0x00007f0a76ff483a in cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#19 0x00007f0a76ff7475 in cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#20 0x00007f0a76f93950 in cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#21 0x00007f0a76f93bfe in cling::Interpreter::process(std::string const&, cling::Value*, cling::Transaction**, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#22 0x00007f0a77030598 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#23 0x00007f0a76f095ba in HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#24 0x00007f0a76f1fdb7 in TCling::ProcessLine(char const*, TInterpreter::EErrorCode*) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#25 0x00007f0a7be24aef in TApplication::ProcessLine(char const*, bool, int*) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCore.so.6.16
#26 0x00007f0a7c296e2f in TRint::ProcessLineNr(char const*, char const*, int*) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRint.so.6.16
#27 0x00007f0a7c29866f in TRint::Run(bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRint.so.6.16
#28 0x00000000004008da in main ()
===========================================================


The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#5  0x00007f0a7be72efa in TString::ReadBuffer(char*&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCore.so.6.16
#6  0x00007f0a79cc1047 in TDirectoryFile::ReadKeys(bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#7  0x00007f0a79cdb69c in TFile::Init(bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#8  0x00007f0a79cdbf61 in TFile::TFile(char const*, char const*, char const*, int) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#9  0x00007f0a79cdd635 in TFile::Open(char const*, char const*, char const*, int, int) () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#10 0x00007f0a7c5e1048 in ?? ()
#11 0x00007f0a79cdc2b0 in ?? () from /opt/fairsoft/bmn/jun19p1/lib/root/libRIO.so
#12 0x0000006576f91ca7 in ?? ()
#13 0x00007ffec023afc0 in ?? ()
#14 0x00007f0a7c5e1070 in ?? ()
#15 0x00000000011706b0 in ?? ()
#16 0x00007f0a76fe8bd2 in cling::IncrementalExecutor::runStaticInitializersOnce(cling::Transaction const&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#17 0x00007f0a76f956e3 in cling::Interpreter::executeTransaction(cling::Transaction&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#18 0x00007f0a76ff483a in cling::IncrementalParser::commitTransaction(llvm::PointerIntPair<cling::Transaction*, 2u, cling::IncrementalParser::EParseResult, llvm::PointerLikeTypeTraits<cling::Transaction*>, llvm::PointerIntPairInfo<cling::Transaction*, 2u, llvm::PointerLikeTypeTraits<cling::Transaction*> > >&, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#19 0x00007f0a76ff7475 in cling::IncrementalParser::Compile(llvm::StringRef, cling::CompilationOptions const&) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#20 0x00007f0a76f93950 in cling::Interpreter::EvaluateInternal(std::string const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#21 0x00007f0a76f93bfe in cling::Interpreter::process(std::string const&, cling::Value*, cling::Transaction**, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#22 0x00007f0a77030598 in cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
#23 0x00007f0a76f095ba in HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) () from /opt/fairsoft/bmn/jun19p1/lib/root/libCling.so
===========================================================

Can you run the failing file open under valgrind? For example:

valgrind --suppressions=$ROOTSYS/etc/valgrind-root.supp root.exe -b -l -q -e 'TFile::Open("badfilename.root")'

I’ve got this:

==23667== Memcheck, a memory error detector
==23667== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==23667== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==23667== Command: root.exe -b -l -q -e TFile::Open("AuAu_ecm11.5GeV_hydroON_EoSXPT_0-16fm_1000ev_13581-0.reco.MiniDst.root")
==23667== 

==23667== Warning: set address range perms: large range [0x59e43040, 0xcb864600) (undefined)
==23667== Invalid read of size 1
==23667==    at 0x5406B2B: frombuf(char*&, unsigned char*) (Bytes.h:289)
==23667==    by 0x5406B7E: frombuf(char*&, char*) (Bytes.h:442)
==23667==    by 0x5402A4A: TString::ReadBuffer(char*&) (TString.cxx:1260)
==23667==    by 0x7F66880: TKey::ReadKeyBuffer(char*&) (TKey.cxx:1247)
==23667==    by 0x7F11EC1: TDirectoryFile::ReadKeys(bool) (TDirectoryFile.cxx:1380)
==23667==    by 0x7F26F2B: TFile::Init(bool) (TFile.cxx:808)
==23667==    by 0x7F254B6: TFile::TFile(char const*, char const*, char const*, int) (TFile.cxx:519)
==23667==    by 0x7F367E5: TFile::Open(char const*, char const*, char const*, int, int) (TFile.cxx:4132)
==23667==    by 0x402C064: ???
==23667==    by 0x8E343F2: executeWrapper (IncrementalExecutor.h:196)
==23667==    by 0x8E343F2: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) [clone .part.331] (Interpreter.cpp:1019)
==23667==    by 0x8E34922: cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (Interpreter.cpp:1274)
==23667==    by 0x8E35126: cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) (Interpreter.cpp:723)
==23667==  Address 0x73aa83f is 0 bytes after a block of size 159 alloc'd
==23667==    at 0x4C3089F: operator new[](unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==23667==    by 0x7F626A4: TKey::TKey(long long, int, TDirectory*) (TKey.cxx:186)
==23667==    by 0x7F11E74: TDirectoryFile::ReadKeys(bool) (TDirectoryFile.cxx:1377)
==23667==    by 0x7F26F2B: TFile::Init(bool) (TFile.cxx:808)
==23667==    by 0x7F254B6: TFile::TFile(char const*, char const*, char const*, int) (TFile.cxx:519)
==23667==    by 0x7F367E5: TFile::Open(char const*, char const*, char const*, int, int) (TFile.cxx:4132)
==23667==    by 0x402C064: ???
==23667==    by 0x8E343F2: executeWrapper (IncrementalExecutor.h:196)
==23667==    by 0x8E343F2: cling::Interpreter::RunFunction(clang::FunctionDecl const*, cling::Value*) [clone .part.331] (Interpreter.cpp:1019)
==23667==    by 0x8E34922: cling::Interpreter::EvaluateInternal(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::CompilationOptions, cling::Value*, cling::Transaction**, unsigned long) (Interpreter.cpp:1274)
==23667==    by 0x8E35126: cling::Interpreter::process(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, cling::Value*, cling::Transaction**, bool) (Interpreter.cpp:723)
==23667==    by 0x8EF411C: cling::MetaProcessor::process(llvm::StringRef, cling::Interpreter::CompilationResult&, cling::Value*, bool) (MetaProcessor.cpp:341)
==23667==    by 0x8CE82E7: HandleInterpreterException(cling::MetaProcessor*, char const*, cling::Interpreter::CompilationResult&, cling::Value*) (TCling.cxx:2163)
==23667== 

 *** Break *** segmentation violation
#0  0x000000005810a9cb in ?? ()
#1  0x00000000580a0c8e in ?? ()
#2  0x000000005809d40b in ?? ()
#3  0x000000005809eb37 in ?? ()
#4  0x00000000580aecd1 in ?? ()
#5  0x0000000000000000 in ?? ()
Root > ==23667== 
==23667== HEAP SUMMARY:
==23667==     in use at exit: 1,936,589,640 bytes in 46,403 blocks
==23667==   total heap usage: 155,805 allocs, 109,402 frees, 1,984,118,472 bytes allocated
==23667== 
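
The Memcheck report above shows a 1-byte read exactly at the end of a 159-byte block allocated in TKey::TKey, reached through TString::ReadBuffer while TDirectoryFile::ReadKeys was decoding a key. In other words, a string length stored in the damaged file points past the end of the key buffer. As a minimal sketch of what a bounds-checked read could look like (this is not ROOT code; `read_string_checked` is a hypothetical helper, and the layout of one length byte, with 255 meaning that a 4-byte big-endian length follows, is an assumption about the on-disk format):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of ROOT: decode one length-prefixed string
// from the range [cur, end) without ever reading past `end`.
// Assumed layout: one length byte; the value 255 means that a 4-byte
// big-endian length follows instead.
// On success, stores the string in `out`, advances `cur`, and returns true.
// On failure, returns false and leaves `cur` and `out` untouched.
bool read_string_checked(const unsigned char*& cur,
                         const unsigned char* end,
                         std::string& out) {
    const unsigned char* p = cur;
    if (p >= end) return false;                       // no room for the length byte
    std::size_t n = *p++;
    if (n == 255) {
        if (end - p < 4) return false;                // truncated 4-byte length
        n = (static_cast<std::size_t>(p[0]) << 24) |
            (static_cast<std::size_t>(p[1]) << 16) |
            (static_cast<std::size_t>(p[2]) << 8)  |
             static_cast<std::size_t>(p[3]);
        p += 4;
    }
    if (static_cast<std::size_t>(end - p) < n) return false;  // payload would overrun
    out.assign(reinterpret_cast<const char*>(p), n);
    cur = p + n;                                      // commit only after all checks pass
    return true;
}
```

The point of the sketch is only that the reader needs to know where the buffer ends; a caller that gets `false` back can report the file as corrupted instead of crashing.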

Unfortunately, at the moment there is no easy way to detect or prevent this exact kind of error. The information needed to prevent the crash is available, but the TString I/O routine does not use it yet, so fixing this will require an update to the ROOT code.
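
One possible workaround in the meantime, which was not suggested in this thread and is untested against ROOT, is to open each file in a separate process so that a crash only kills that process and the main loop can skip the file. A minimal POSIX sketch of the idea follows; `probe_survives` is a hypothetical helper name, and the probe is any callable that returns 0 on success:

```cpp
#include <cerrno>
#include <functional>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper, not part of ROOT: run `probe` in a forked child so
// that a crash inside it cannot take down the calling process (POSIX only).
// Returns true only if the child exited normally with status 0.
bool probe_survives(const std::function<int()>& probe) {
    const pid_t pid = fork();
    if (pid < 0) return false;            // fork failed: treat as "not verified"
    if (pid == 0) {
        // Child: _exit avoids running the parent's atexit handlers and
        // flushing stdio buffers inherited from the parent.
        _exit(probe() == 0 ? 0 : 1);
    }
    int status = 0;
    while (waitpid(pid, &status, 0) < 0) {
        if (errno != EINTR) return false; // unexpected wait failure
    }
    // A child killed by a signal (e.g. SIGSEGV) fails WIFEXITED.
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

In a macro, the probe would be something like a lambda that calls TFile::Open on one file name and returns nonzero if the pointer is null or IsZombie() is true. Note that ROOT installs its own signal handlers, so whether the child really ends with a nonzero status after a segmentation violation is an assumption that has to be checked with your ROOT version. Launching a separate root.exe -b -l -q process per file, as in the valgrind command above, and inspecting its exit status is a variant of the same idea.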
