Bus error in code that previously worked

I have some code that worked fine yesterday, but today, on the same files, it produces a Bus error. The relevant portion of the stack trace is:

#6  0x000000000048c27c in ROOT::Detail::RDF::RCustomColumn<reconstructed_event (*)(std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::TCCHelperTypes::TNothing>::Update(unsigned int, long long) ()
#7  0x000000000048bd4e in reconstructed_event& ROOT::Internal::RDF::TColumnValue<reconstructed_event, false>::Get<reconstructed_event, 0>(long long) ()
#8  0x000000000044795b in ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > >::CheckFilters(unsigned int, long long) ()
#9  0x00000000004479ea in ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > > >::CheckFilters(unsigned int, long long) ()
#10 0x0000000000448d8c in ROOT::Detail::RDF::RFilter<TriggerTest, ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&, std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > > > > >::CheckFilters(unsigned int, long long) ()
#11 0x000000000044a21c in ROOT::Internal::RDF::RAction<ROOT::Internal::RDF::ForeachSlotHelper<void write_tree<ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<TriggerTest, ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&, std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > > > > > > >(ROOT::RDF::RInterface<ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<TriggerTest, ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&, std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > > > > > >, void>&, char const*, TFile&)::{lambda(unsigned int, reconstructed_event const&, int, long long, double)#1}>, ROOT::Detail::RDF::RFilter<TriggerTest, ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&, std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), ROOT::Detail::RDF::RFilter<bool (*)(reconstructed_event const&), 
ROOT::Detail::RDF::RFilter<bool (*)(std::vector<int, std::allocator<int> > const&, std::vector<float, std::allocator<float> > const&, std::vector<float, std::allocator<float> > const&), ROOT::Detail::RDF::RFilter<main::{lambda(int, int)#3}, ROOT::Detail::RDF::RLoopManager> > > > > > >, ROOT::TypeTraits::TypeList<reconstructed_event, int, long long, double> >::Run(unsigned int, long long) ()
#12 0x00007fdc83e25fea in ROOT::Detail::RDF::RLoopManager::RunAndCheckFilters(unsigned int, long long) () from /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/lib/libROOTDataFrame.so
#13 0x00007fdc83e26e5a in std::_Function_handler<void (TTreeReader&), ROOT::Detail::RDF::RLoopManager::RunTreeProcessorMT()::{lambda(TTreeReader&)#1}>::_M_invoke(std::_Any_data const&, TTreeReader&) () from /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/lib/libROOTDataFrame.so
#14 0x00007fdc837f7833 in std::_Function_handler<void (unsigned int), void ROOT::TThreadExecutor::Foreach<ROOT::TTreeProcessorMT::Process(std::function<void (TTreeReader&)>)::{lambda(ROOT::Internal::EntryCluster const&)#1}, ROOT::Internal::EntryCluster>(ROOT::TTreeProcessorMT::Process(std::function<void (TTreeReader&)>)::{lambda(ROOT::Internal::EntryCluster const&)#1}, std::vector<ROOT::Internal::EntryCluster, std::allocator<std::vector> > const&)::{lambda(unsigned int)#1}>::_M_invoke(std::_Any_data const&, unsigned int&&) () from /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/lib/libTreePlayer.so
#15 0x00007fdc8578f503 in tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() () from /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/lib/libImt.so
#16 0x00007fdc815461d5 in tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all (this=0x7fdc75be3e00, parent=..., child=<optimized out>) at ../../include/tbb/machine/gcc_ia32_common.h:79
#17 0x00007fdc8153fd40 in tbb::internal::arena::process (this=0x7fdc75a5a200, s=...) at ../../src/tbb/arena.cpp:160
#18 0x00007fdc8153e8fb in tbb::internal::market::process (this=0x7fdc75c43580, j=...) at ../../src/tbb/market.cpp:693
#19 0x00007fdc8153af70 in tbb::internal::rml::private_worker::run (this=0x7fdc75a56f00) at ../../src/tbb/private_server.cpp:270
#20 0x00007fdc8153b199 in tbb::internal::rml::private_worker::thread_routine (arg=<optimized out>) at ../../src/tbb/private_server.cpp:223
#21 0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#22 0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

ROOT Version: 6.14/04
Platform: x86_64 Linux (SLC6 LCG 94)
Compiler: gcc 8.1.0


Bus errors are tricky.
One relatively common cause is a .so disappearing from under the running program: for example, the directory containing the shared library being executed suddenly becomes unreadable or unreachable, perhaps because the file has been replaced or modified in an incompatible way, or because your AFS token or CVMFS authentication token has expired.

I’ve tried re-running a few times, including rebuilding, and opening a fresh shell, and it’s consistently giving a bus error now. The build always succeeds and the program loads each time so I don’t see how it could be an expired token.

I get a couple of warnings in ROOT headers with -Wcast-align=strict. I don’t think this is the issue, since I’m on x86, but they should probably be fixed anyway:

/cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/TBranchProxy.h: In member function ‘void* ROOT::Detail::TBranchProxy::GetClaStart(UInt_t)’:
/cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/TBranchProxy.h:276:38: warning: cast from ‘char*’ to ‘void**’ increases required alignment of target type [-Wcast-align]
             return *(void**)(location);
                                      ^
/cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/TBranchProxy.h: In member function ‘void* ROOT::Detail::TBranchProxy::GetStlStart(UInt_t)’:
/cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/TBranchProxy.h:320:38: warning: cast from ‘char*’ to ‘void**’ increases required alignment of target type [-Wcast-align]
             return *(void**)(location);

Hi,

what code are you running? Can we reproduce the error?

Cheers,
D

The code is at https://gitlab.cern.ch/hh4b/hh4b-resolved-reconstruction/tree/sculpt-investigation.

On one of my gdb runs, I get:

0x0000000000489a0e in ROOT::Detail::RDF::RCustomColumn<reconstructed_event (*)(std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::TCCHelperTypes::TNothing>::Update (this=0x404099b200000000, slot=1062728499,
    entry=-4597173905284114168) at /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/ROOT/RDFNodes.hxx:515

I’m pretty sure the entry number isn’t supposed to go negative, so I think there may be a bug in how it is determined.

EDIT: Here’s the top of the GDB backtrace

#0  0x0000000000489a0e in ROOT::Detail::RDF::RCustomColumn<reconstructed_event (*)(std::vector<Jet, std::allocator<Jet> > const&), ROOT::Detail::RDF::TCCHelperTypes::TNothing>::Update (this=0x404099b200000000, slot=1062728499,
    entry=-4597173905284114168) at /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/ROOT/RDFNodes.hxx:515
#1  0x00000000004894ce in ROOT::Internal::RDF::TColumnValue<reconstructed_event, false>::Get<reconstructed_event, 0> (this=0x85559f0, entry=<optimized out>)
    at /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/ROOT/RDFNodes.hxx:852
#2  0x000000000044417b in ROOT::Detail::RDF::RFilter<bool (*)(const reconstructed_event&), ROOT::Detail::RDF::RFilter<bool (*)(const std::vector<int, std::allocator<int> >&, const std::vector<float, std::allocator<float> >&, const std::vector<float, std::allocator<float> >&), ROOT::Detail::RDF::RFilter<main(int, char**)::<lambda(Int_t, Int_t)>, ROOT::Detail::RDF::RLoopManager> > >::CheckFilterHelper<0> (entry=2281733, slot=<optimized out>, this=0xc6a4b80)
    at /cvmfs/sft.cern.ch/lcg/views/LCG_94/x86_64-slc6-gcc8-opt/include/ROOT/RDFNodes.hxx:653

Somehow, between CheckFilterHelper (frame #2, where entry=2281733) and RCustomColumn::Update (frame #0), the value of entry has been completely mangled. Judging by the size of slot, and by the this pointer, those may have been mangled too.

Hi,

thanks for the link. I get a 404 unfortunately. Can this be reproduced by a standalone program?

Cheers,
D

Solved the issue. It turns out I was copying five objects into a four-object array, which clobbered the memory that followed it.
