Segfault when reading nonexistent histogram from TFile

EDIT: See my comment below. This has nothing to do with RDataFrame (RDF) or XRootD. The stack trace is unchanged, though, so it may still be useful.

When I try to read from the grid with RDataFrame using XRootD, I get a segfault:

Thread 6 (Thread 0x7f04521dc700 (LWP 3149194)):
#0  0x0000003e9b0e5499 in syscall () from /lib64/libc.so.6
#1  0x00007f0454cb3f6d in XrdSys::LinuxSemaphore::Wait (this=0x694ed90) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdSys/XrdSysLinuxSemaphore.hh:161
#2  XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x69304d8) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdCl/XrdClSyncQueue.hh:67
#3  XrdCl::JobManager::RunJobs() () at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:146
#4  0x00007f0454cb4299 in RunRunnerThread (arg=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:33
#5  0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

Thread 5 (Thread 0x7f0452bdd700 (LWP 3149193)):
#0  0x0000003e9b0e5499 in syscall () from /lib64/libc.so.6
#1  0x00007f0454cb3f6d in XrdSys::LinuxSemaphore::Wait (this=0x694ed90) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdSys/XrdSysLinuxSemaphore.hh:161
#2  XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x69304d8) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdCl/XrdClSyncQueue.hh:67
#3  XrdCl::JobManager::RunJobs() () at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:146
#4  0x00007f0454cb4299 in RunRunnerThread (arg=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:33
#5  0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

Thread 4 (Thread 0x7f04535de700 (LWP 3149192)):
#0  0x0000003e9b0e5499 in syscall () from /lib64/libc.so.6
#1  0x00007f0454cb3f6d in XrdSys::LinuxSemaphore::Wait (this=0x694ed90) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdSys/XrdSysLinuxSemaphore.hh:161
#2  XrdCl::SyncQueue<XrdCl::JobManager::JobHelper>::Get (this=0x69304d8) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdCl/XrdClSyncQueue.hh:67
#3  XrdCl::JobManager::RunJobs() () at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:146
#4  0x00007f0454cb4299 in RunRunnerThread (arg=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClJobManager.cc:33
#5  0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#6  0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f0453fdf700 (LWP 3149191)):
#0  0x0000003e9bc0f00d in nanosleep () from /lib64/libpthread.so.0
#1  0x00007f04551b44dd in XrdSysTimer::Wait (mills=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdSys/XrdSysTimer.cc:239
#2  0x00007f0454c604d8 in XrdCl::TaskManager::RunTasks() () at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClTaskManager.cc:244
#3  0x00007f0454c605b9 in RunRunnerThread (arg=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdCl/XrdClTaskManager.cc:37
#4  0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f04549e0700 (LWP 3149190)):
#0  0x0000003e9b0e9243 in epoll_wait () from /lib64/libc.so.6
#1  0x00007f04551b9437 in XrdSys::IOEvents::PollE::Begin (this=0x692ca50, syncsem=<optimized out>, retcode=<optimized out>, eTxt=<optimized out>) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/./XrdSys/XrdSysIOEventsPollE.icc:213
#2  0x00007f04551b5c35 in XrdSys::IOEvents::BootStrap::Start(void*) () at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdSys/XrdSysIOEvents.cc:131
#3  0x00007f04551b3cc8 in XrdSysThread_Xeq (myargs=0x6930590) at /mnt/build/jenkins/workspace/lcg_hsf_build/BUILDTYPE/Release/COMPILER/gcc8binutils/LABEL/slc6/build/externals/xrootd-4.8.4/src/xrootd/4.8.4/src/XrdSys/XrdSysPthread.cc:86
#4  0x0000003e9bc07aa1 in start_thread () from /lib64/libpthread.so.0
#5  0x0000003e9b0e8c4d in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f046057ebc0 (LWP 3149162)):
#0  0x0000003e9b0ac8dd in waitpid () from /lib64/libc.so.6
#1  0x0000003e9b03e4e9 in do_system () from /lib64/libc.so.6
#2  0x0000003e9b03e820 in system () from /lib64/libc.so.6
#3  0x00007f046553479a in TUnixSystem::StackTrace() () from /cvmfs/sft.cern.ch/lcg/views/LCG_94python3/x86_64-slc6-gcc8-opt/lib/libCore.so
#4  0x00007f0465537074 in TUnixSystem::DispatchSignals(ESignals) () from /cvmfs/sft.cern.ch/lcg/views/LCG_94python3/x86_64-slc6-gcc8-opt/lib/libCore.so
#5  <signal handler called>
#6  0x0000000000412c90 in main ()
===========================================================

EDIT: Same result in LCG 95 (ROOT 6.16, Python 2), and with or without EnableImplicitMT.


ROOT Version: 6.14
Platform: LCG 94 (Python 3)
Compiler: GCC 8.2.0


Hi,

how can we reproduce the faulty behaviour?

Cheers,
D

Sorry, I diagnosed the problem incorrectly. It’s not connected to RDataFrame or XRootD. Rather, I call TFile::Get to retrieve a histogram that the file doesn’t contain, and this seems to cause a segfault in the file access code, which happens to be XRootD because that is how the file is being accessed.

Is there a limit to the number of objects a file can contain? The files missing the histogram in question contain a large number of trees (around 60) for systematic variations.
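
For anyone hitting the same symptom: TFile::Get returns a null pointer when the file has no object with the requested name, so using the result without checking it can crash in the calling code. Whether that is what happened here is an assumption, since the thread does not show the code, but it would fit the crashing frame being main(). Below is a minimal sketch of the check. It deliberately does not use ROOT: FakeFile, FakeHist and SafeEntries are hypothetical stand-ins invented for illustration, and only the "lookup returns null when the key is missing" behaviour mirrors TFile::Get.

```cpp
#include <iostream>
#include <map>
#include <memory>
#include <string>

// Hypothetical stand-in for a histogram; not a ROOT class.
struct FakeHist {
    int entries;
};

// Hypothetical stand-in for a file. Like TFile::Get, Get() returns a
// null pointer when no object with the requested name is stored.
class FakeFile {
public:
    void Add(const std::string &name, int entries) {
        objects_[name] = std::unique_ptr<FakeHist>(new FakeHist{entries});
    }

    FakeHist *Get(const std::string &name) const {
        auto it = objects_.find(name);
        return it == objects_.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::string, std::unique_ptr<FakeHist>> objects_;
};

// Returns the entry count of the named object, or -1 if it is missing.
// The null check is the point: the pointer is never dereferenced blindly.
int SafeEntries(const FakeFile &file, const std::string &name) {
    FakeHist *hist = file.Get(name);
    if (hist == nullptr) {
        std::cerr << "object '" << name << "' not found in file\n";
        return -1;
    }
    return hist->entries;
}
```

With a FakeFile holding only an object named "cutflow", SafeEntries(file, "cutflow") returns its entry count, while SafeEntries(file, "missing") prints a message and returns -1 instead of crashing. In real ROOT code the same test would be applied to the pointer returned by TFile::Get before using it.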

Hi,

thanks for digging deeper. This behaviour is not expected. Can you share the URI of the file?

Cheers,
D

It’s an ATLAS file:

root://t2se01.physics.ox.ac.uk:1094//dpm/physics.ox.ac.uk/home/atlas/atlaslocalgroupdisk/rucio/user/saparede/83/0e/user.saparede.17153042._000001.MiniNTuple.root

Another strange thing is that a tree I write out with Snapshot after processing these files has about 20% more entries than one I write out manually.

Never mind, I had an extra filter.
