Segfault when loading the library with dev3

Dear Experts,
I am (slowly) migrating my (C++ & PyROOT) project to ROOT 6.23/01 from the dev3 slot:

   source /cvmfs/sft.cern.ch/lcg/views/dev3/latest/x86_64-centos7-gcc9-opt/setup.sh

The project includes a dictionary library. Unfortunately, starting from Thursday the “latest” slot causes a segfault when loading the library (while the “Wed” slot and earlier ones are perfectly OK). Before starting to locate and isolate the problem in my code, I’d like to know whether there are any known issues that could have caused problems in the dev3 slot between Wed and Thu.



Hi @ibelyaev,
What is the stack trace at the time of the segfault? Could you provide a simple self-contained reproducer for the issue you are encountering? Nightly builds are not guaranteed to always work, of course, but I am not aware of any known issues introduced between Wednesday and Thursday.

Cheers,
Enrico

Hi Enrico,
Preparing a simple test is not easy: my project is rather large. I’ve put a lot of effort into updating to ROOT 6.23/01 and got about 95% of the functionality working (using last week’s dev3-Wed slot). I’ve tried to prepare a simple reproducer, but all the “simple” cases seem to work. Today I lost the last working configuration: last week’s dev3-Wed is no longer accessible, and I have no idea how to proceed. The segfaults appear when accessing C++ code, but in a context-dependent manner…
E.g. I have a Python module that internally accesses C++ classes, and “import module” segfaults, while “python module.py” is OK… For other modules the situation can be reversed…
I suspect some problem with the generation of dictionaries.
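The context dependence described above (“import module” crashing while “python module.py” works, or the reverse) can be mapped out by running each case in a fresh interpreter and recording how it ends. The following is only a sketch of that idea: the helper names `describe`, `probe` and `probe_both_ways` are made up for illustration, it assumes a POSIX system and Python 3 for the harness itself, and it counts any nonzero exit status as a failure because a signal handler installed by a library may turn a crash into an ordinary exit code.

```python
import signal
import subprocess
import sys


def describe(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # POSIX: -N means the child was terminated by signal N (11 = SIGSEGV).
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    # A library's own signal handler may convert a crash into a plain
    # nonzero exit code, so this is reported as a failure as well.
    return "exit code %d" % returncode


def probe(args, python=sys.executable, timeout=600):
    """Run `python args...` in a fresh interpreter and describe the outcome."""
    proc = subprocess.run([python] + list(args),
                          stdout=subprocess.DEVNULL,
                          stderr=subprocess.DEVNULL,
                          timeout=timeout)
    return describe(proc.returncode)


def probe_both_ways(module, path):
    """Compare `import module` against `python module.py` for one module."""
    return {
        "import": probe(["-c", "import " + module]),
        "script": probe([path]),
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: <dotted module name> <path to the same module file>
    print(probe_both_ways(sys.argv[1], sys.argv[2]))
```

Looping this over all modules of a package gives a table of which modules fail in which mode, which may help to pick the smallest candidate for a reproducer.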
I am using the following in CMake:

REFLEX_GENERATE_DICTIONARY( ostap     ${CMAKE_CURRENT_SOURCE_DIR}/src/dict/Ostap.hh 
                            SELECTION ${CMAKE_CURRENT_SOURCE_DIR}/src/dict/Ostap.xml )

add_library           ( ostapDict MODULE ostap.cxx)
add_dependencies      ( ostapDict ostap-dictgen ostap ROOT::MathMore ROOT::GenVector root_pyroot )
target_link_libraries ( ostapDict               ostap ROOT::MathMore ROOT::GenVector root_pyroot )

I see that in your examples you rely on ROOT_GENERATE_DICTIONARY, but in my case it fails since I’ve found no way to instruct it about the location of include files.

[ btw, unrelated to this particular segfault: which method would you recommend, ROOT_GENERATE_DICTIONARY or REFLEX_GENERATE_DICTIONARY? I have always used the REFLEX one with an xml file… Should I start migrating to ROOT_GENERATE_DICTIONARY? ]

[ one more question:
I use

source /cvmfs/sft.cern.ch/lcg/views/dev3/latest/$CMTCONFIG/setup.sh

and then in cmake

find_package(ROOT 6 CONFIG REQUIRED )
message ( "----> ROOT   version   : " ${ROOT_VERSION} )
find_program(ROOT_CONFIG_EXECUTABLE NAMES root-config)
execute_process( COMMAND "${ROOT_CONFIG_EXECUTABLE}" --python2-version
                 OUTPUT_VARIABLE PY2VERSION_ROOT
                 OUTPUT_STRIP_TRAILING_WHITESPACE )
if     (PY2VERSION_ROOT)
  find_package(Python2 ${PY2VERSION_ROOT} COMPONENTS Interpreter Development NumPy)
endif() 

And I get a warning from CMake:

-- Could NOT find Python2: Found unsuitable version "2.7.13", but required is at least "2.7.16" (found /cvmfs/lhcb.cern.ch/lib/var/lib/LbEnv/941/stable/x86_64-centos7/bin/python2.7)

This is probably not the cause of the problem, since the message was also present in the “good” case of last week’s dev3-Wed… But in any case, I am probably doing something wrong. Is my treatment of Python/ROOT in CMake correct? ]
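The warning says that the interpreter CMake picked up (2.7.13, under the LbEnv path) is older than the version requested in find_package (2.7.16). A minimal sketch for listing the interpreters visible on PATH, in search order, together with their versions is given below. The function names are illustrative only, and it is an assumption, not something established in this thread, that PATH order is what makes CMake select the older interpreter.

```python
import os
import subprocess
import sys


def version_tuple(text):
    """'2.7.16' -> (2, 7, 16), so that versions compare numerically."""
    return tuple(int(part) for part in text.strip().split("."))


def candidates(name, path=None):
    """Yield every executable called `name` on PATH, in search order."""
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        full = os.path.join(directory, name)
        if directory and os.path.isfile(full) and os.access(full, os.X_OK):
            yield full


def interpreter_version(executable):
    """Ask an interpreter for its own 'major.minor.micro' version string."""
    code = "import sys; sys.stdout.write('%d.%d.%d' % sys.version_info[:3])"
    return subprocess.check_output([executable, "-c", code]).decode()


def report(name, required):
    """Print each candidate interpreter and whether it meets `required`."""
    for exe in candidates(name):
        found = interpreter_version(exe)
        ok = version_tuple(found) >= version_tuple(required)
        print("%-8s %-8s %s" % ("ok" if ok else "too old", found, exe))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python2.7 2.7.16
    report(sys.argv[1], sys.argv[2])
```

If an older interpreter shows up before the one from the LCG view, that would be consistent with the message above; whether it matters for the build is a separate question.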

The segfault itself is not very informative:

 python ../ostap/core/pyrouts.py 
# ostap.core.pyrouts             INFO    Zillions of decorations for ROOT/RooFit objects
 *** Break *** segmentation violation



===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================

Thread 2 (Thread 0x7fc989c38700 (LWP 6882)):
#0  0x00007fc9a5bd4b3b in do_futex_wait.constprop () from /lib64/libpthread.so.0
#1  0x00007fc9a5bd4bcf in __new_sem_wait_slow.constprop.0 () from /lib64/libpthread.so.0
#2  0x00007fc9a5bd4c6b in sem_wait@@GLIBC_2.2.5 () from /lib64/libpthread.so.0
#3  0x00007fc9a5f1b858 in PyThread_acquire_lock (lock=lock@entry=0x5f3d780, waitflag=waitflag@entry=1) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/thread_pthread.h:356
#4  0x00007fc9a5ed9aa6 in PyEval_RestoreThread (tstate=tstate@entry=0x113b23f0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:359
#5  0x00007fc98f0554fe in floatsleep (secs=<optimized out>) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Modules/timemodule.c:1057
#6  time_sleep (self=<optimized out>, args=<optimized out>) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Modules/timemodule.c:206
#7  0x00007fc9a5ee2777 in call_function (oparg=<optimized out>, pp_stack=0x7fc989c37598) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:4376
#8  PyEval_EvalFrameEx (f=f@entry=0x7fc98ec17e10, throwflag=throwflag@entry=0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:3013
#9  0x00007fc9a5ee4638 in PyEval_EvalCodeEx (co=<optimized out>, globals=<optimized out>, locals=locals@entry=0x0, args=args@entry=0x7fc98ec1bae8, argcount=<optimized out>, kws=kws@entry=0x7fc9a639f068, kwcount=0, defs=0x0, defcount=0, closure=0x0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:3608
#10 0x00007fc9a5e62a0f in function_call (func=0x7fc98ec22578, arg=0x7fc98ec1bad0, kw=0x7fc99000f6e0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Objects/funcobject.c:523
#11 0x00007fc9a5e36f93 in PyObject_Call (func=func@entry=0x7fc98ec22578, arg=arg
...

Another part is:

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#7  0x00007fc99fc8256d in TClingMemberIter::Advance() () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libCling.so
#8  0x00007fc99fc865da in TClingMethodInfo::TClingMethodInfo(cling::Interpreter*, TClingClassInfo*) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libCling.so
#9  0x00007fc99fbec29d in TCling::MethodInfo_Factory(ClassInfo_t*) const () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libCling.so
#10 0x00007fc9a452af95 in TListOfFunctions::Load() () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libCore.so
#11 0x00007fc9a4500774 in TClass::GetListOfMethods(bool) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libCore.so
#12 0x00007fc9a49d9412 in Cppyy::GetNumMethods(unsigned long) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libcppyy_backend2_7.so
#13 0x00007fc9a4c66df0 in CPyCppyy::BuildScopeProxyDict(unsigned long, _object*) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libcppyy2_7.so
#14 0x00007fc9a4c69c73 in CPyCppyy::CreateScopeProxy(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, _object*) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libcppyy2_7.so
#15 0x00007fc9a4c4e1a5 in CPyCppyy::meta_getattro(_object*, _object*) () from /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Wed/x86_64-centos7-gcc9-opt/lib/libcppyy2_7.so
#16 0x00007fc9a5edc89a in PyEval_EvalFrameEx (f=f@entry=0x1664dc20, throwflag=throwflag@entry=0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:2559
#17 0x00007fc9a5ee4638 in PyEval_EvalCodeEx (co=co@entry=0x7fc96751cf30, globals=globals@entry=0x7fc96742c5c8, locals=locals@entry=0x7fc96742c5c8, args=args@entry=0x0, argcount=argcount@entry=0, kws=kws@entry=0x0, kwcount=0, defs=0x0, defcount=0, closure=0x0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:3608
#18 0x00007fc9a5ee47e9 in PyEval_EvalCode (co=co@entry=0x7fc96751cf30, globals=globals@entry=0x7fc96742c5c8, locals=locals@entry=0x7fc96742c5c8) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/ceval.c:669
#19 0x00007fc9a5ef8f24 in PyImport_ExecCodeModuleEx (name=name@entry=0x165fdaa0 "ostap.math.models", co=co@entry=0x7fc96751cf30, pathname=pathname@entry=0x165feab0 "/afs/cern.ch/user/i/ibelyaev/cmtuser/RELEASE/ostap_23/build/INSTALL/LCGdev3/x86_64-centos7-gcc9-opt/ostap/math/models.py") at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:753
#20 0x00007fc9a5ef926e in load_source_module (name=name@entry=0x165fdaa0 "ostap.math.models", pathname=0x165feab0 "/afs/cern.ch/user/i/ibelyaev/cmtuser/RELEASE/ostap_23/build/INSTALL/LCGdev3/x86_64-centos7-gcc9-opt/ostap/math/models.py", fp=<optimized out>) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:1143
#21 0x00007fc9a5ef9f12 in load_module (name=name@entry=0x165fdaa0 "ostap.math.models", fp=<optimized out>, pathname=pathname@entry=0x165feab0 "/afs/cern.ch/user/i/ibelyaev/cmtuser/RELEASE/ostap_23/build/INSTALL/LCGdev3/x86_64-centos7-gcc9-opt/ostap/math/models.py", type=<optimized out>, loader=<optimized out>) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:1950
#22 0x00007fc9a5efa1a1 in import_submodule (mod=mod@entry=0x7fc98ec2a7c0, subname=0x165fdaab "models", fullname=fullname@entry=0x165fdaa0 "ostap.math.models") at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:2747
#23 0x00007fc9a5efac94 in load_next (p_buflen=<synthetic pointer>, buf=0x165fdaa0 "ostap.math.models", p_name=<synthetic pointer>, altmod=0x7fc98ec2a7c0, mod=0x7fc98ec2a7c0) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:2561
#24 import_module_level (locals=<optimized out>, level=<optimized out>, fromlist=0x7fc9a61a8e00 <_Py_NoneStruct>, globals=<optimized out>, name=<optimized out>) at /workspace/build/externals/Python-2.7.16/src/Python/2.7.16/Python/import.c:2278

It looks like your script uses multiple threads and that the crash is in an internal ROOT routine that is usually quite safe. Did you enable support for thread safety in ROOT:
ROOT::EnableThreadSafety(); ?

No (explicit) multithreading is used,
but I’ve added a call to

ROOT.ROOT.EnableThreadSafety()

to all my 117 failing tests
and it does not affect the failure…

Fair enough. Can you tell me which commit you used to build ROOT? (I just noticed that you use 6.23/01, and there was indeed a recent change in those failing internal routines.)

Can you provide a reproducer so that we can investigate?

I am using ROOT from lcg-nightlies, dev3/Mon slot,

/cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Mon/ROOT/HEAD/x86_64-centos7-gcc9-opt

Just to repeat the start of the thread: last week’s dev3/Wed slot and earlier ones were OK, but starting from dev3/Thu the scripts fail with a segfault.

Sorry, missed that part. So indeed this is related to the new change and we need a reproducer to help remove the regression.

Thanks,
Philippe.

Hi Philippe,
It is not so easy to make a reproducer. The project is large
and I do not see a good candidate for a reproducer.
I’ll try to prepare something tomorrow, but it is not easy.
cheers, Vanya

It’s likely https://github.com/root-project/root/issues/6359 - so Vanya, you can just wait until I’ve fixed that, and then confirm the issue you see is fixed, too, or provide a reproducer then!

Hi Axel,
Thank you very much! I’ll wait.
cheers, Vanya

Dear Axel,
I confirm that there are no more segfaults now.
Thank you very much
