Seg fault with ROOT 6.28

Hello.
I’d like to follow up on the issue reported earlier here. Were you able to look at it, @eguiraud ?

Hello @oguz.guzel ,

Unfortunately I don’t really have much to add. Without a self-contained reproducer that we can run on our side, or at least a full stack trace that points to where the crash happens on the C++ side of things, there is very little to go on :confused:

Hi,

I’m not sure I can provide a reproducible example that does not require installing the library itself. But in case you want to try, the instructions are straightforward. On lxplus or any machine with cvmfs available, run the following; the last line runs a simple example which should give you the segmentation fault:

mkdir bamboodev
cd bamboodev
# make a virtualenv
source /cvmfs/sft.cern.ch/lcg/views/LCG_103/x86_64-centos7-gcc12-opt/setup.sh
python -m venv bamboovenv
source bamboovenv/bin/activate
# clone and install bamboo
git clone https://gitlab.cern.ch/cp3-cms/bamboo.git
pip install ./bamboo
# clone and install plotIt
git clone https://github.com/cp3-llbb/plotIt.git
mkdir build-plotit
cd build-plotit
cmake -DCMAKE_INSTALL_PREFIX=$VIRTUAL_ENV ../plotIt
make -j10 install
cd ../bamboo
pip install git+https://gitlab.cern.ch/cp3-cms/CMSJMECalculators.git
bambooRun --module=examples/nanozmumu.py:NanoZMuMu --distributed=worker --sample=DY_M50_test --anaConfig=examples/test1.yml tests/data/DY_M50_2016.root --output=test_worker_2.root

Thanks! Can bamboo be installed on a system without cvmfs? I guess pip should pick up all required dependencies and install the ones that are missing?

Hi again. I can reproduce the crash on LXPLUS. Attaching gdb to the process from a separate terminal, and using the -dbg flavor of the LCG view instead of the -opt one, I get some potentially interesting output before the crash:

...
INFO:bamboo.workflow:Backend graph construction done in 41.07s, max RSS: 1316.57MB
INFO:bamboo.workflow:Starting to fill plots (and skims)
RDF event loop started to process entries
WARNING:bamboo.plots:Unsupported product type for data-driven: Skim, additional products will not be stored
Error in <TFile::cd>: No such file root://eoshome-e.cern.ch//eos/user/e/eguiraud/debugging/bamboo_crash/bamboo/test_worker_2.root:/
TTree::CopyEntries:0: RuntimeWarning: The output TTree (muSkim) must be associated with a writable file (/tmp/eguiraud/tmpayefpl7a/skim_muSkim.root).
Error in <TTree::CloneTTree>: TTree has not been cloned

Note the extra :/ at the end of the path that TFile::cd tries to cd into. I don’t know where the bogus path comes from yet, so if you have any clue, please let me know. An educated guess is that a TDirectory name somewhere is empty where a non-empty one is expected.

Is it possible to dump the RDF code that the bamboo invocation produces? It would make it possible to cut out a lot of layers from the reproducer and possibly run it on my workstation as well, where I have a better debugging setup than on LXPLUS (see next message for why I can’t run this on my workstation at the moment).

Unrelated to the original problem, but the CMSJet library does not compile with gcc 13 and that makes it impossible for me to run the reproducer on my workstation. A header is missing:

      FAILED: CMakeFiles/CMSJMECalculators.dir/CMSJet/src/SimpleJetCorrectionUncertainty.cc.o
      /usr/bin/c++ -DCMSJMECalculators_EXPORTS -I/tmp/pip-req-build-_r0vbtfa/interface -I/tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/include -I/tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/src -isystem /home/blue/ROOT/relwithdebinfo-perf/cmake-build/install/include -O3 -DNDEBUG -fPIC -MD -MT CMakeFiles/CMSJMECalculators.dir/CMSJet/src/SimpleJetCorrectionUncertainty.cc.o -MF CMakeFiles/CMSJMECalculators.dir/CMSJet/src/SimpleJetCorrectionUncertainty.cc.o.d -o CMakeFiles/CMSJMECalculators.dir/CMSJet/src/SimpleJetCorrectionUncertainty.cc.o -c /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/src/SimpleJetCorrectionUncertainty.cc
      In file included from /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/include/JetCorrectorParameters.h:10,
                       from /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/src/SimpleJetCorrectionUncertainty.cc:2:
      /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/include/Utilities.h: In member function ‘std::hash_specialization<Head, ndims>::result_type std::hash_specialization<Head, ndims>::operator()(const argument_type&) const’:
      /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/include/Utilities.h:84:13: error: ‘uint32_t’ does not name a type
         84 |       const uint32_t& b = reinterpret_cast<const uint32_t&>(std::get<0>(t));
            |             ^~~~~~~~
      /tmp/pip-req-build-_r0vbtfa/_skbuild/linux-x86_64-3.11/cmake-build/CMSJet/include/Utilities.h:11:1: note: ‘uint32_t’ is defined in header ‘<cstdint>’; did you forget to ‘#include <cstdint>’?

Would it be possible to patch CMSJet on my end and tell skbuild to pick up the patched version? If not, would it be possible to fix this in CMSJet and release a patched version? :smiley:
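
For reference, the compiler note above points at a one-line fix: adding `#include <cstdint>` to Utilities.h. Below is a minimal Python sketch of applying that edit to the text of a header in a local checkout. The function name `add_missing_include` and the sample header text are made up for illustration, and it does not answer the separate question of how to make pip/skbuild pick up the patched sources.

```python
def add_missing_include(source: str, header: str = "<cstdint>") -> str:
    """Return `source` with `#include <header>` added after the last #include.

    The text is returned unchanged if the directive is already present.
    If the file has no #include at all, the directive goes on the first line.
    """
    directive = f"#include {header}"
    lines = source.splitlines()
    if any(line.strip() == directive for line in lines):
        return source
    # Index of the last existing #include line, or -1 if there is none.
    last_include = max(
        (i for i, line in enumerate(lines) if line.lstrip().startswith("#include")),
        default=-1,
    )
    lines.insert(last_include + 1, directive)
    return "\n".join(lines) + "\n"


# Made-up stand-in for the real header contents (assumption, not the actual file).
sample = "#include <tuple>\n#include <functional>\n\nconst uint32_t& b = x;\n"
patched = add_missing_include(sample)
print(patched)
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to apply repeatedly to the same checkout.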

Hi, thanks for trying it out. Yes, pip install bamboo-hep should install all of bamboo’s dependencies. Python 3 and a ROOT version newer than 6.20 should be enough. You can also check this documentation for installation.

As for the :/ suffix, I have no idea, but I will look into it. I will also check whether we can dump the RDF code.

As for the CMSJetMet library, I have no idea how to fix that issue, but we can safely comment out the part using that library in the nanozmumu.py file, i.e. lines 134-158. The error should still be reproduced.

The CMSJet error happens at the pip install bamboo step unfortunately, during compilation of the library. So I would need to be able to tell pip install bamboo to pick a different version of CMSJet, patched to add the missing include.

About the :/, it looks like something at some point wanted to say file.root:subdirectory/treename but for some reason subdirectory and treename expanded to empty strings. Just a guess. Seeing the RDF invocations might clarify what’s happening.
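
To make the guess concrete, here is a minimal Python sketch of how a path of the form file.root:subdirectory/treename collapses to a bare file.root:/ when both trailing components are empty. The helper `compose_path`, the file name `out.root` and the directory name `skims` are hypothetical; this is plain string formatting, not bamboo or ROOT code.

```python
def compose_path(filename: str, subdirectory: str, treename: str) -> str:
    """Build a 'file.root:subdirectory/treename' style path (hypothetical helper)."""
    return f"{filename}:{subdirectory}/{treename}"


# Expected case: both components are filled in.
good = compose_path("out.root", "skims", "muSkim")

# Suspected failure mode: both components expand to empty strings,
# so only the ':' and '/' separators remain after the file name.
bad = compose_path("out.root", "", "")

print(good)  # out.root:skims/muSkim
print(bad)   # out.root:/
```

The second result has the same shape as the path in the TFile::cd error message, which is why empty directory and tree names look like a plausible cause.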
