Empty stack error during RDataFrame Snapshot

With multithreading enabled, RDataFrame Snapshot frequently (but not always) fails with an empty stack error when run on multiple input ROOT files, some of which may have bad clustering or cluster sizes.

The error is as follows:

Fatal: !fStack.empty() && "Trying to pop a slot from an empty stack!" violated at line 33 of `/home/conda/feedstock_root/build_artifacts/root_1562589567833/work/root-source/tree/dataframe/src/RSlotStack.cxx'

Conditions for the error to occur:

  1. Multithreading is enabled
  2. Multiple ROOT files are passed to RDataFrame
  3. Some of the ROOT files have bad clustering / cluster sizes (I am not entirely sure this is a necessary condition for triggering the error)

A minimal reproducer is:

ROOT::EnableImplicitMT(); // run the event loop on multiple threads
// The crash is intermittent, so repeat the Snapshot many times.
for (UInt_t i = 0; i < 100; i++) {
    std::vector<std::string> input_files;
    input_files.emplace_back("test1.root");
    input_files.emplace_back("test2.root");
    ROOT::RDataFrame("MonoH_Nominal", input_files).Snapshot("test", "test.root");
}

The clustering information for the ROOT files concerned is shown below.
For test1.root

Cluster Range #  Entry Start      Last Entry        Size
0                0                105994            7976

For test2.root

Cluster Range #  Entry Start      Last Entry        Size
0                0                2883              0

The related ROOT files can be viewed at https://cernbox.cern.ch/index.php/s/1pGFlHFDwp8Mcqf


ROOT Version: 6.18.00
Platform: lxplus
Compiler: gcc 7.3.0


Hi @AlkaidC,
thank you for the clear reproducer, we’ll take a look at it as soon as possible.

As you are running on lxplus, could you check whether you see this problem with a nightly build of the ROOT master branch? Here are instructions to set up the nightlies from cvmfs.

Cheers,
Enrico

P.S.
just to be clear: users should never see the error message you posted, no matter how messed up the input files are. This is an RDataFrame bug.

EDIT:
this is now https://sft.its.cern.ch/jira/browse/ROOT-10269

Yes, the problem is still there with the nightly build of ROOT. The nightly build I used was https://root.cern/download/nightly/root_v6.18.99.Linux-centos7-x86_64-gcc4.8.tar.gz (2019-08-14 03:45).

Hi @AlkaidC,
thank you for double-checking! This is clearly a bug in RDataFrame. The cause is clear (although a bit technical, see my write-up in the jira ticket) but the fix requires a bit of work. Please ping me or @Axel if this is a blocking issue for you, it might help move it up the to-do list.

Cheers,
Enrico

Hi @AlkaidC,
I just merged a patch that should resolve the crash in your case and largely mitigate the problem in RDF.

It would be great if tomorrow/next week you could try to reproduce the crash with a nightly build of ROOT again. Hopefully you will not see any issue anymore.

Cheers,
Enrico

Hi @eguiraud,

I tried to run the code using the nightly build but got the following symbol lookup error at runtime:

/cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/bin/root.exe: symbol lookup error: /cvmfs/sft-nightlies.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/lib/libImt.so: undefined symbol: _ZN3tbb10interface78internal20isolate_within_arenaERNS1_13delegate_baseEl

I set up the nightly build as follows:

source /cvmfs/sft.cern.ch/lcg/nightlies/dev3/Fri/gcc/9.2.0/x86_64-centos7/setup.sh
source /cvmfs/sft.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/bin/thisroot.sh

Is there anything that I am missing? Thanks.

Uhm, I don’t know; maybe @axel or @amadio do.

@AlkaidC You probably have a different TBB enabled with $LD_LIBRARY_PATH than the one from LCG. Can you please paste here the output of echo $LD_LIBRARY_PATH | tr ':' '\n'?
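To go one step further than printing the variable, here is a minimal sketch, assuming a POSIX-style shell, that lists every libtbb copy found in the directories of a colon-separated path list. The helper name `list_tbb_in_path` is hypothetical and not part of ROOT or LCG.

```shell
# Hypothetical helper: print every libtbb.so* found in the directories of a
# colon-separated path list (defaults to $LD_LIBRARY_PATH).
list_tbb_in_path() {
    path_list="${1:-$LD_LIBRARY_PATH}"
    old_ifs="$IFS"
    IFS=':'
    for dir in $path_list; do
        IFS="$old_ifs"               # restore normal splitting inside the loop
        [ -d "$dir" ] || continue    # skip entries that are not directories
        for lib in "$dir"/libtbb.so*; do
            # the glob stays literal when nothing matches, so test for existence
            if [ -e "$lib" ]; then
                echo "$lib"
            fi
        done
    done
    IFS="$old_ifs"
}

# Usage: list_tbb_in_path            (scans $LD_LIBRARY_PATH)
#        list_tbb_in_path "/some/dir:/another/dir"
```

If this prints nothing, no directory on the list provides TBB, and the dynamic loader falls back to its default search locations.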

@amadio The output is

/cvmfs/sft.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/lib
/cvmfs/sft.cern.ch/lcg/releases/gcc/9.2.0-afc57/x86_64-centos7/lib
/cvmfs/sft.cern.ch/lcg/releases/gcc/9.2.0-afc57/x86_64-centos7/lib64
/cvmfs/sft.cern.ch/lcg/releases/binutils/2.30-e5b21/x86_64-centos7/lib

That looks fine, what is the output of ldd $(root-config --libdir)/libImt.so?

@amadio The output is

        linux-vdso.so.1 =>  (0x00007fff17bf8000)
        libThread.so => /cvmfs/sft.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/lib/libThread.so (0x00007f99c2348000)
        libtbb.so.2 => /lib64/libtbb.so.2 (0x00007f99c2113000)
        libCore.so => /cvmfs/sft.cern.ch/lcg/nightlies/dev3/Fri/ROOT/HEAD/x86_64-centos7-gcc9-opt/lib/libCore.so (0x00007f99c1a61000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f99c1845000)
        libstdc++.so.6 => /cvmfs/sft.cern.ch/lcg/releases/gcc/9.2.0-afc57/x86_64-centos7/lib64/libstdc++.so.6 (0x00007f99c1466000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f99c1164000)
        libgcc_s.so.1 => /cvmfs/sft.cern.ch/lcg/releases/gcc/9.2.0-afc57/x86_64-centos7/lib64/libgcc_s.so.1 (0x00007f99c0f4c000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f99c0b7f000)
        /lib64/ld-linux-x86-64.so.2 (0x00007f99c27ab000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f99c097b000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f99c0773000)
        liblzma.so.5 => /lib64/liblzma.so.5 (0x00007f99c054d000)
        libz.so.1 => /lib64/libz.so.1 (0x00007f99c0337000)

This is likely the problem: libtbb.so.2 is being picked up from the system (/lib64/libtbb.so.2) rather than from LCG. What is the version of TBB in the system? If it’s older than 2018, then it doesn’t have features needed by ROOT.
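A quick way to spot this kind of mismatch is to filter the ldd output for the TBB line. The following is a minimal sketch, assuming a POSIX-style shell with awk available; the helper name `resolved_tbb` is hypothetical.

```shell
# Hypothetical helper: read `ldd` output on stdin and print where libtbb
# was resolved from, or "not found" if the loader could not locate it.
resolved_tbb() {
    awk '$1 ~ /^libtbb\.so/ && $2 == "=>" {
        if ($3 == "not") print "not found"; else print $3
    }'
}

# Usage (requires a ROOT environment to be set up):
#   ldd "$(root-config --libdir)/libImt.so" | resolved_tbb
```

A result under /lib64 or /usr/lib64 rather than under the ROOT or LCG installation tree suggests that the system TBB is being loaded.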

I recommend trying with my nightly build of ROOT in CVMFS. Please run

/cvmfs/sft.cern.ch/lcg/contrib/gentoo/startprefix

which will start a new shell where you’ll have a ROOT nightly build in your $PATH, then try to re-run your code to test. Cheers,

Thanks. Everything works fine now. It seems that Python 3.7 is not supported, so I need to switch back to Python 3.6.9.

Hi,
glad to hear that!

About the CVMFS nightlies, turns out there is a bit of an issue – sorry about that. One workaround to make them work is to first source the last LCG release, then the nightly, and it works:

$ source /cvmfs/sft.cern.ch/lcg/views/LCG_96/x86_64-centos7-gcc8-opt/setup.sh
$ source /cvmfs/sft.cern.ch/lcg/nightlies/dev3/Sun/ROOT/HEAD/x86_64-centos7-gcc8-opt/bin/thisroot.sh

We are working on fixing this shortcoming.
Cheers,
Enrico

Python 3.7 is supported by ROOT, but ROOT has to be compiled against a specific version of Python. My ROOT nightly is compiled against Python 3.6.
