Dear all,
I’m using a custom C++ class compiled in a shared directory from a set of pyROOT scripts.
When using ROOT 6.12/06, a simple task such as looping over a TTree and reading ~10 vector variables is significantly slowed down by mysterious calls to libtbb.so, which I’ve never seen in 6.10 and earlier.
Details: I’ve inspected the code with callgrind by instrumenting the relevant code in the loop over events, that is:
// The loop over events (the CALLGRIND_* macros come from <valgrind/callgrind.h>):
for (int i = m_firstEvent; i < last; ++i) {
    CALLGRIND_START_INSTRUMENTATION;
    m_inputReader->loadEntry(i); // this just calls TTree::GetEntry(i)
    bool res = processEvent();   // this actually does nothing within my tests
    CALLGRIND_STOP_INSTRUMENTATION;
}
With the simplest pyROOT script invoking the above C++ code, everything looks as expected, as illustrated by the top functions counted by callgrind:
Ir
--------------------------------------------------------------------------------
27,476,138 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
8,336,750 ???:inflate_fast [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
4,860,255 ???:memcpy [/lib64/libc-2.12.so]
1,721,193 ???:adler32_z [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
1,658,000 ???:TBranchElement::GetEntry(long long, int) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]
Unfortunately, when using the exact same code from a fully functional script (which involves loading other shared libraries and setting up several C++ classes), I get:
Ir
--------------------------------------------------------------------------------
53,019,087 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
8,336,750 ???:inflate_fast [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
4,858,558 ???:memcpy [/lib64/ld-2.12.so]
3,645,558 /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/externals/tbb-2018_U1/src/tbb/2018_U1/build/linux_intel64_gcc_cc6.2.0_libc2.12_kernel2.6.32_release/../../src/tbb/custom_scheduler.h:tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/tbb/2018_U1-d3621/x86_64-slc6-gcc62-opt/lib/libtbb.so.2]
3,384,000 ???:tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libImt.so]
2,108,963 /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/externals/tbb-2018_U1/src/tbb/2018_U1/build/linux_intel64_gcc_cc6.2.0_libc2.12_kernel2.6.32_release/../../src/tbb/scheduler.cpp:tbb::internal::generic_scheduler::allocate_task(unsigned long, tbb::task*, tbb::task_group_context*) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/tbb/2018_U1-d3621/x86_64-slc6-gcc62-opt/lib/libtbb.so.2]
2,006,000 ???:TTree::GetEntry(long long, int)::{lambda() [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]
1,838,000 ???:TBranchElement::GetEntry(long long, int) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]
1,837,674 ???:pthread_getspecific [/lib64/libpthread-2.12.so]
1,721,193 ???:adler32_z [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
The instruction count is now almost twice as large and includes many expensive calls from libtbb.
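To put numbers on that comparison, here is a small sketch that only redoes the arithmetic on the figures in the two callgrind summaries above. The shortened row labels are mine, and the dictionary contains only the rows that show up in the second profile but not in the first; nothing here is measured beyond what is already pasted.

```python
# Instruction counts (Ir) copied from the two callgrind summaries above.
simple_total = 27_476_138   # simplest pyROOT script
full_total = 53_019_087     # fully functional script

# Rows present only in the second profile. The labels are shortened by hand
# for readability; the counts are taken verbatim from the listing.
new_rows = {
    "custom_scheduler::local_wait_for_all (libtbb)": 3_645_558,
    "start_for<...>::execute (libImt)": 3_384_000,
    "generic_scheduler::allocate_task (libtbb)": 2_108_963,
    "TTree::GetEntry lambda (libTree)": 2_006_000,
    "pthread_getspecific (libpthread)": 1_837_674,
}

ratio = full_total / simple_total   # how much more work the full script does
extra = full_total - simple_total   # additional instructions in the loop
listed = sum(new_rows.values())     # part of the extra visible in the top rows

print(f"ratio full/simple : {ratio:.2f}")
print(f"extra instructions: {extra:,}")
print(f"listed new rows   : {listed:,} ({listed / extra:.0%} of the extra)")
for name, count in sorted(new_rows.items(), key=lambda kv: -kv[1]):
    print(f"  {count:>10,}  {name}")
```

The rows listed account for only part of the extra instructions; the rest is presumably spread over entries below the cut-off of the pasted output.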
Can anyone give a hint as to what is happening with these libtbb calls? Is it possible to get rid of them? Unfortunately, my full pyROOT script is rather complex, so any help on where to search would be appreciated!
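One place to search, sketched under an assumption: the libImt.so frame and the lambda inside TTree::GetEntry suggest that ROOT's implicit multithreading may have been switched on by something the full script loads. This is a guess, not something confirmed by the profile. The helper below (the name `disable_implicit_mt` is hypothetical) checks `ROOT.ROOT.IsImplicitMTEnabled()` and calls `ROOT.ROOT.DisableImplicitMT()` if needed; the ROOT module is passed in as an argument only so the function can be exercised without ROOT installed.

```python
def disable_implicit_mt(root):
    """Turn off ROOT's implicit multithreading if it is currently on.

    `root` is the imported ROOT module. Returns True if implicit MT was
    enabled before the call, False otherwise.
    """
    was_enabled = bool(root.ROOT.IsImplicitMTEnabled())
    if was_enabled:
        root.ROOT.DisableImplicitMT()
    return was_enabled


if __name__ == "__main__":
    try:
        import ROOT  # PyROOT; only available inside a ROOT environment
    except ImportError:
        print("PyROOT not available; nothing to check")
    else:
        # Assumption: run this after all other libraries are loaded and
        # just before the event loop, so the state matches the slow case.
        print("implicit MT was enabled:", disable_implicit_mt(ROOT))
```

If the call reports that implicit MT was enabled, rerunning callgrind after disabling it would show whether the libtbb entries disappear from the event loop.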
ROOT Version: 6.12/06
Platform: linux
Compiler: Not Provided