pyROOT : code greatly slowed down by calls from libtbb.so & multithreading related stuff

Dear all,

I’m using a custom C++ class, compiled into a shared library, from a set of pyROOT scripts.
When using ROOT 6.12/06, a simple task such as looping over a TTree and reading ~10 vector variables is significantly slowed down by mysterious calls to libtbb.so, which I’ve never seen in 6.10 and earlier.

Details : I’ve inspected the code with callgrind, instrumenting only the relevant code in the loop over events.
That is :

    // The loop over events :
    for (int i = m_firstEvent; i < last; ++i) {
      CALLGRIND_START_INSTRUMENTATION;
      m_inputReader->loadEntry(i); // this just calls TTree::GetEntry(i)
      bool res = processEvent();   // this actually does nothing within my tests
      CALLGRIND_STOP_INSTRUMENTATION;
    }

With the simplest pyROOT script invoking the above C++ code, everything looks as expected, as illustrated by the top functions counted by callgrind :

        Ir 
--------------------------------------------------------------------------------
27,476,138  PROGRAM TOTALS

--------------------------------------------------------------------------------
       Ir  file:function
--------------------------------------------------------------------------------
8,336,750  ???:inflate_fast [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
4,860,255  ???:memcpy [/lib64/libc-2.12.so]
1,721,193  ???:adler32_z [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
1,658,000  ???:TBranchElement::GetEntry(long long, int) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]

Unfortunately, when using the exact same code from a fully functional script (which involves loading other shared libraries and setting up several C++ classes), I get :

         Ir 
--------------------------------------------------------------------------------
53,019,087  PROGRAM TOTALS

--------------------------------------------------------------------------------
       Ir  file:function
--------------------------------------------------------------------------------
8,336,750  ???:inflate_fast [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]
4,858,558  ???:memcpy [/lib64/ld-2.12.so]
3,645,558  /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/externals/tbb-2018_U1/src/tbb/2018_U1/build/linux_intel64_gcc_cc6.2.0_libc2.12_kernel2.6.32_release/../../src/tbb/custom_scheduler.h:tbb::internal::custom_scheduler<tbb::internal::IntelSchedulerTraits>::local_wait_for_all(tbb::task&, tbb::task*) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/tbb/2018_U1-d3621/x86_64-slc6-gcc62-opt/lib/libtbb.so.2]
3,384,000  ???:tbb::interface9::internal::start_for<tbb::blocked_range<unsigned int>, tbb::internal::parallel_for_body<std::function<void (unsigned int)>, unsigned int>, tbb::auto_partitioner const>::execute() [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libImt.so]
2,108,963  /mnt/build/jenkins/workspace/lcg_release_tar/BUILDTYPE/Release/COMPILER/gcc62binutils/LABEL/slc6/build/externals/tbb-2018_U1/src/tbb/2018_U1/build/linux_intel64_gcc_cc6.2.0_libc2.12_kernel2.6.32_release/../../src/tbb/scheduler.cpp:tbb::internal::generic_scheduler::allocate_task(unsigned long, tbb::task*, tbb::task_group_context*) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/tbb/2018_U1-d3621/x86_64-slc6-gcc62-opt/lib/libtbb.so.2]
2,006,000  ???:TTree::GetEntry(long long, int)::{lambda() [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]
1,838,000  ???:TBranchElement::GetEntry(long long, int) [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/ROOT/6.12.06-0f687/x86_64-slc6-gcc62-opt/lib/libTree.so]
1,837,674  ???:pthread_getspecific [/lib64/libpthread-2.12.so]
1,721,193  ???:adler32_z [/cvmfs/atlas.cern.ch/repo/sw/software/21.2/sw/lcg/releases/zlib/1.2.11-da225/x86_64-slc6-gcc62-opt/lib/libz.so.1.2.11]

The instruction count has now almost doubled and includes many expensive calls from libtbb …

Can anyone give a hint as to what is happening with these libtbb calls ? Is it possible to get rid of them ? Unfortunately my full pyROOT script is rather complex, so any help on where to search would be appreciated !


ROOT Version: 6.12/06
Platform: linux
Compiler: Not Provided


Hi,

are you activating implicit multi threading? Can we reproduce this behaviour in a standalone program?

Cheers,
D

Hello,

I guess you mean calling ROOT::EnableImplicitMT ?
I’m not making this call. Can MT be activated some other way ?

I’ve tested a simple executable using the same C++ object, but this does not trigger the libtbb calls…

I’m trying to re-add one by one all the ingredients of my full script, but it can take some time :(

Hi,

without that call, the library has no effect. It would be great to have a reproducer to figure out what exactly is going on.

Danilo

Hello,

I think I now understand.
My full script uses the argparse module to read command-line arguments. It defines a ‘-t’ option, and when the script is invoked with this option, it appears in sys.argv .
It turns out that import ROOT also reacts to what is in sys.argv and interprets the options the same way as when they are given to the root command.
Since -t means enabling multithreading, I think that explains the calls to libtbb . As expected, there are no such calls when -t is omitted.

So the solution was to manually remove the -t entry from sys.argv before the import ROOT statement.
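
For reference, a minimal sketch of that workaround, assuming a script with its own ‘-t’ flag. The names parse_args and without_flags and the positional "inputs" argument are hypothetical, chosen only for illustration; the idea is simply to read the script's options first and drop the conflicting entry from sys.argv before ROOT is imported.

```python
import argparse


def parse_args(argv):
    """Parse this script's own options (argv[0] is the program name)."""
    parser = argparse.ArgumentParser(description="hypothetical analysis script")
    # '-t' here is a script-specific flag that happens to collide with root's -t.
    parser.add_argument("-t", action="store_true", help="script-specific flag")
    parser.add_argument("inputs", nargs="*", help="input files")
    return parser.parse_args(argv[1:])


def without_flags(argv, flags=("-t",)):
    """Return a copy of argv with the conflicting flags removed."""
    return [arg for arg in argv if arg not in flags]


# Intended use at the top of a script (ROOT is imported only after the cleanup):
#   import sys
#   args = parse_args(sys.argv)
#   sys.argv = without_flags(sys.argv)
#   import ROOT
```

Parsing before the cleanup matters: once ‘-t’ is removed from sys.argv, the script itself can no longer see it, so the parsed result has to be kept around.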

Finally, I’m surprised by this behaviour. Is it common in Python for the result of importing a module to depend on what’s in sys.argv ?

Dear @pad

Can you try by setting:

ROOT.PyConfig.IgnoreCommandLineOptions = True 

This should make PyROOT ignore the command line args and you will not need to remove the -t yourself.

Cheers,
Enric

Hello Enric,

Thanks for your suggestion but I must be missing something. Doing as follows doesn’t help :

>>> sys.argv[-1] # -t was passed at the command line
'-t'
>>> import ROOT
>>> ROOT.gInterpreter.ProcessLine("ROOT::IsImplicitMTEnabled()")
(bool) true
1L
>>> ROOT.PyConfig.IgnoreCommandLineOptions = True 
>>> ROOT.gInterpreter.ProcessLine("ROOT::IsImplicitMTEnabled()")
(bool) true
 # --> multithreading still enabled

Since multithreading is enabled at import time, how would setting an option after the import help ?

Ah ! This works :

>>> sys.argv[-1]
'-t'
>>> from ROOT import PyConfig
>>> PyConfig.IgnoreCommandLineOptions = True 
>>> import ROOT
>>> ROOT.gInterpreter.ProcessLine("ROOT::IsImplicitMTEnabled()")
(bool) false
0L

Thanks !
But again, I do find this way of working quite counter-intuitive : why would import ROOT depend on sys.argv ? And it seems strange that to disable this I have to import one specific item, PyConfig, from within the actual ROOT namespace…
(I’m happy to submit a bug/feature ticket if that’s something which can be discussed)
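
A minimal sketch of how the working order above might be wrapped in a script, assuming the option has to be set before anything else from ROOT is used, as the two sessions suggest. The helper names import_root_ignoring_argv and implicit_mt_enabled are hypothetical, and the ImportError fallback is only there so the sketch also runs where ROOT is not installed.

```python
def import_root_ignoring_argv():
    """Import ROOT with its command-line handling switched off.

    Returns the ROOT module, or None when ROOT is not installed.
    """
    try:
        from ROOT import PyConfig
    except ImportError:
        return None
    # Set the option first, then do the full import (same order as above).
    PyConfig.IgnoreCommandLineOptions = True
    import ROOT
    return ROOT


def implicit_mt_enabled(root_module):
    """Repeat the check from the interactive session above."""
    result = root_module.gInterpreter.ProcessLine("ROOT::IsImplicitMTEnabled()")
    return bool(result)


ROOT = import_root_ignoring_argv()
```

With ROOT available, implicit_mt_enabled(ROOT) can be called right after the import to confirm that a ‘-t’ in sys.argv no longer switches multithreading on.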

P-A

Hi @pad
It is not a bug, it is a feature :) PyROOT does some treatment of the command line arguments by default and hands them over to a TApplication. This behaviour can be switched off, as you saw.
We are now in the process of creating a new and modern PyROOT, and whether to keep this behaviour is one of the points we need to discuss. But for now, please use that option and let us know if you have further problems!
Enric

Hi Enric,

Thanks for your answer ! I understand this cannot be changed easily for the next release.
But if this is going to be discussed, please consider the case raised in this thread.
In the current situation

  • import ROOT has side effects which occur when a python program is passed specific arguments (I don’t think a pyroot program writer should pay attention to have her program arguments not be in conflict with the root command line arguments)
  • the way to avoid these side effects is rather convoluted
  • in my particular case, using -t in my script silently enabled implicit multithreading in a single-threaded program, resulting in a ~2x slower execution…

P-A

Hi @pad

Thank you for your valuable feedback. Indeed your issues with the arguments have been reported before by other users. One obvious solution would be to turn the arg parsing off by default, and leave the option to activate it. Of course this would break backwards compatibility with existing scripts, so we need to figure out when and how (and if) we do this.

Enric
