How to train a BDT with ROOT::EnableImplicitMT() on LXPLUS

Dear Experts,

I have read that ROOT can parallelise GradBoost BDT training in TMVA via ROOT::EnableImplicitMT().

I am trying to do this on LXPLUS (I also tried HTCondor, setting RequestCpus accordingly) with ROOT (both 6.20.06-x86_64-centos7-gcc8-opt and 6.22.06-x86_64-centos7-gcc8-opt) by adding ROOT::EnableImplicitMT(4) at the top of my code to request 4 threads. Based on other posts I read here, I also added a call to ROOT::GetImplicitMTPoolSize() to check whether the request is actually received, and it is. The program runs without errors.

However, despite the large input (several million training events with around 25 branches each), I don't see any notable difference in execution time. According to the performance studies in this paper, BDT training on 4 cores should be about twice as fast as on a single core.

I could not find a TMVA BDT (not NN) training example that demonstrates how this functionality works. All replies simply point to adding ROOT::EnableImplicitMT(n) to the code. Could you kindly provide a minimal setup that shows this functionality applied to TMVA BDT training?

Cheers,

Ogul

Maybe @moneta has an idea.

Hi,
Yes, this should be enough to run it multi-threaded. Can you check, for example using top, whether you are actually running in multiple threads or in single-thread mode? With multiple threads, the CPU utilisation should be above 100%.
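If watching top is inconvenient (for example in a batch job), roughly the same information can be obtained from inside the program by comparing CPU time with wall-clock time. Below is a minimal sketch, not part of ROOT: the helper name CpuToWallRatio is hypothetical, and it assumes Linux, where std::clock() sums the CPU time of all threads in the process.

```cpp
#include <cassert>
#include <chrono>
#include <ctime>
#include <functional>

// Hypothetical helper (not a ROOT function).
// Returns process CPU time divided by wall-clock time while `work` runs.
// Assumption: Linux/POSIX, where std::clock() accumulates CPU time over
// all threads of the process. A ratio clearly above 1 then means that
// more than one core was busy on average (like %CPU > 100 in top).
double CpuToWallRatio(const std::function<void()>& work) {
    assert(work && "work must be a callable");

    const std::clock_t cpuStart = std::clock();
    const auto wallStart = std::chrono::steady_clock::now();

    work();

    const std::clock_t cpuEnd = std::clock();
    const auto wallEnd = std::chrono::steady_clock::now();

    const double cpuSeconds =
        static_cast<double>(cpuEnd - cpuStart) / CLOCKS_PER_SEC;
    const double wallSeconds =
        std::chrono::duration<double>(wallEnd - wallStart).count();

    // Guard against a zero-length interval.
    return wallSeconds > 0.0 ? cpuSeconds / wallSeconds : 0.0;
}
```

For example, wrapping the training call as CpuToWallRatio([&] { factory.TrainAllMethods(); }), where factory stands for your own TMVA::Factory object, should give a value close to 1 for a single-threaded run and a noticeably larger one if several threads are really busy.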

Cheers

Lorenzo

Hi,

I have the same problem with the DNN, too. I would like to train a DNN with multi-threading, but it reported that the number of threads is 1. I tried with ROOT v6.22.06 on HTCondor.

I use the example in $ROOTSYS/tutorials.

Best,
Jie

Hi,
What is printed if you run the following from the ROOT prompt?

root [1] ROOT::EnableImplicitMT(0)
root [2] ROOT::GetThreadPoolSize()
(unsigned int) 16
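
To interpret the pool size on a batch node, it may also help to see how many CPUs the process is allowed to use at all. The following is only a sketch in plain C++, independent of ROOT. The helper names OnlineCpus and AllowedCpus are hypothetical, and the code assumes Linux with glibc. Note that it only looks at CPU affinity; a batch system may limit CPU usage by other means (such as cgroup quotas), which this check does not see.

```cpp
#include <sched.h>
#include <unistd.h>

// Hypothetical helpers, Linux-only (sched_getaffinity and CPU_COUNT are
// glibc extensions; g++ enables them by default).

// Number of logical CPUs currently online on the machine.
long OnlineCpus() {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

// Number of logical CPUs the calling process may run on according to its
// affinity mask, or -1 if the mask could not be read.
int AllowedCpus() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;
    }
    return CPU_COUNT(&mask);
}
```

Printing both values at the start of the job and comparing them with the thread pool size reported by ROOT shows whether the three numbers are consistent on that node.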