Does TMVA support multi-core or cluster mode?

Dear Expert:

Training a TMVA DNN takes too long. Does TMVA support multi-core or cluster mode?

Cheers, Gang

Hi,

There is a multi-threaded implementation of the DNN, which you select with the configuration option "Architecture=CPU". If you have a GPU available, you can also run the DNN on it with "Architecture=GPU"; this should be the fastest option.

Running the DNN on a cluster is unfortunately not possible currently.
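
As a minimal sketch of where the architecture option goes: the helper below, MakeDnnOptions, is hypothetical (it is not part of TMVA) and only assembles the booking string from a layout string, a training-strategy string, and the architecture name. The "!H:V" prefix and the BookMethod call in the trailing comment are assumptions modelled on the usual TMVAClassification macro and may need adjusting for your ROOT version.

```cpp
#include <string>

// Hypothetical helper, not a TMVA function: joins the pieces of a DNN
// booking string with ':' and appends the requested architecture.
// layoutString and trainingStrategyString are expected to already carry
// their "Layout=" and "TrainingStrategy=" prefixes.
std::string MakeDnnOptions(const std::string& architecture,
                           const std::string& layoutString,
                           const std::string& trainingStrategyString)
{
    std::string options = "!H:V";              // assumed general options
    options += ":" + layoutString;
    options += ":" + trainingStrategyString;
    options += ":Architecture=" + architecture; // "CPU" or "GPU"
    return options;
}

// Assumed usage inside a ROOT macro (requires ROOT, not compiled here):
//   std::string cpuOptions = MakeDnnOptions("CPU", layout, strategy);
//   factory->BookMethod(dataloader, TMVA::Types::kDNN, "DNN_CPU",
//                       cpuOptions.c_str());
```

Keeping the architecture as the only varying argument makes it easy to book the same network once for CPU and once for GPU and compare training times.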

Cheers,
Kim

Hi, Kim:

I am using DNN_CPU, which has “Multithreading=True” as the default for each layer, but when I run the training, top shows that only about one core is being used, with “%CPU” reaching at most 130%. Does TMVA run its threads on only one core?

Cheers, Gang

Using multithreading should utilise all available cores for the computationally expensive operations.

What is your configuration? If your network or input data is too small, the serial part of the calculation will dominate. You can try increasing the batch size and check whether that improves things.
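
To put a rough number on "the serial part will dominate", here is a small sketch using Amdahl's law, a generic model rather than anything measured from TMVA. The function name AmdahlSpeedup and the example fractions are illustrative assumptions, and the %CPU reported by top is not the same quantity as speedup, so treat this only as an order-of-magnitude guide.

```cpp
#include <stdexcept>

// Amdahl's law: ideal speedup on `cores` cores when a fraction
// `parallelFraction` (0..1) of the run time can be parallelised
// and the rest stays serial.
double AmdahlSpeedup(double parallelFraction, unsigned cores)
{
    if (parallelFraction < 0.0 || parallelFraction > 1.0 || cores == 0)
        throw std::invalid_argument("need 0 <= fraction <= 1 and cores >= 1");
    return 1.0 / ((1.0 - parallelFraction) + parallelFraction / cores);
}

// Example: if only half of the work is parallel, 8 cores give
// AmdahlSpeedup(0.5, 8) = 1 / (0.5 + 0.0625), i.e. less than 2x.
```

Larger batches and wider layers put more of the run time into the matrix operations that are parallelised, which raises the parallel fraction in this model.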

Cheers,
Kim

It still uses only about 200% CPU (roughly two cores), although the machine has 8 cores. Here is the configuration:

Use["DNN_CPU"] = 1; // Multi-core accelerated DNN.
dataloader->AddVariable( "var1 := L_RRC_ConnReq_Att - L_RRC_ConnReq_Succ", 'F' );
dataloader->AddVariable( "var2 := L_RRC_ConnReq_Att + L_RRC_ConnReq_Succ", 'F' );
dataloader->AddVariable( "var3 := L_RRC_ConnReq_Att * L_RRC_ConnReq_Succ", 'F' );
dataloader->AddVariable( "var4 := L_RRC_ConnReq_Att / L_RRC_ConnReq_Succ", 'F' );

dataloader->PrepareTrainingAndTestTree( mycuts, mycutb,
    "nTrain_Signal=10000:nTrain_Background=10000:SplitMode=Random:NormMode=NumEvents:!V" );

if (Use["DNN_CPU"] or Use["DNN_GPU"]) {
  // General layout.
  TString layoutString ("Layout=TANH|128,TANH|128,TANH|128,LINEAR");
  //TString layoutString ("Layout=TANH|128,TANH|128,TANH|128,LINEAR");

  // Training strategies.
  TString training0("LearningRate=1e-1,Momentum=0.9,Repetitions=1,"
                    "ConvergenceSteps=20,BatchSize=2560,TestRepetitions=10,"
                    "WeightDecay=1e-4,Regularization=L2,"
                    "DropConfig=0.0+0.5+0.5+0.5, Multithreading=True");
  TString training1("LearningRate=1e-2,Momentum=0.9,Repetitions=1,"
                    "ConvergenceSteps=20,BatchSize=2560,TestRepetitions=10,"
                    "WeightDecay=1e-4,Regularization=L2,"
                    "DropConfig=0.0+0.0+0.0+0.0, Multithreading=True");
  TString training2("LearningRate=1e-3,Momentum=0.0,Repetitions=1,"
                    "ConvergenceSteps=20,BatchSize=2560,TestRepetitions=10,"
                    "WeightDecay=1e-4,Regularization=L2,"
                    "DropConfig=0.0+0.0+0.0+0.0, Multithreading=True");
  TString trainingStrategyString ("TrainingStrategy=");
  trainingStrategyString += training0 + "|" + training1 + "|" + training2;

Any idea?

Cheers, Gang

Sorry I haven’t been able to get back to you yet. Will do asap.

Cheers,
Kim