I’m practicing how to use a DNN in TMVA.
I’d like to use the Adam optimizer for my network, but I can’t find where or how to set it.
I can only find a way to do it through PyMVA.
I can find ‘Adam.h’, ‘Optimizer.h’, … headers in the $ROOT directory, so I think there should be a way to set the optimizer in C++ TMVA.
Where and how can I apply an optimizer to my DNN method?
If you are using the kDL method (as in TMVAClassification.C), ADAM is the default optimizer. You can change it with the Optimizer option string (e.g. Optimizer=ADAGRAD will use the ADAGRAD optimizer).
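For reference, here is a minimal sketch of how the optimizer is selected when booking a kDL method, adapted from the TMVAClassification.C macro. The factory and dataloader objects are assumed to come from the usual TMVA setup, and the layer sizes and hyperparameter values are illustrative, not a verbatim copy:

```cpp
// Network layout: dense hidden layers, linear output (sizes are illustrative)
TString layoutString("Layout=DENSE|128|RELU,DENSE|128|RELU,DENSE|1|LINEAR");

// The Optimizer key inside TrainingStrategy selects the optimizer
// (ADAM is the default; ADAGRAD is shown here as in the example above)
TString trainingString("LearningRate=1e-3,Momentum=0.9,ConvergenceSteps=10,"
                       "BatchSize=128,TestRepetitions=1,MaxEpochs=30,"
                       "WeightDecay=1e-4,Regularization=None,Optimizer=ADAGRAD");

TString dnnOptions("!H:V:ErrorStrategy=CROSSENTROPY:VarTransform=G:"
                   "WeightInitialization=XAVIER:Architecture=CPU");
dnnOptions += ":" + layoutString;
dnnOptions += ":TrainingStrategy=" + trainingString;

factory->BookMethod(dataloader, TMVA::Types::kDL, "DNN_CPU", dnnOptions);
```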
Actually, I was using kDNN, but it seems better to use kDL.
Anyway, would there be any disadvantage to using kDL rather than kDNN?
Additionally, I cannot find the SOFTMAX activation function in kDL, which exists in kDNN. I want to use a softmax function at the output layer and then use a (categorical_)crossentropy loss function.
What should I do?
Yes, it is better to use kDL, which is the new implementation supporting different types of layers.
When you define a cross-entropy loss function for the network, the softmax is applied automatically beforehand (i.e. it is included in the loss function calculation). See for example https://root.cern.ch/doc/master/Cpu_2LossFunctions_8hxx_source.html#l00077
To avoid applying the softmax function twice, you just need to set the activation of the last layer to LINEAR, as is done in the TMVAClassification.C example macro:
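The relevant part looks roughly like this (a sketch under the same assumptions as above, not a verbatim copy of the macro):

```cpp
// The last DENSE layer uses LINEAR activation: the softmax is applied
// inside the cross-entropy loss, so it must not appear in the layout as well
TString layoutString("Layout=DENSE|128|RELU,DENSE|128|RELU,DENSE|1|LINEAR");

// ErrorStrategy=CROSSENTROPY selects the cross-entropy loss,
// which internally includes the softmax
TString dnnOptions("!H:V:ErrorStrategy=CROSSENTROPY:WeightInitialization=XAVIER");
dnnOptions += ":" + layoutString;
```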