ROOT Version: v6-28-02-5
Platform: Ubuntu 22.04.2 LTS
Compiler: (GCC) 12.2.0
Dear experts, I have a small question. I’m using TMVA::kDL through Python, and in the neural network training settings I use dropout regularization. The TMVA manual says: “Dropout is a regularization technique that with a certain probability sets neuron activations to zero. This probability can be set for all layers at once by giving a single floating point value or a value for each hidden layer of the network separated by ’+’ signs. Probabilities should be given by a value in the interval [0, 1].”
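To make the format from the manual concrete, here is a minimal sketch of how I understand the per-layer form: one value per hidden layer, joined by '+' signs, each within [0, 1]. The helper name `drop_config` is hypothetical and only illustrates the string format; it is not part of TMVA.

```python
def drop_config(probabilities):
    """Build a DropConfig option from per-layer dropout probabilities.

    Hypothetical helper (not a TMVA function): it only joins the values
    with '+' as the manual describes and checks the [0, 1] range.
    """
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"dropout probability {p} is outside [0, 1]")
    return "DropConfig=" + "+".join(str(p) for p in probabilities)


# One value per hidden layer: no dropout on the first, 0.3 on the next three.
option = drop_config([0.0, 0.3, 0.3, 0.3])
print(option)
```

The resulting string would go inside the TrainingStrategy part of the booking options.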
I use the code
factory.BookMethod(dataloader, ROOT.TMVA.Types.kDL, "DNN1", "!H:V:ErrorStrategy=CROSSENTROPY:VarTransform=N:WeightInitialization=XAVIER: Layout=DENSE|256|TANH,DENSE|256|TANH,DENSE|256|TANH,DENSE|256|TANH,LINEAR:TrainingStrategy=LearningRate=1e3,ConvergenceSteps=30,BatchSize=2048,TestRepetitions=1,MaxEpochs=50,Optimizer=ADAM,DropConfig=0.0++0.3+0.3+0.3")
to book the neural network method. When the network starts training, part of the terminal output is:
DEEP NEURAL NETWORK: Depth = 5 Input = ( 1, 1, 20 ) Batch size = 2048 Loss function = C
Layer 0 DENSE Layer: ( Input = 20 , Width = 256 ) Output = ( 1 , 2048 , 256 ) Activation Function = Tanh
Layer 1 DENSE Layer: ( Input = 256 , Width = 256 ) Output = ( 1 , 2048 , 256 ) Activation Function = Tanh Dropout prob. = 0.7
Layer 2 DENSE Layer: ( Input = 256 , Width = 256 ) Output = ( 1 , 2048 , 256 ) Activation Function = Tanh Dropout prob. = 0.7
Layer 3 DENSE Layer: ( Input = 256 , Width = 256 ) Output = ( 1 , 2048 , 256 ) Activation Function = Tanh Dropout prob. = 0.7
Layer 4 DENSE Layer: ( Input = 256 , Width = 1 ) Output = ( 1 , 2048 , 1 ) Activation Function = Identity
Here it states that the dropout probability is 0.7, although in the network config I’ve put 0.3 for all layers except the first one. So the question is: how should I interpret this? Is the probability that a random neuron is deactivated during training 0.3 or 0.7? I’m a bit confused by the output.
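One observation that may help narrow this down: the printed values are exactly the complements of the configured ones (1 - 0.3 = 0.7). The sketch below only checks that arithmetic against the numbers from my config and the log above. Whether the printout actually reports the probability of keeping a neuron rather than dropping it is an assumption on my part, not something I have confirmed in the TMVA source.

```python
import math

# Values from my DropConfig, one per hidden layer.
configured = [0.0, 0.3, 0.3, 0.3]
# "Dropout prob." values printed for layers 0-3 (None = nothing printed).
printed = [None, 0.7, 0.7, 0.7]


def complement_matches(configured, printed):
    """For each layer, report whether printed == 1 - configured.

    Returns None for layers where the log shows no dropout value.
    """
    results = []
    for c, p in zip(configured, printed):
        if p is None:
            results.append(None)
        else:
            results.append(math.isclose(1.0 - c, p))
    return results


print(complement_matches(configured, printed))
```

If every printed layer matches, the numbers are at least consistent with the log showing 1 minus the value I configured.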