I’m trying to use the TMVA DNN method as a classifier. I have no problem using the good old TANH activation function, but nearly every time I try the RELU activation I get ‘nan’ as both Test Err. and Train Err. after a few epochs.
When using CROSSENTROPY as the error strategy, I presume this is caused by RELU returning 0 or 1, neither of which can be used as an argument for the logarithms inside the cross-entropy calculation.
But I tried SUMOFSQUARES and the problem remains. In fact, I cannot train any DNN configuration with RELU without getting nan as the error.
Does anyone know a stable workaround for using RELU in TMVA?
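The cross-entropy argument above can be illustrated with a minimal standalone C++ sketch. It is not TMVA's implementation: the function names relu, naiveCrossEntropy and clampedCrossEntropy are hypothetical and chosen only for this example. It assumes the usual definition of ReLU as max(0, x), which is unbounded above rather than limited to 0 or 1, and it shows that a binary cross entropy written without any clamping yields nan or inf once the predicted probability is exactly 0 or 1. It does not explain why SUMOFSQUARES would also give nan.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// ReLU as usually defined: max(0, x). Unbounded above, not restricted to {0, 1}.
// (Hypothetical helper, not a TMVA function.)
double relu(double x) { return std::max(0.0, x); }

// Naive binary cross entropy for one event: target y in {0, 1},
// predicted probability p. No clamping, so log(0) is evaluated as-is.
double naiveCrossEntropy(double y, double p) {
    return -(y * std::log(p) + (1.0 - y) * std::log(1.0 - p));
}

// Same loss with p clamped to [eps, 1 - eps] so both logarithms stay finite.
// The value of eps is an arbitrary choice for this sketch.
double clampedCrossEntropy(double y, double p, double eps = 1e-12) {
    const double q = std::min(std::max(p, eps), 1.0 - eps);
    return -(y * std::log(q) + (1.0 - y) * std::log(1.0 - q));
}
```

Calling naiveCrossEntropy(1.0, 1.0) evaluates 0 * log(0), which is nan under IEEE arithmetic, while naiveCrossEntropy(1.0, 0.0) evaluates to inf. The clamped variant stays finite in both cases. Whether this is what happens inside TMVA depends on how the output layer and the loss are implemented there.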
Can you please post a macro and a minimal data set that reproduce this problem? I need to reproduce it to see whether it is a problem in the configuration.
Here is my macro
and here is a part of my dataset.
I am sorry for my late reply. The problem is that you can’t use RELU for the last (output) layer. You need a TANH or a sigmoid function.
This is what causes the NaN. We need to add a warning in TMVA for this.
If you still have problems after changing this, please let me know.
I don’t think that is the problem, since I never use anything but LINEAR for the last layer. For example, my working layout is
“Layout=TANH|(N * 3),TANH|(N * 2),TANH|(N),TANH|(N),TANH|(N-10),LINEAR”
while the following configuration does not work:
“Layout=RELU|(N * 3),RELU|(N * 2),RELU|(N),RELU|(N),TANH|(N-10),LINEAR”
I tried both configurations and they work for me.
Which version of ROOT and which OS are you using?
I’m using ROOT 6.10/08* on Ubuntu 16.04 LTS
*with this(**) hotfix
I get this when I try to run the example I gave you