[DNN] Train/Test Error = nan when using ReLU

Hi,
I’m trying to use the TMVA DNN method as a classifier. I have no problem with the good old TANH activation function, but nearly every time I use the RELU activation I get ‘nan’ as both Test Err. and Train Err. after a few epochs.
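
For context, the booking looks roughly like this (variable names, file names, and layer sizes below are placeholders, not my actual macro):

#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Types.h"
#include "TFile.h"

void bookDNN() {
   TFile *output = TFile::Open("TMVA_DNN.root", "RECREATE");
   TMVA::Factory factory("TMVAClassification", output, "AnalysisType=Classification");
   TMVA::DataLoader *loader = new TMVA::DataLoader("dataset");
   loader->AddVariable("var1", 'F');  // placeholder variable names
   loader->AddVariable("var2", 'F');
   // ... AddSignalTree / AddBackgroundTree / PrepareTrainingAndTestTree ...
   int N = 2;  // number of input variables
   factory.BookMethod(loader, TMVA::Types::kDNN, "DNN",
       TString::Format("Layout=RELU|%d,RELU|%d,TANH|%d,LINEAR:ErrorStrategy=CROSSENTROPY",
                       3 * N, 2 * N, N));
   // ... TrainAllMethods / TestAllMethods / EvaluateAllMethods ...
}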

When using CROSSENTROPY as the error strategy, I presume this is caused by RELU returning exactly 0 or 1, values that cannot be used as arguments of the logarithms inside the cross-entropy calculation.
But I tried SUMOFSQUARES and the problem remains. In fact, I cannot train any DNN configuration with RELU without getting nan as the error.
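
To illustrate what I mean about the logarithms, here is a standalone sketch (plain C++, not the actual TMVA code):

#include <cmath>
#include <cstdio>

// Binary cross entropy for one event: -(y*log(p) + (1-y)*log(1-p)).
// If the network output p is exactly 0 or 1, one of the log terms is log(0).
double crossEntropy(double y, double p) {
   return -(y * std::log(p) + (1.0 - y) * std::log(1.0 - p));
}

int main() {
   std::printf("%f\n", crossEntropy(1.0, 0.9)); // finite
   std::printf("%f\n", crossEntropy(1.0, 0.0)); // inf, from log(0)
   std::printf("%f\n", crossEntropy(0.0, 0.0)); // nan, from 0 * log(0)
   return 0;
}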

Does anyone know a stable workaround for using RELU in TMVA?

Hi,

Can you please post a macro and a minimal data set reproducing this problem? I need to reproduce it to see whether it is a problem in the configuration.

Thank you

Lorenzo

Sure,

here is my macro

and here is a part of my dataset

Alberto

Hi,

I am sorry for my late reply. The problem is that you can’t use RELU for the last, output layer. You need a tanh or a sigmoid function there.
This is what causes the NaN. We need to add a warning in TMVA for this.
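For example, keeping RELU in the hidden layers and changing only the output activation, a layout along the lines of

“Layout=RELU|(N * 3),RELU|(N * 2),RELU|N,SIGMOID”

(the layer sizes are just illustrative) should not produce the NaN.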
If you still have problems after changing this, please let me know.

Best Regards

Lorenzo

Hi Lorenzo,
I don’t think that is the problem, since I never use anything but LINEAR for the last layer. For example, my working layout is

“Layout=TANH|(N * 3),TANH|(N * 2),TANH|(N),TANH|(N),TANH|(N-10),LINEAR”

while the following configuration does not work:

“Layout=RELU|(N * 3),RELU|(N * 2),RELU|(N),RELU|(N),TANH|(N-10),LINEAR”

Hi

I tried both configurations and they work.
Which version of ROOT and which OS are you using?
Cheers,
Omar.

Hi Omar,
I’m using ROOT 6.10/08* on Ubuntu 16.04 LTS

* with this hotfix (**)

(**) https://root-forum.cern.ch/t/trouble-compiling-root-with-cuda-support-for-tmva/25900/7

I get this when I try to run the example I gave you