EvaluateMulticlass method not consistent with training output (repeating floating point values)

Dear tmva users,

I have trained a deep neural network on a multi-classification problem using TMVA's Keras interface (TensorFlow backend). The training/testing was all successful, and the output distributions of the input variables and the individual output node responses all seem reasonable.

I have then written a simple script to evaluate the network on the same data set that was used in the training, as a sanity check. I was expecting that, for a given sample of events, I would be able to reproduce the network response from the training; however, this is not the case. Not only do I see different values, but I also noticed many events with identical floating point values for the evaluated network response, even though the input variables appear to be changing.
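
To quantify the repetition, I count exact duplicates among the evaluated responses with a small helper (a sketch with made-up response values, not my actual script):

```python
from collections import Counter

def count_repeats(responses):
    """Return the response values that occur more than once, with their counts."""
    counts = Counter(responses)
    return {value: n for value, n in counts.items() if n > 1}

# With genuinely varying inputs one would expect essentially no exact repeats:
print(count_repeats([0.12, 0.34, 0.12, 0.56, 0.12]))  # {0.12: 3}
```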

I have attached a simplified working version of the network application script that saves an output histogram of the responses for the signal node. I have also attached the input .root file and the weights.xml I am using. Any guidance would be very much appreciated and please let me know if you need any more information.

Cheers,

Josh

Hi,

There is a problem downloading the files from the link you have provided.
Can you please fix it?
You can attach the scripts directly; the data file is perhaps better provided as an external link, given its large size.

Thank you

Lorenzo

Hi Lorenzo,

You should now be able to get all the files from the dropbox link I have added.

Cheers,

Josh

Hi Josh

I could download your file. However, the XML file is not enough for PyKeras. The XML contains just a link to the real trained model, which should be in
MultiClass_DNN/weights/TrainedModel_DNN.h5

Can you please also upload this file?

Thank you

Lorenzo

Hi,

Terribly sorry, I’ve now updated the repo to include the model.

Cheers,

Josh

Hi,

After some changes to your macro (e.g. a different input file name), I could run your Python application based on the TMVA::Reader.
It is true, I can see some floating point values repeating. I will investigate the reason.
I may need the help of @swunsch, the author of PyKeras, when he is back next week.

Cheers

Lorenzo

Hi,

Has there been any progress on this @swunsch @moneta ?

Cheers,

Josh

Hi,

Sorry for the late reply. I’ve just examined the problem. First, I checked that different inputs consistently lead to the discretized outputs. Therefore, it is either a problem of the preprocessing or of the Keras wrapping.

As far as I’ve seen, you’ve applied the variable transformations D and G, which are highly nonlinear. Could you check in the TMVA training output that the output distributions of the preprocessing are not discrete like the Keras output?
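
For intuition, here is a rough numpy sketch of what these two transformations do (my own simplified illustration, not TMVA's actual implementation):

```python
import numpy as np
from statistics import NormalDist

def gaussianize(x):
    """Rank-based Gaussianization (the idea behind the 'G' transform):
    map each value to its empirical CDF position, then through the inverse
    normal CDF. Highly nonlinear, and ties in x collapse to the same output."""
    ranks = np.argsort(np.argsort(x))
    cdf = (ranks + 0.5) / len(x)
    return np.array([NormalDist().inv_cdf(p) for p in cdf])

def decorrelate(x):
    """Decorrelation (the idea behind 'D'): whiten the variables with the
    inverse square root of the covariance matrix."""
    cov = np.cov(x, rowvar=False)
    evals, evecs = np.linalg.eigh(cov)
    whitening = evecs @ np.diag(evals ** -0.5) @ evecs.T
    return (x - x.mean(axis=0)) @ whitening
```

If the preprocessing output were discrete, a histogram of the gaussianized variables would show spikes instead of a smooth normal shape.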

Cheers
Stefan

Hi Stefan,

@swunsch I have added the output .root from the training to the dropbox repository linked above. I couldn’t see anything unusual.

Cheers,

Josh

Hi @swunsch @moneta ,

Has there been any progress with this?

Cheers,
Josh

@swunsch @moneta

Any progress?

Hi,

Sorry to keep bothering you about this @swunsch but I could really do with some further information on this.

Cheers,

Josh

I’ve investigated further with the examples in root/tutorials/tmva/keras, but I couldn’t reproduce such behaviour. It does not seem to be any combination of the preprocessing, so it has to be something in the Keras part itself. I’ll now take PyKeras apart; I’ll probably find something.

It seems that you’ve removed the application script from the dropbox folder. Can you re-upload it? And can you attach the training script as well?

Hi @swunsch,

Sorry I’m not sure how that happened. I’ve re-uploaded the files.

Thanks again for looking into this,

Josh

I haven’t seen the TMVA output in the dropbox folder, but you said that the evaluation does not show screwed-up distributions of the transformations. However, I’ve replaced your trained model with this:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(3, activation="linear", input_shape=(8,)))
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.save("test.h5")

This model should behave linearly, but I still see spikes in the histogram of the response. I see no explanation other than the input transformation behaving strangely.
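
To see why this points at the inputs: a linear map sends distinct generic inputs to distinct outputs, so many identical outputs imply that the inputs reaching the model are already discretized. A quick numpy check (shapes chosen to match the test model; values are random, not your data):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 3))   # 8 inputs -> 3 outputs, like Dense(3, input_shape=(8,))
b = rng.normal(size=3)

x = rng.normal(size=(1000, 8))   # genuinely varying inputs
y = x @ W + b                    # linear response

# Distinct continuous inputs give distinct outputs (no spikes):
print(len(np.unique(y[:, 0])))   # 1000
```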

As mentioned before, have you tried to train without any variable transformation? This could help greatly to debug the problem.

Hi,

Thanks for performing this check. The input variables and transformation plots from the training should be there again.

I’ve tried setting VarTransform=None in the training. I then use the same application script as before and print out the following value:

print 'MVA vals : ', reader.EvaluateMulticlass('DNN')[0]

and I see the following error:

MVA vals : <ERROR>                         : 1-th variable of the event is NaN,
<ERROR>                         :  regression values might evaluate to .. what do I know.
<ERROR>                         :  sorry this warning is all I can do, please fix or remove this event.

This error did not appear when I used VarTransform=D,G. I also tried VarTransform=Norm and just VarTransform=D, and the NaN error appears for these as well. Which variable it reports as NaN changes from event to event.

I’ve tried checking for NaN values in the event loop but found none, and I’ve also checked both the non-transformed and the transformed variable distributions; they didn’t look odd.
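
For reference, the NaN check I'm doing in the event loop is essentially this (variable names are made up for illustration):

```python
import math

def find_nans(event, varnames):
    """Return the names of variables that are NaN in this event.
    Note: NaN != NaN, so a check like `value == float('nan')` is always
    False; math.isnan (or `value != value`) is needed."""
    return [name for name, value in zip(varnames, event) if math.isnan(value)]

print(find_nans([1.0, float('nan'), 3.0], ['pt', 'eta', 'phi']))  # ['eta']
```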

Cheers,

Josh

Alright, so it seems that it works only by chance with the transformation D,G, and that your setup produces an invalid transformation, which causes all the problems. Is it possible for you to provide the training scripts (and data) as well?

Actually, your error is very strange, because the factory destroys all methods and recreates them from the configs during testing and evaluation, which produces the plots in the TMVA ROOT output.

I have provided the training and updated application scripts in the Dropbox area, along with samples. At the top of the scripts are example command lines used to run them. The environment I set up from here:

Hi @swunsch,

Did you manage to get the training code working? I realise you are replying rapidly to me on another thread, for which I am grateful.

Cheers again,

Josh