TMVA application

Dear experts,

  • I ran the TMVA classification and, as you can see in (1), the background peaks at 0.1 with a distribution roughly in [0, 0.8].
  • I then ran the TMVA application on the same background sample and filled the DNN variable. As you can see in (2), the background now peaks at 0.8 and only starts at 0.78.
  • (1) is normalized, but shouldn't I get the same kind of distribution in (1) and (2)? At least (2) should be shifted towards very low values, right?
  • To get the DNN response for each event I used dnn_response = reader->EvaluateMVA("DNN"); (my application loop follows the usual pattern, see the sketch below the plots).
    Regards

(1) [plot: DNN response of the background from the classification, peaking near 0.1]

(2) [plot: DNN response of the background from the application, peaking near 0.8]
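
For reference, my application loop follows the usual TMVA Reader pattern, roughly like this (the variable names, the tree name "nominal" and the weight-file path are only placeholders, not the exact ones from my macro):

#include "TFile.h"
#include "TTree.h"
#include "TH1F.h"
#include "TMVA/Reader.h"

void ApplyDNN() {
   // declare the Reader and register the inputs with the same
   // names/expressions as in the training (placeholder names here)
   TMVA::Reader reader("!Color:!Silent");
   Float_t var1 = 0, var2 = 0;
   reader.AddVariable("var1", &var1);
   reader.AddVariable("var2", &var2);

   // book the trained DNN from the weight file written by the Factory
   reader.BookMVA("DNN", "dataset/weights/TMVAClassification_DNN.weights.xml");

   // loop over the background sample and fill the response histogram
   TFile input("ttw_mc16a.root");
   TTree *tree = (TTree*)input.Get("nominal");
   tree->SetBranchAddress("var1", &var1);
   tree->SetBranchAddress("var2", &var2);

   TH1F h("h_dnn", "DNN response;DNN output;events", 50, 0., 1.);
   for (Long64_t i = 0; i < tree->GetEntries(); ++i) {
      tree->GetEntry(i);
      Double_t dnn_response = reader.EvaluateMVA("DNN");   // per-event response
      h.Fill(dnn_response);
   }
   h.DrawCopy();
}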

Dear experts,
any idea?
Regards

Hi,

Which ROOT version are you using? I could not reproduce this with the latest one. Also, which architecture are you using for the DNN, the CPU or the GPU one?

Lorenzo

Dear Lorenzo,

  • I run with the GPU architecture, with ROOT version 6.15
  • for the classification I used the file (1)
  • as input files I used (2)
  • for the application I used (3)
  • do you see what is wrong?
    Regards

(1)
http://calpas.web.cern.ch/calpas/TMVAClassification.C

(2)
signal
http://calpas.web.cern.ch/calpas/tth_dilep_mc16a.root
http://calpas.web.cern.ch/calpas/tth_dilep_mc16c.root
background
http://calpas.web.cern.ch/calpas/ttw_mc16a.root
http://calpas.web.cern.ch/calpas/ttw_mc16c.root

(3)
http://calpas.web.cern.ch/calpas/TMVAClassificationApplication.C

Hi,

Thank you for posting the files. I will investigate it and let you know

Dear Moneta,
did you see what was wrong?
Regards

Hi,

I could reproduce your problem with your code, but I could not find the cause.
Using a different data set works fine.
It cannot be the reading of the model, because this is tested in the first macro when calling Factory::EvaluateAllMethods.
I will continue investigating and I will let you know

Lorenzo

Hi,

I have found that in your Reader application you are not using the same input variables as in the training. You apply a log() to several of the input features in the training, but in your application macro you don't apply the log.
For example you should do:

r_lep_Pt_0      = log(lep_Pt_0);

As shown in the tutorial tmva/TMVAClassificationApplication.C, you need to apply the same expression to the variables.
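
Schematically, assuming the training declared the variable with an expression like log(lep_Pt_0), the Reader side should look something like this (only a sketch, the surrounding names are illustrative):

Float_t r_lep_Pt_0;                                    // value handed to the Reader
reader->AddVariable("log(lep_Pt_0)", &r_lep_Pt_0);     // same expression as in training

Float_t lep_Pt_0;                                      // raw value read from the tree
tree->SetBranchAddress("lep_Pt_0", &lep_Pt_0);

// in the event loop, apply the same transformation before evaluating
r_lep_Pt_0 = log(lep_Pt_0);
Double_t dnn = reader->EvaluateMVA("DNN");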

You need to make sure the Reader receives the same type of input data as in the training; I would suggest you check that.
Also, it is not clear to me what exactly you are doing with the jet_PT variables. It seems that in some files some array elements are not defined. You need to have the same inputs for all files, otherwise this can cause problems.

Best Regards

Lorenzo


Dear Lorenzo,

  • thank you for debugging.
  • please note that, contrary to what is indicated in the Reader methods, when I pass an "int" variable to the reader I get the error message shown in (1). So I declared everything as float (see the sketch below).
  • I set the log variables as you asked, but the evaluation reader->EvaluateMVA("DNN"); always returns the same value, 0.836195. Do you see why? Is that the right way to evaluate the DNN?
  • I copied the modified code here: http://calpas.web.cern.ch/calpas/TMVAClassificationApplication.C
    Regards

(1)
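
For the integer inputs I now use the usual float workaround, roughly like this (the variable name nJets is only an example):

Int_t   nJets_i;                              // raw integer branch
Float_t nJets_f;                              // float handed to the Reader
reader->AddVariable("nJets", &nJets_f);
tree->SetBranchAddress("nJets", &nJets_i);

// in the event loop, before calling EvaluateMVA
nJets_f = nJets_i;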

Dear experts,
any idea?
Regards


Sorry for the late reply! Are you still having this issue?
Getting an output of a DNN that is always the same means that the activation functions are saturated. This can happen when the input variable distributions are very different from what was used for training.
You need to make sure that these distributions are similar.
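
A quick way to check is to draw the expression used in training directly from one of your files and overlay it with the values your application macro actually assigns to the Reader variable, for example (tree name and binning are only illustrative):

TFile f("ttw_mc16a.root");
TTree *t = (TTree*)f.Get("nominal");                 // tree name is illustrative
// the input distribution as the training saw it
t->Draw("log(lep_Pt_0)>>h_expected(50,0.,15.)");
// then fill a second histogram with the value you pass to the Reader
// inside your application loop and draw it with the "same" option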

Lorenzo