PyKeras with event-by-event weights

Archil · April 13, 2019, 12:18pm

Hello,

Can you please tell me how I can use event-by-event weights in the PyKeras training?

I found the following example of PyKeras usage, but it uses trees without event weights:
https://nbviewer.jupyter.org/github/iml-wg/tmvatutorials/blob/master/TMVA_PyMVA.ipynbachine_learning/TMVA_PyMVA.ipynb

Thanks,
Archil

oshadura · April 14, 2019, 6:18am

@kialbert, @moneta could you please help here?

kialbert · April 16, 2019, 4:37pm

Hi,

That should be the same as in “normal” TMVA training. Please see the TMVAClassification/Regression tutorial here.

In short:

from ROOT import TMVA

dataloader = TMVA.DataLoader("dataset_name")
# one weight expression applied to all classes
dataloader.SetWeightExpression("branch_name")

# or, if you have different weights for sig/bkg
dataloader.SetSignalWeightExpression("branch_name")
dataloader.SetBackgroundWeightExpression("branch_name")

Cheers,
Kim

Archil · April 16, 2019, 7:40pm

Hello,

Thanks a lot for the reply,

I have tried that method and it works perfectly well for other TMVA methods, but for the PyKeras method I see a discrepancy between the PyKeras variable in the TrainTree and the PyKeras distribution in the histogram stored in the output ROOT file.

If I do not use the event weights, the PyKeras variable from the TrainTree is consistent with the histogram in the output ROOT file.
So I thought there might be another way to handle event-by-event weights for PyKeras.

Regards,
Archil

Archil · April 17, 2019, 5:45pm

Hello,

Each time I run the training with unchanged code, I get different results:
in most cases the PyKeras output distribution for signal obtained from the TrainTree differs from the “MVA_PyKeras_Train_S” histogram in the output ROOT file. Only a few times does the histogram obtained from the TrainTree match the “MVA_PyKeras_Train_S” histogram, as it always should.

If I do not use event-by-event weights then everything is OK.

Regards,
Archil

Archil · April 22, 2019, 10:08am

It seems that the TrainTree contains results from the last epoch, while the histogram in the output ROOT file contains results from the epoch with the smallest validation loss.

If I set the parameter SaveBestOnly=false, then the TrainTree and the histograms in the ROOT file show the same results.

So, my problem was not in the usage of event-by-event weights.
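
For anyone hitting the same mismatch, here is a minimal sketch of how the SaveBestOnly=false setting could be passed when booking PyKeras. The helper names `pykeras_options` and `book_pykeras`, the file name `model.h5`, and the epoch and batch-size values are placeholders of mine, not taken from my actual setup; `factory` and `dataloader` are assumed to be an existing TMVA.Factory and TMVA.DataLoader.

```python
def pykeras_options(model_file, num_epochs=20, batch_size=32, save_best_only=False):
    """Build the colon-separated TMVA option string for the PyKeras method."""
    flags = [
        "H",
        "!V",
        f"FilenameModel={model_file}",
        f"NumEpochs={num_epochs}",
        f"BatchSize={batch_size}",
        # false: the model from the last epoch is kept, not the best-validation one
        f"SaveBestOnly={'true' if save_best_only else 'false'}",
    ]
    return ":".join(flags)


def book_pykeras(factory, dataloader, model_file):
    """Book PyKeras on an existing factory and dataloader (requires ROOT)."""
    # Imported lazily so that pykeras_options() can be used without ROOT installed.
    from ROOT import TMVA

    TMVA.PyMethodBase.PyInitialize()
    return factory.BookMethod(
        dataloader, TMVA.Types.kPyKeras, "PyKeras", pykeras_options(model_file)
    )


# Hypothetical model file name, for illustration only.
options = pykeras_options("model.h5")
print(options)
```

With SaveBestOnly left at true, the two outputs can disagree whenever the last epoch is not the one with the smallest validation loss, which would explain why the mismatch appeared in most runs but not all.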

kialbert · April 23, 2019, 11:29am

Thank you for clarifying!