Converting TMVA PyKeras output with lwtnn - variable normalization details

Hi TMVA Experts,

I have a model trained using TMVA PyKeras that I need to apply within an analysis framework where TensorFlow/Keras are unavailable. Therefore, I’m trying to convert the model to a standard JSON format with the lwtnn keras2json converter. One of the things I need to do is make a variables.json file (templates are shown here: https://github.com/lwtnn/lwtnn/wiki/Keras-Converter) listing the input variables together with the parameters used to normalize them in the training, i.e. the offset and scale values. I think it should be possible to get these values from the TMVA outputs (class.C and weights.xml files), but I cannot figure out how. What would be the easiest way to do this?

Cheers,
Janina

@moneta will help you as soon as we are back from vacation


Hi,
The XML file should list all the variable names, their transformations and the transformation parameters.
However, TMVA does not directly support the simple re-scaling transformation used by lwtnn,
x -> x' = (x+offset)*scale

However, you can implement this as a simple expression when adding variables in TMVA (DataLoader::AddVariable), after having computed the mean and stddev for each variable. The variable expression is also stored in the XML file.
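
As a rough illustration of the step above, here is a minimal Python sketch. It assumes the usual standardization choice offset = -mean and scale = 1/stddev, so that x' = (x+offset)*scale has zero mean and unit width. The variable names (pt, eta), the toy values, the "signal" class label and the helper functions are all hypothetical, and the JSON layout (an "inputs" list with name/offset/scale plus "class_labels") follows my reading of the lwtnn template, so please check it against the wiki page.

```python
import json
import statistics


def lwtnn_input(name, values):
    """Build one input entry such that x' = (x + offset) * scale is standardized."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        raise ValueError(f"variable {name!r} is constant, cannot rescale")
    return {"name": name, "offset": -mean, "scale": 1.0 / std}


def tmva_expression(entry):
    """Format the same transformation as an expression string for AddVariable."""
    return "({name}+({offset!r}))*{scale!r}".format(**entry)


# Hypothetical toy values; in practice these would be read from the training tree.
samples = {
    "pt": [10.0, 20.0, 30.0],
    "eta": [-1.0, 0.0, 1.0],
}

inputs = [lwtnn_input(name, values) for name, values in samples.items()]

# Assumed layout of variables.json, to be checked against the lwtnn template.
variables = {"inputs": inputs, "class_labels": ["signal"]}
variables_json = json.dumps(variables, indent=2)

# Strings that could be passed as the expression argument of AddVariable.
expressions = [tmva_expression(entry) for entry in inputs]
```

The point of generating both outputs from the same numbers is that the expression seen by TMVA during training and the offset/scale applied by lwtnn at inference stay identical.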

Best regards

Lorenzo

Hi Lorenzo,

Thanks, I will try to implement the transformation at the AddVariable stage. I also took a look at the transformation parameters in the XML file, but I’m a bit confused about the contents. For example, for Transform Name=“Decorrelation” I see three different transformation matrices (I assume that’s what they are). I would expect to see only one?

Cheers,
Janina

Hi,
You probably have a different decorrelation matrix for every class of events (e.g. signal, background1, background2).
I would not apply decorrelation if you are using a neural network afterwards. It is better to let the network learn the correlations.

Lorenzo

Ok, got it!