I have a model trained using TMVA PyKeras that I need to apply within an analysis framework where tensorflow/keras are unavailable. Therefore, I’m trying to convert the model to a standard JSON format with the lwtnn keras2json converter. One of the things I need to do is make a variables.json file (templates are shown here: https://github.com/lwtnn/lwtnn/wiki/Keras-Converter) listing input variables together with parameters used for normalizing them in the training - offset and scale values. I think it should be possible to access these values using TMVA outputs (class.C and weights.xml files), but I cannot figure out how. What would be the easiest way to do this?
Hi,
The XML file should list all the variable names, their transformations and the transformation parameters.
However, TMVA does not directly support the simple rescaling that lwtnn applies, x -> x' = (x + offset) * scale.
You can still implement it as a simple expression when adding variables in TMVA (DataLoader::AddVariable), after computing the mean and standard deviation of each variable. The variable expression is also stored in the XML file.
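As a rough illustration of the suggestion above, here is a minimal Python sketch that computes offset = -mean and scale = 1/stddev per variable, builds an expression string of the kind you could pass to DataLoader::AddVariable, and writes the same numbers into a variables.json. The variable names, sample values and the helper functions are all made up for the example, and the JSON layout ("inputs" with name/offset/scale, plus "class_labels") is my reading of the templates on the lwtnn wiki, so please check it against the page linked in the question.

```python
import json
import statistics


def normalization(values):
    """Return (offset, scale) such that (x + offset) * scale has mean 0 and stddev 1."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0.0:
        raise ValueError("constant variable, cannot be rescaled")
    return -mean, 1.0 / std


def tmva_expression(name, offset, scale):
    """Hypothetical helper: expression string to use when adding the variable in TMVA."""
    return f"({name} + ({offset!r})) * ({scale!r})"


def lwtnn_variables(samples, class_labels):
    """Build a dict in the assumed lwtnn variables.json layout."""
    inputs = []
    for name, values in samples.items():
        offset, scale = normalization(values)
        inputs.append({"name": name, "offset": offset, "scale": scale})
    return {"inputs": inputs, "class_labels": list(class_labels)}


# Toy values standing in for the training sample (assumption, not real data).
samples = {"pt": [20.0, 40.0, 60.0], "eta": [-1.0, 0.0, 1.0]}

spec = lwtnn_variables(samples, ["signal"])
expressions = [
    tmva_expression(v["name"], v["offset"], v["scale"]) for v in spec["inputs"]
]

with open("variables.json", "w") as handle:
    json.dump(spec, handle, indent=2)

for expression in expressions:
    print(expression)
```

If the network is trained on the rescaled expressions, then the application side should be given the raw variables together with these offset and scale values, so that the same transformation is applied exactly once in both places.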
Thanks, I will try to implement the transformation at the AddVariable stage. I also took a look at the transformation parameters in the XML file, but I’m a bit confused about the contents. For example, for Transform Name=“Decorrelation” I see three different transformation matrices (I assume that’s what they are). I would expect to see only one?
Hi,
You probably have a separate decorrelation matrix for each class of events (e.g. signal, background1, background2).
I would not apply decorrelation if you are using a neural network afterwards. It is better to let the network learn the correlations.