Hello experts, I hope you can help me, please.
I am using the basic TMVA script (https://root.cern/doc/master/ClassificationKeras_8py.html) from the tutorial to test the NN training on my own ROOT files.
The script runs well with the example. However, when I use my own ROOT file as input I get the following errors:
: Expression: MyBranch1 does not provide data for this event. This event is not taken into account. --> please check if you use as a variable an entry of an array which is not filled for some events (e.g. arr when arr has only 3 elements).
: If you want to take the event into account you can do something like: “Alt$(arr,0)” where in cases where arr doesn’t have a 4th element, 0 is taken as an alternative.
Error in TTreeFormula::Compile: Bad numerical expression : “MyBranch1”
: Expression MyBranch1 could not be resolved to a valid formula.
***> abort program execution
Traceback (most recent call last):
File “./ClassificationKeras.py”, line 63, in <module>
Exception: void TMVA::TrainAllMethods() =>
FATAL error (C++ exception of type runtime_error)
This warning/error is repeated for all my branches.
I am clueless about these issues, but I guess this is because my branches are of type vector, while the example ROOT file used in the tutorial has branches of type F?
I see that the error suggests assigning the value 0 where input is missing, but I am not sure I want that, and I am also not sure where in the script I should put it.
If anyone has an idea it would be super helpful… preferably for dummies, please.
Many thanks in advance
Could you please post a small reproducer here or maybe share a file?
@moneta sorry for pinging you but maybe you will be able to take a look? Thanks!
If you have a vector, you can provide each single vector element as input to TMVA (supposing
MyBranch is the name of the std::vector branch) using the function
dataloader.AddVariable("MyBranch[0]") and similarly for the other elements.
Otherwise you also have the possibility to provide the entire vector as input, by using
dataloader.AddVariablesArray("MyBranch", n)
where n is the size of the vector.
This applies if the vector has the same size for each event. If this is not the case, you would need to add zero values for the missing elements.
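To make the two options above more concrete, here is a minimal Python sketch. The helper names (element_expressions, add_vector_elements, add_whole_vector) are made up for illustration, and it assumes that `dataloader` is the TMVA.DataLoader object created in the tutorial script and that the first n elements of MyBranch are the ones you want. Passing a default wraps each element in Alt$(...), which is the zero-padding mentioned above; whether padding with 0 is sensible depends on what the variable means physically.

```python
def element_expressions(branch, n, default=None):
    """Build TTreeFormula expressions for the first n elements of a vector branch.

    With default=None the plain form "branch[i]" is returned. Otherwise each
    element is wrapped as "Alt$(branch[i],default)", so events whose vector is
    too short use `default` instead of being dropped.
    """
    exprs = []
    for i in range(n):
        element = f"{branch}[{i}]"
        if default is not None:
            element = f"Alt$({element},{default})"
        exprs.append(element)
    return exprs


def add_vector_elements(dataloader, branch, n, default=None):
    """Option 1: register every element as its own input variable."""
    for expr in element_expressions(branch, n, default):
        dataloader.AddVariable(expr)


def add_whole_vector(dataloader, branch, n):
    """Option 2: register the whole vector, assuming it has size n in every event."""
    dataloader.AddVariablesArray(branch, n)


# Shows the strings that would be handed to the dataloader:
print(element_expressions("MyBranch", 3, default=0))
```

In the tutorial script this would go where the variables are added to the dataloader, for example add_vector_elements(dataloader, "MyBranch", 3, default=0). Note that either option changes the number of inputs the network sees, so the Keras input size has to match.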
Many thanks both oshadura and Lorenzo! And apologies for the delay. I was hoping to finish implementing Lorenzo’s solution and testing this before replying but I got a bit confused on something very silly probably.
The Classification script in https://root.cern/doc/master/ClassificationKeras_8py.html seems to be hard-wired to 4 flat variables. So I have now noticed that it complains first that I use vectors, and second that I have more than 4 variables in my tree. For example, if I create a flat ROOT file with 5 variables it complains:
Exception: Error when checking model input: expected dense_input_2 to have shape (None, 4) but got array with shape (200, 5)
I see that this script pulls in header files (import ROOT) in which the data loader variables are defined.
Lorenzo, if I understand you correctly I should add these lines in the .py file, but then how do I override the definitions in the header that refer to the 4 variables of the tutorial ROOT file?
The solution seems to me to be editing all the headers, but if there is a shortcut it would be good to know.
Many thanks again and apologies for the cluelessness
If you change the number of inputs (following the tutorial https://root.cern/doc/master/ClassificationKeras_8py.html), you need to change the input shape of the Keras model. For example, in this line change input_dim from 4 to 5 (if you have 5 input variables):
model.add(Dense(64, activation='relu', W_regularizer=l2(1e-5), input_dim=4))
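One way to avoid this mismatch is to keep the list of input variables in one place in the .py file and derive the Keras input size from it, so no ROOT header needs editing. The sketch below is only an illustration under some assumptions: the variable names var1 to var5 are placeholders for your own branches, `dataloader` is the TMVA.DataLoader from the tutorial, Keras is imported through tensorflow.keras, and the regularizer from the line above is left out to keep the example short.

```python
# Placeholder names (an assumption for this example); use your own branch
# names or expressions such as "MyBranch[0]" here.
variables = ["var1", "var2", "var3", "var4", "var5"]
n_inputs = len(variables)


def register_variables(dataloader, names):
    """Add each expression to the TMVA dataloader and return how many were added."""
    for name in names:
        dataloader.AddVariable(name)
    return len(names)


def build_model(n_inputs):
    """Small dense network whose first layer matches the number of TMVA inputs."""
    # Imported here so the rest of the script can be checked without Keras installed.
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense

    model = Sequential()
    model.add(Dense(64, activation="relu", input_dim=n_inputs))
    model.add(Dense(2, activation="softmax"))  # two outputs: signal and background
    return model


print(n_inputs)
```

With this layout, adding or removing an entry in `variables` changes both the dataloader and the model input size together, for example register_variables(dataloader, variables) followed by build_model(n_inputs).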