TMVA Tutorial Fatal error

Hi Root Forum,

While trying out the TMVA tutorial "CNN classification" for Python, I keep getting the same fatal error. I opted to use Keras.
My python version is Python 3.6.16
root-config --python-version: 3.6.9

I had to adjust the tutorial code slightly for it to run: the “theOption” arguments in factory.BookMethod(), TMVA.Factory() and loader.PrepareTrainingAndTestTree() had to be changed. The problem is that a fatal error occurs in BookMethod. Some warnings also appear at the start of the program, but I am not sure whether they are related. The error message is not very insightful, so if anybody can help me, that would be much appreciated.

Here is my code:
tutorial.py (13.9 KB)

Warnings:

WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/keras/initializers.py:94: calling TruncatedNormal.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING:tensorflow:From /opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/site-packages/tensorflow_core/python/ops/nn_impl.py:183: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
2022-11-24 02:00:38.306121: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA

Fatal ERROR:

<FATAL>                         : 
***> abort program execution
Traceback (most recent call last):
  File "tutorial.py", line 390, in <module>
    'H:V:VarTransform=D,G:FilenameModel=model_cnn.h5:FilenameTrainedModel=trained_model_cnn.h5:NumEpochs=10:BatchSize=100')
TypeError: none of the 3 overloaded methods succeeded. Full details:
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TString theMethodName, TString methodTitle, TString theOption = "") =>
    TypeError: could not convert argument 2
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TMVA::Types::EMVA theMethod, TString methodTitle, TString theOption = "") =>
    runtime_error: FATAL error
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader*, TMVA::Types::EMVA, TString, TString, TMVA::Types::EMVA, TString) =>
    TypeError: takes at least 6 arguments (4 given)

ROOT Version: 6.26/10 Ubuntu 18.04 pre-compiled
Platform: Ubuntu
Compiler: GCC 7.5


Hi @steven1,

I am sure that @moneta can help you with this. My guess is that the tutorial might need to be revised (as I think the call should match the second overload, cited below), no @moneta?

Cheers,
J.

Hi @jalopezg,
Thank you for the reply!
Do you think it currently does not match the second overload correctly? My impression is that it does match, but does not execute properly, hence the fatal error. The same happens when I run a PyTorch example.

Let’s ping @moneta on that!

Hi,
The tutorial TMVA_CNN_Classification.py should run correctly if you are using the ROOT master version, where we have the correct Pythonization for TMVA.
If you are using 6.26, the tutorial can be made to work after some changes, but you made a few errors when updating it. You need to provide the correct option string, since you cannot use command line arguments in 6.26.
You have to read the error message carefully when using Python, because it can be misleading.

Here is the updated tutorial, which should work. I have also removed transformations such as decorrelation, which will not work with a dataset with so many input features.
If you have any further issues, please let me know.

Best,

Lorenzo

tutorial.py (13.9 KB)
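
As an illustration of the point about option strings above, here is a minimal sketch of one way to assemble the colon-separated string in plain Python, so that each option is spelled out once and typos are easier to spot. The helper name tmva_option_string is hypothetical and not part of ROOT; the option names and values are the ones that appear in the BookMethod call in this thread.

```python
def tmva_option_string(**options):
    """Join keyword options into a colon-separated TMVA option string.

    Hypothetical helper, not part of ROOT.
    True becomes a bare flag ("H"), False a negated flag ("!V"),
    and any other value becomes "Key=value".
    """
    parts = []
    for key, value in options.items():
        if value is True:
            parts.append(key)
        elif value is False:
            parts.append("!" + key)
        else:
            parts.append("{}={}".format(key, value))
    return ":".join(parts)


# Keyword order is preserved (Python 3.6+), so the string comes out in this order.
keras_options = tmva_option_string(
    H=True,
    V=False,
    VarTransform=None,
    FilenameModel="model_cnn.h5",
    FilenameTrainedModel="trained_model_cnn.h5",
    NumEpochs=10,
    BatchSize=100,
)

# The result would then be passed as the last argument, for example:
# factory.BookMethod(loader, TMVA.Types.kPyKeras, 'PyKeras', keras_options)
```

The string is built outside of the BookMethod call on purpose, so it can be printed and inspected before ROOT ever sees it.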

Hi @moneta,

Thank you for the details on the updated tutorial. However, I still get the same fatal error for both PyTorch and Keras. Does that indicate that my ROOT installation is not correct, since it was working for you? It is the pre-compiled version, as mentioned.

Info in <TMVA_CNN_Classification>: Booking convolutional keras model
<FATAL>                         : Unknown method index in map: 26
***> abort program execution
Traceback (most recent call last):
  File "tutorial_new.py", line 390, in <module>
    factory.BookMethod(loader, TMVA.Types.kPyKeras, 'PyKeras',
TypeError: none of the 3 overloaded methods succeeded. Full details:
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TString theMethodName, TString methodTitle, TString theOption = "") =>
    TypeError: could not convert argument 2
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader* loader, TMVA::Types::EMVA theMethod, TString methodTitle, TString theOption = "") =>
    runtime_error: FATAL error
  TMVA::MethodBase* TMVA::Factory::BookMethod(TMVA::DataLoader*, TMVA::Types::EMVA, TString, TString, TMVA::Types::EMVA, TString) =>
    TypeError: takes at least 6 arguments (4 given)

Hi,

It looks like you are having this error:

Do you have this line in your code?

TMVA.PyMethodBase.PyInitialize()

Lorenzo

Hi Lorenzo,

I do have that line. Actually

print(TMVA.PyMethodBase.PyIsInitialized())

returns 1.

Other than the warnings and the error I sent earlier, it does not output any clues about what is going wrong.
I feel like the amount of output I get in the terminal is minimal or suppressed, even though I have verbose mode on.
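
For anyone who wants to rule out the initialization step explicitly, the two calls mentioned above can be combined into a small guard. This is only a sketch: the function name ensure_pymva_initialized is hypothetical, and it takes the TMVA module as an argument rather than importing ROOT itself. As this thread shows, passing this check does not guarantee that booking will succeed.

```python
def ensure_pymva_initialized(tmva):
    """Initialize PyMVA and fail loudly if initialization did not take effect.

    `tmva` is expected to be the ROOT.TMVA module; this wrapper is a
    hypothetical helper, not part of ROOT.
    """
    tmva.PyMethodBase.PyInitialize()
    if not tmva.PyMethodBase.PyIsInitialized():
        raise RuntimeError("PyMVA is not initialized; booking Python methods will fail")


# Typical use, before any BookMethod call:
# from ROOT import TMVA
# ensure_pymva_initialized(TMVA)
```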

Hi
Can you please send the full log file?
Cheers

Lorenzo

Hi Lorenzo,

The mismatch between GCC versions (GCC 7.5 for the pre-compiled ROOT, GCC 9.4 on my system) is causing a problem, so I have switched to building ROOT from source. This will also make it easier to debug the C++ code, since I can recompile it with added cout statements.

I get the same Fatal error after building from source:

Fatal error log:
fatal_error_log.txt (4.7 KB)

Not sure if these are relevant:
cmake build command and output:

cmake  -DCMAKE_INSTALL_PREFIX="../root_install" -DPYTHON="ON" -DPYTHON_INCLUDE_DIR="/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/include/python3.6m" -DPYTHON_LIBRARY="/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/lib/python3.6/" -DPython3_EXECUTABLE="/opt/vitis_ai/conda/envs/vitis-ai-tensorflow/bin/python3" -Dpyroot="ON" -Dtmva="ON" -Dtmva-cpu="ON" -Dtmva-pymva="ON" /home/mtower/Desktop/Steven/root_src 

cmakeoutput.txt (13.8 KB)

files generated by the build:
CMakeError.log:
CMakeOutput.txt (185.1 KB)
CMakeOutput.log:
CMakeOutput.txt (185.1 KB)

Hi,
Looking at the log it seems to me you are having an error in passing the option string when booking the Keras method.

Traceback (most recent call last):
  File "tutorial_new.py", line 391, in <module>
    'H:!V:VarTransform=None:FilenameModel=model_cnn.h5:FilenameTrainedModel=trained_model_cnn.h5:NumEpochs=10:BatchSize=100')

I would need to see your exact code to find the problem. Maybe some hidden characters were inserted into the string by mistake; please check it carefully.

Cheers

Lorenzo
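
One way to check an option string for hidden characters, as suggested above, is to scan it for anything outside printable ASCII. The sketch below uses only the Python standard library; the function name find_suspicious_chars is hypothetical, and treating every non-ASCII or control character as suspicious is an assumption that fits option strings like the ones in this thread.

```python
import unicodedata


def find_suspicious_chars(option_string):
    """Return (index, repr, unicode name) for each character outside printable ASCII."""
    suspicious = []
    for index, char in enumerate(option_string):
        if not (32 <= ord(char) <= 126):
            # Control characters have no Unicode name, so fall back to a placeholder.
            name = unicodedata.name(char, "UNNAMED")
            suspicious.append((index, repr(char), name))
    return suspicious


options = ('H:!V:VarTransform=None:FilenameModel=model_cnn.h5:'
           'FilenameTrainedModel=trained_model_cnn.h5:NumEpochs=10:BatchSize=100')

for index, shown, name in find_suspicious_chars(options):
    print("position {}: {} ({})".format(index, shown, name))
```

An empty result means the string contains only printable ASCII, so the problem has to be somewhere else.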

After adding several logging statements to the source code, I found out that the problem occurred in the SetupKerasModel() function inside MethodPyKeras.cxx. It turns out that the TensorFlow version on my system was 1.15, which is too old and not compatible with TMVA in ROOT. I updated TensorFlow to version 2.8, which solved the problem: the fatal error no longer occurs. I do find it strange that the logging statements about the Keras version did not show that there was a version mismatch.
Line 209 in MethodPyKeras.cxx does not trigger when Keras is present but is not the right version. Maybe an else statement with logging needs to be added as a fail-safe.

Thank you a lot for the efforts @moneta.
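
Since the version mismatch was not reported anywhere, a check on the Python side before booking the method can make it visible. The sketch below is an illustration only: the function names are hypothetical, and the minimum of 2.0 is an assumption based solely on 1.15 failing and 2.8 working in this thread, not a documented ROOT requirement.

```python
import re

# Assumption: treat any TensorFlow 2.x as acceptable. In this thread 1.15 failed
# and 2.8 worked; the exact boundary is not established here.
MIN_TF_VERSION = (2, 0)


def parse_version(version):
    """Extract the leading (major, minor) pair from a version string such as '2.8.0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("unrecognized version string: {!r}".format(version))
    return int(match.group(1)), int(match.group(2))


def tensorflow_is_recent_enough(version, minimum=MIN_TF_VERSION):
    """Return True if the version is at least the assumed minimum."""
    return parse_version(version) >= minimum


def check_installed_tensorflow():
    """Raise a readable error if the installed TensorFlow looks too old."""
    import tensorflow as tf  # imported lazily so the helpers above work without it

    if not tensorflow_is_recent_enough(tf.__version__):
        raise RuntimeError(
            "TensorFlow {} found, but at least {}.{} is assumed to be needed".format(
                tf.__version__, *MIN_TF_VERSION))
```

Calling check_installed_tensorflow() at the top of the tutorial script, before factory.BookMethod, would turn the silent mismatch into an explicit error message.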

Hi,

I think it should work with old TensorFlow versions, but you first need to provide a model created with the compatible TensorFlow version.
In addition, if you are using TensorFlow version 1, you might need to set the option `!tfkeras` when booking.
It is, however, strange that this was not reported in the error message.

Lorenzo