Hi, ROOT experts,
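I am trying to run the tutorial macro ClassificationKeras.py from tutorials/tmva/keras/ on an Apple M1 (Python 3.10 in a conda environment, TensorFlow using the Metal device), but the PyKeras training aborts with a TensorFlow NotFoundError. Does anyone have an idea how to make it work?
Here is the output up to the point where it fails: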
2023-04-01 12:00:05.018837: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
DataSetInfo : [dataset] : Added class "Signal"
: Add Tree TreeS of type Signal with 6000 events
DataSetInfo : [dataset] : Added class "Background"
: Add Tree TreeB of type Background with 6000 events
: Dataset[dataset] : Class index : 0 name : Signal
: Dataset[dataset] : Class index : 1 name : Background
2023-04-01 12:00:10.683788: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Metal device set to: Apple M1
systemMemory: 8.00 GB
maxCacheSize: 2.67 GB
2023-04-01 12:00:10.685012: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-04-01 12:00:10.685312: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 320
dense_1 (Dense) (None, 2) 130
=================================================================
Total params: 450
Trainable params: 450
Non-trainable params: 0
_________________________________________________________________
Factory : Booking method: Fisher
:
Fisher : [dataset] : Create Transformation "D" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Fisher : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : Booking method: PyKeras
:
PyKeras : [dataset] : Create Transformation "D" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
PyKeras : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
: Setting up tf.keras
: Using TensorFlow version 2
: Use Keras version from TensorFlow : tf.keras
2023-04-01 12:00:10.962627: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2023-04-01 12:00:10.962648: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)
: Loading Keras Model
: Loaded model from file: model.h5
Factory : Train all methods
: Rebuilding Dataset dataset
: Building event vectors for type 2 Signal
: Dataset[dataset] : create input formulas for tree TreeS
: Building event vectors for type 2 Background
: Dataset[dataset] : create input formulas for tree TreeB
DataSetFactory : [dataset] : Number of events in input trees
:
:
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 4000
: Signal -- testing events : 2000
: Signal -- training and testing events: 6000
: Background -- training events : 4000
: Background -- testing events : 2000
: Background -- training and testing events: 6000
:
DataSetInfo : Correlation matrix (Signal):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.391 +0.590 +0.813
: var2: +0.391 +1.000 +0.692 +0.734
: var3: +0.590 +0.692 +1.000 +0.851
: var4: +0.813 +0.734 +0.851 +1.000
: ----------------------------------------
DataSetInfo : Correlation matrix (Background):
: ----------------------------------------
: var1 var2 var3 var4
: var1: +1.000 +0.855 +0.914 +0.965
: var2: +0.855 +1.000 +0.927 +0.936
: var3: +0.914 +0.927 +1.000 +0.970
: var4: +0.965 +0.936 +0.970 +1.000
: ----------------------------------------
DataSetFactory : [dataset] :
:
Factory : [dataset] : Create Transformation "D" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
Factory : [dataset] : Create Transformation "G" with events from all classes.
:
: Transformation, Variable selection :
: Input : variable 'var1' <---> Output : variable 'var1'
: Input : variable 'var2' <---> Output : variable 'var2'
: Input : variable 'var3' <---> Output : variable 'var3'
: Input : variable 'var4' <---> Output : variable 'var4'
: Preparing the Decorrelation transformation...
: Preparing the Gaussian transformation...
TFHandler_Factory : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.0084120 1.0019 [ -3.1195 5.7307 ]
: var2: 0.0078511 0.99981 [ -3.1195 5.7307 ]
: var3: 0.0083128 1.0011 [ -3.1195 5.7307 ]
: var4: 0.0076997 0.99886 [ -3.1195 5.7307 ]
: -----------------------------------------------------------
Factory : Train method: Fisher for Classification
:
: Preparing the Decorrelation transformation...
: Preparing the Gaussian transformation...
TFHandler_Fisher : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.0084120 1.0019 [ -3.1195 5.7307 ]
: var2: 0.0078511 0.99981 [ -3.1195 5.7307 ]
: var3: 0.0083128 1.0011 [ -3.1195 5.7307 ]
: var4: 0.0076997 0.99886 [ -3.1195 5.7307 ]
: -----------------------------------------------------------
Fisher : Results for Fisher coefficients:
: NOTE: The coefficients must be applied to TRANFORMED variables
: List of the transformation:
: -- Deco
: -- Gauss
: -----------------------
: Variable: Coefficient:
: -----------------------
: var1: -0.221
: var2: -0.055
: var3: +0.032
: var4: +0.474
: (offset): -0.002
: -----------------------
: Elapsed time for training with 8000 events: 0.0319 sec
Fisher : [dataset] : Evaluation of Fisher on training sample (8000 events)
: Elapsed time for evaluation of 8000 events: 0.0153 sec
: Creating xml weight file: dataset/weights/TMVAClassification_Fisher.weights.xml
: Creating standalone class: dataset/weights/TMVAClassification_Fisher.class.C
Factory : Training finished
:
Factory : Train method: PyKeras for Classification
:
:
: ================================================================
: H e l p f o r M V A m e t h o d [ PyKeras ] :
:
: Keras is a high-level API for the Theano and Tensorflow packages.
: This method wraps the training and predictions steps of the Keras
: Python package for TMVA, so that dataloading, preprocessing and
: evaluation can be done within the TMVA system. To use this Keras
: interface, you have to generate a model with Keras first. Then,
: this model can be loaded and trained in TMVA.
:
:
: <Suppress this message by specifying "!H" in the booking option>
: ================================================================
:
: Preparing the Decorrelation transformation...
: Preparing the Gaussian transformation...
TFHandler_PyKeras : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.0084120 1.0019 [ -3.1195 5.7307 ]
: var2: 0.0078511 0.99981 [ -3.1195 5.7307 ]
: var3: 0.0083128 1.0011 [ -3.1195 5.7307 ]
: var4: 0.0076997 0.99886 [ -3.1195 5.7307 ]
: -----------------------------------------------------------
TFHandler_PyKeras : Variable Mean RMS [ Min Max ]
: -----------------------------------------------------------
: var1: 0.0084120 1.0019 [ -3.1195 5.7307 ]
: var2: 0.0078511 0.99981 [ -3.1195 5.7307 ]
: var3: 0.0083128 1.0011 [ -3.1195 5.7307 ]
: var4: 0.0076997 0.99886 [ -3.1195 5.7307 ]
: -----------------------------------------------------------
: Split TMVA training data in 6400 training events and 1600 validation events
: Training Model Summary
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 64) 320
dense_1 (Dense) (None, 2) 130
=================================================================
Total params: 450
Trainable params: 450
Non-trainable params: 0
_________________________________________________________________
: Option SaveBestOnly: Only model weights with smallest validation loss will be stored
And these are the errors:
Epoch 1/20
2023-04-01 12:01:56.931210: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:114] Plugin optimizer for device_type GPU is enabled.
2023-04-01 12:01:57.254187: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7fc138f94630
2023-04-01 12:01:57.254384: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7fc138f94630
2023-04-01 12:01:57.270410: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7fc138f94630
2023-04-01 12:01:57.270438: W tensorflow/core/framework/op_kernel.cc:1830] OP_REQUIRES failed at xla_ops.cc:418 : NOT_FOUND: could not find registered platform with id: 0x7fc138f94630
<WARNING> : Failed to run python code: history = model.fit(trainX, trainY, sample_weight=trainWeights, batch_size=batchSize, epochs=numEpochs, verbose=verbose, validation_data=(valX, valY, valWeights), callbacks=callbacks)
<WARNING> : Python error message:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 70, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/tensorflow/python/eager/execute.py", line 52, in quick_execute
tensors = pywrap_tfe.TFE_Py_Execute(ctx._handle, device_name, op_name,
tensorflow.python.framework.errors_impl.NotFoundError: Graph execution error:
Detected at node 'StatefulPartitionedCall_2' defined at (most recent call last):
File "/Users/martin/Desktop/tmvaTutorials/keras/ClassificationKeras.py", line 74, in <module>
factory.TrainAllMethods()
File "<string>", line 1, in <module>
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/utils/traceback_utils.py", line 65, in error_handler
return fn(*args, **kwargs)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/engine/training.py", line 1650, in fit
tmp_logs = self.train_function(iterator)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/engine/training.py", line 1249, in train_function
return step_function(self, iterator)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/engine/training.py", line 1233, in step_function
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/engine/training.py", line 1222, in run_step
outputs = model.train_step(data)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/engine/training.py", line 1027, in train_step
self.optimizer.minimize(loss, self.trainable_variables, tape=tape)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 527, in minimize
self.apply_gradients(grads_and_vars)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1140, in apply_gradients
return super().apply_gradients(grads_and_vars, name=name)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 634, in apply_gradients
iteration = self._internal_apply_gradients(grads_and_vars)
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1166, in _internal_apply_gradients
return tf.__internal__.distribute.interim.maybe_merge_call(
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1216, in _distributed_apply_gradients_fn
distribution.extended.update(
File "/Users/martin/opt/anaconda3/envs/tensorflow/lib/python3.10/site-packages/keras/optimizers/optimizer_experimental/optimizer.py", line 1211, in apply_grad_to_update_var
return self._update_step_xla(grad, var, id(self._var_key(var)))
Node: 'StatefulPartitionedCall_2'
could not find registered platform with id: 0x7fc138f94630
[[{{node StatefulPartitionedCall_2}}]] [Op:__inference_train_function_26145]
<FATAL> : Failed to train model
***> abort program execution
Traceback (most recent call last):
File "/Users/martin/Desktop/tmvaTutorials/keras/ClassificationKeras.py", line 74, in <module>
factory.TrainAllMethods()
cppyy.gbl.std.runtime_error: void TMVA::Factory::TrainAllMethods() =>
runtime_error: FATAL error
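For anyone skimming the output: the TensorFlow lines carry a severity letter right after the timestamp (I for info, W for warning), so the lines that matter can be pulled out with a short filter. This is only a sketch. The function name tf_warnings and the regular expression are my own and are not part of ROOT or TensorFlow.

```python
import re

# Matches TensorFlow log lines such as:
# 2023-04-01 12:01:57.254187: W tensorflow/core/framework/op_kernel.cc:1830] message
TF_LOG_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+): "
    r"(?P<severity>[IWEF]) "
    r"(?P<source>\S+):(?P<line>\d+)\] "
    r"(?P<message>.*)$"
)


def tf_warnings(log_text):
    """Return (source, line, message) for every non-info TensorFlow log line."""
    found = []
    for raw in log_text.splitlines():
        match = TF_LOG_LINE.match(raw.strip())
        # Skip lines that are not TensorFlow log lines, and skip info (I) lines.
        if match and match.group("severity") != "I":
            found.append(
                (match.group("source"), int(match.group("line")), match.group("message"))
            )
    return found
```

Applied to the output above, this would keep only the four OP_REQUIRES lines from op_kernel.cc and drop the cpu_feature_guard and pluggable_device messages.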
P.S.
I changed the model.compile line of the tutorial from
model.compile(loss='categorical_crossentropy',optimizer=SGD(lr=0.01), metrics=['accuracy', ])
to
model.compile(tf.keras.optimizers.experimental.SGD(learning_rate=0.01),loss=tf.keras.losses.CategoricalCrossentropy(),metrics=['accuracy',])
because the original line did not work with my TensorFlow/Keras installation.
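For context, this is roughly what the model definition looks like with that change applied. It is a minimal sketch, not the exact contents of the macro. The layer sizes come from the model summary printed above, while the activations, the input_shape argument, and the helper names dense_params and build_model are assumptions made for illustration. TensorFlow is imported inside the function so that the parameter-count check can run without it.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus one bias per unit."""
    return (n_inputs + 1) * n_units


def build_model(n_inputs=4, learning_rate=0.01):
    """Build and compile a two-layer network like the one in the summary above."""
    # Imported lazily so dense_params() can be used without TensorFlow installed.
    import tensorflow as tf

    model = tf.keras.Sequential([
        # Assumption: relu/softmax activations; the log only shows the layer sizes.
        tf.keras.layers.Dense(64, activation="relu", input_shape=(n_inputs,)),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    # Same optimizer and loss as in the modified compile line quoted above.
    model.compile(
        optimizer=tf.keras.optimizers.experimental.SGD(learning_rate=learning_rate),
        loss=tf.keras.losses.CategoricalCrossentropy(),
        metrics=["accuracy"],
    )
    return model
```

With the four input variables var1 to var4, dense_params(4, 64) + dense_params(64, 2) gives 320 + 130 = 450, which matches the "Total params: 450" line in the summary.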
And here is my slightly adjusted macro:
ClassificationKeras.py (2.3 KB)
Thank you!