Incompatibility between TensorFlow on lxplus and TensorFlow on a MacBook Pro, Colab notebook and SWAN

fernandoaugusto12 · September 11, 2023, 6:59pm

Hello everybody,
I would like to request help from experts for the following problem:
I use a neural network (a multilayer perceptron) in a VBF signal analysis. The training saves the model as follows:
##############################################

from tensorflow.keras import utils
from tensorflow.python.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

# Callbacks: criteria for early stopping and for saving the best configuration
es = EarlyStopping(monitor='loss', min_delta=1e-10, patience=10, verbose=1)
rlr = ReduceLROnPlateau(monitor='loss', factor=0.2, patience=5, verbose=1)
mcp = ModelCheckpoint(filepath='BN_tf230_64_all_2j_3j.h5', monitor='loss', save_best_only=True, verbose=1)

# Number of training epochs
nepochs = 6

# Batch size
batch = 64

# Train classifier
history = model.fit(X_train_val,
                    Y_train_val,
                    epochs=nepochs,
                    sample_weight=W_train_val,
                    batch_size=batch,
                    verbose=1,  # switch to 1 for more verbosity
                    validation_split=0.3,
                    callbacks=[mcp])
##############################################
I then take the best result from the network and use it to add variables to my tree, whose result goes to combine. I was unable to build a satisfactory network model on lxplus, so I trained the model on my own computer and copied it to lxplus. For the past year this worked without problems with TensorFlow 2.2 and Python 3.8. Then I needed BatchNormalization to improve the p-value and significance. BatchNormalization was not compatible with TensorFlow 2.2, so I upgraded TensorFlow, but the saved model can no longer be loaded on lxplus. The error is:
#############################################
[fassunca@lxplus749 private]$ cat errors.2383655.0.err
Traceback (most recent call last):
  File "To_combine_add_NNVBF_2j_3j_2018_642.py", line 59, in <module>
    model = load_model('BN_tf230_64_all_2j_3j.h5', compile = False)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 234, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 324, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 74, in deserialize
    printable_module_name='layer')
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 192, in deserialize_keras_object
    list(custom_objects.items())))
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 349, in from_config
    custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 74, in deserialize
    printable_module_name='layer')
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 194, in deserialize_keras_object
    return cls.from_config(cls_config)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 402, in from_config
    return cls(**config)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 69, in __init__
    raise ValueError('Unrecognized keyword arguments:', kwargs.keys())
ValueError: ('Unrecognized keyword arguments:', dict_keys(['ragged']))
#############################################
I also tried Jupyter Notebook and SWAN, and the same problem occurred.
I submit the job with condor, using the following script:
############################################
#!/bin/bash
source /cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/setup.sh
cd /afs/cern.ch/work/f/fassunca/private/

python To_combine_add_NNVBF_2j_3j_2018_642.py

########################################

I have not been able to run the neural network on lxplus and I need some tips to solve this problem. Has anyone come across something similar, or could anyone give me a tip? Thanks in advance.
cheers
Fernando Assuncao

vpadulan · September 18, 2023, 7:57am

Dear @fernandoaugusto12

Thank you for the report. Let me invite @moneta to the discussion; maybe he has already seen something like this in the past.

Cheers,
Vincenzo

moneta · September 18, 2023, 9:25am

Hi,
It looks to me like you are using a very old setup on lxplus (LCG_96). It is a few years old and contains version 1 of TensorFlow, which is not compatible with your model. I would strongly recommend using a newer TensorFlow version on lxplus. For example, I see that LCG_98 has version 2.1 of TensorFlow;
try, for example, sourcing /cvmfs/sft.cern.ch/lcg/views/LCG_98py3cu10/x86_64-centos7-gcc8-opt/setup.sh
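
For context on the incompatibility: the traceback earlier in the thread ends with `InputLayer.__init__` rejecting a `ragged` keyword, which suggests that the saved model configuration carries a key the older Keras code does not accept. Below is a minimal sketch for listing the keys stored for each input layer. The function name `input_layer_keys` and the example configuration are hypothetical, made up for illustration; the real configuration is stored as JSON in the `model_config` attribute of the `.h5` file.

```python
import json


def input_layer_keys(model_config_json):
    """Return the sorted config keys of every InputLayer in a Keras model config."""
    found = []

    def walk(node):
        # Recursively visit the JSON tree and collect InputLayer configs.
        if isinstance(node, dict):
            if node.get("class_name") == "InputLayer":
                found.append(sorted(node.get("config", {})))
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(json.loads(model_config_json))
    return found


# Hypothetical config, shaped like what a TensorFlow 2.x Sequential model stores.
example_config = json.dumps({
    "class_name": "Sequential",
    "config": {"name": "sequential", "layers": [
        {"class_name": "InputLayer",
         "config": {"batch_input_shape": [None, 10], "dtype": "float32",
                    "sparse": False, "ragged": False, "name": "dense_input"}},
        {"class_name": "Dense", "config": {"units": 1}},
    ]},
})

print(input_layer_keys(example_config))
```

To run this on the real file, the JSON string can be read with `h5py`, for example `h5py.File('BN_tf230_64_all_2j_3j.h5', 'r').attrs['model_config']` (it may need decoding from bytes, depending on the h5py version). If `ragged` appears in the list, a Keras version whose `InputLayer` predates that argument will refuse to rebuild the model.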

Lorenzo

fernandoaugusto12 · September 18, 2023, 12:55pm

Hello Lorenzo, thank you very much!
Could you tell me what the latest available version of TensorFlow is, and which setup script I should source for it?
Thank you very much in advance!

fernandoaugusto12 · September 18, 2023, 12:59pm

@vpadulan, thank you very much for the answer. I managed to create an entire virtual environment with miniconda, but the problems persist. I believe the answer provided by Lorenzo is the right one, although I think it is necessary to have the most up-to-date version of TensorFlow possible.

moneta · September 18, 2023, 2:01pm

Hi,

I think in SWAN the latest version you can have is 2.8. Using LCG_98 (as above) you have tensorflow 2.1.
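
Since several TensorFlow versions are in play here (2.1 in LCG_98, 2.8 in SWAN, and a newer one on the training machine), it may help to check which version an environment actually provides before loading the model. The sketch below does that and compares it with the version used for training. The helper names are hypothetical, and the value `2.3.0` is only an assumption guessed from the model file name. Models saved by a newer release are not guaranteed to load in an older one, so the comparison is a hint rather than a guarantee.

```python
import importlib


def installed_version(package):
    """Return package.__version__, or None if the package cannot be imported."""
    try:
        module = importlib.import_module(package)
    except Exception:  # any import failure counts as "not available"
        return None
    return getattr(module, "__version__", None)


def major_minor(version):
    """Turn a version string such as '2.3.0' into a comparable tuple (2, 3)."""
    return tuple(int(part) for part in version.split(".")[:2])


def loader_is_older(saved_with, loading_with):
    """True when the loading environment is older than the one that saved the model."""
    return major_minor(loading_with) < major_minor(saved_with)


# Assumption: the model was saved with TensorFlow 2.3.0 (guessed from the file name).
SAVED_WITH = "2.3.0"

current = installed_version("tensorflow")
if current is None:
    print("TensorFlow is not importable in this environment")
elif loader_is_older(SAVED_WITH, current):
    print(f"TensorFlow {current} is older than {SAVED_WITH}; loading may fail")
else:
    print(f"TensorFlow {current} is at least as new as {SAVED_WITH}")
```

Running this right after sourcing a given `setup.sh`, or in a SWAN notebook cell, shows whether that environment is at least as new as the one used for training.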

Lorenzo

fernandoaugusto12 · September 18, 2023, 2:03pm

And to use 2.8, what is the source?
thank you again

fernandoaugusto12 · September 19, 2023, 1:48pm

My friends, thank you very much! My problem was solved

fernandoaugusto12 · September 19, 2023, 1:51pm

Thanks for answering my question! My problem has been resolved