Incompatibility between TensorFlow on lxplus and TensorFlow on MacBook Pro, Colab notebook and SWAN

Hello everybody,
I would like to ask the experts for help with the following problem.
I use a neural network, a multilayer perceptron, in a VBF signal analysis. The model is trained and saved as follows:
##############################################

from tensorflow.keras import utils
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

# Callbacks: criteria for early stopping and for keeping the best configuration
es = EarlyStopping(monitor='loss', min_delta=1e-10, patience=10, verbose=1)
rlr = ReduceLROnPlateau(monitor='loss', factor=0.2, patience=5, verbose=1)
mcp = ModelCheckpoint(filepath='BN_tf230_64_all_2j_3j.h5', monitor='loss', save_best_only=True, verbose=1)

# Number of training epochs
nepochs = 6

# Batch size
batch = 64

# Train the classifier (model, X_train_val, Y_train_val and W_train_val
# are assumed to be defined earlier in the script)
history = model.fit(X_train_val,
                    Y_train_val,
                    epochs=nepochs,
                    sample_weight=W_train_val,
                    batch_size=batch,
                    verbose=1,  # set to 0 to silence the per-epoch output
                    validation_split=0.3,
                    callbacks=[mcp])  # only the checkpoint is passed; es and rlr are defined but unused
##############################################
I then take the best model from the training and use it to add variables to my tree, and that output is what goes to combine. I was unable to build a satisfactory network model on lxplus, so I built the model on my own computer and imported it into lxplus. Over the last year this worked without problems with TensorFlow 2.2 and Python 3.8. Then I needed BatchNormalization to improve the p-value and significance. BatchNormalization was not compatible with TensorFlow 2.2, so I upgraded TensorFlow, but the saved model then became incompatible with lxplus, with errors like:
#############################################
[fassunca@lxplus749 private]$ cat errors.2383655.0.err
Traceback (most recent call last):
  File "To_combine_add_NNVBF_2j_3j_2018_642.py", line 59, in <module>
    model = load_model('BN_tf230_64_all_2j_3j.h5', compile = False)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 234, in load_model
    model = model_from_config(model_config, custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/saving.py", line 324, in model_from_config
    return deserialize(config, custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 74, in deserialize
    printable_module_name='layer')
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 192, in deserialize_keras_object
    list(custom_objects.items())))
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/sequential.py", line 349, in from_config
    custom_objects=custom_objects)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/layers/serialization.py", line 74, in deserialize
    printable_module_name='layer')
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/utils/generic_utils.py", line 194, in deserialize_keras_object
    return cls.from_config(cls_config)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/base_layer.py", line 402, in from_config
    return cls(**config)
  File "/cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/lib/python3.6/site-packages/tensorflow/python/keras/engine/input_layer.py", line 69, in __init__
    raise ValueError('Unrecognized keyword arguments:', kwargs.keys())
ValueError: ('Unrecognized keyword arguments:', dict_keys(['ragged']))
#############################################
I also tried Jupyter Notebook and SWAN, and the problem was the same.
I run the job with condor, using the following script:
############################################
#!/bin/bash
source /cvmfs/sft.cern.ch/lcg/views/LCG_96py3cu10/x86_64-centos7-gcc7-opt/setup.sh
cd /afs/cern.ch/work/f/fassunca/private/

python To_combine_add_NNVBF_2j_3j_2018_642.py

########################################

I have not been able to run the neural network on lxplus. Has anyone come across something similar, or could anyone give me a tip on how to solve this? Thanks in advance,
cheers
Fernando Assuncao

Dear @fernandoaugusto12

Thank you for the report. Let me invite @moneta to the discussion; maybe he has already seen something like this in the past.

Cheers,
Vincenzo

Hi,
It looks to me like you are using a very old setup on lxplus (LCG_96). It is a few years old and contains version 1 of TensorFlow, which is not compatible with your model. I would strongly recommend using a newer TensorFlow version on lxplus. For example, I see that LCG_98 has version 2.1 of TensorFlow. You can try it with:
source /cvmfs/sft.cern.ch/lcg/views/LCG_98py3cu10/x86_64-centos7-gcc8-opt/setup.sh
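
To confirm that the mismatch comes from the saved file, one option is to look at what the .h5 file itself records. A Keras HDF5 file stores the Keras version used for saving and the model configuration in its `keras_version` and `model_config` attributes. The sketch below is only an illustration: the helper names `find_layer_config_keys` and `read_h5_metadata` are hypothetical, and it assumes `h5py` is available in the environment.

```python
import json


def find_layer_config_keys(model_config_json, layer_class="InputLayer"):
    """Collect every config key used by layers of the given class.

    model_config_json is the JSON string stored in the 'model_config'
    attribute of a Keras HDF5 file.
    """
    model_config = json.loads(model_config_json)
    keys = set()

    def walk(node):
        # The config is nested dicts and lists; visit all of them.
        if isinstance(node, dict):
            if node.get("class_name") == layer_class and isinstance(node.get("config"), dict):
                keys.update(node["config"].keys())
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for item in node:
                walk(item)

    walk(model_config)
    return keys


def read_h5_metadata(path):
    """Return (keras_version, model_config_json) stored in a Keras HDF5 file."""
    import h5py  # assumption: h5py is installed; imported here so the helper above works without it

    with h5py.File(path, "r") as f:
        keras_version = f.attrs.get("keras_version")
        model_config = f.attrs.get("model_config")
    # Depending on the h5py version, attributes come back as bytes or str.
    if isinstance(keras_version, bytes):
        keras_version = keras_version.decode("utf-8")
    if isinstance(model_config, bytes):
        model_config = model_config.decode("utf-8")
    return keras_version, model_config
```

Calling `read_h5_metadata('BN_tf230_64_all_2j_3j.h5')` and passing the returned config string to `find_layer_config_keys` should list `ragged` among the InputLayer keys if the file was written by a TensorFlow version that knows that argument. That is the keyword the old installation rejects in the traceback.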

Lorenzo

Hello Lorenzo, thank you very much!
Could you tell me the latest version of tensorflow and what the source is?
Thank you very much in advance!

vpadulan, thank you very much for the answer. I managed to create an entire virtual environment with miniconda, but the problems persist. However, I believe the answer provided by Lorenzo is the right one, although I think it is necessary to have the most up-to-date version of TensorFlow possible.

Hi,

I think in SWAN the latest version you can have is 2.8. Using LCG_98 (as above) you have tensorflow 2.1.
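
Since a model saved with a newer TensorFlow may not load in an older one, a quick check at the top of the job script can save a failed condor submission. This is a minimal sketch with the assumptions stated: the helper names are hypothetical, and the saved-with version `2.3.0` is only a guess from the file name `BN_tf230_64_all_2j_3j.h5`. Comparing major.minor versions is a rule of thumb, not a guarantee in either direction.

```python
def parse_version(version):
    """Turn a version string such as '2.3.0' or '2.3.0-rc1' into a tuple of ints."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if ch.isdigit():
                digits += ch
            else:
                break  # stop at suffixes such as '-rc1'
        if digits == "":
            break
        parts.append(int(digits))
    return tuple(parts)


def runtime_is_new_enough(runtime_version, saved_version):
    """True if the runtime major.minor is at least the major.minor used for saving."""
    return parse_version(runtime_version)[:2] >= parse_version(saved_version)[:2]


def check_runtime(saved_version="2.3.0"):
    """Raise early if the installed TensorFlow is older than the saving version.

    saved_version is an assumption; replace it with the version actually
    used to train and save the model.
    """
    import tensorflow as tf  # imported here so the helpers above work without TensorFlow

    if not runtime_is_new_enough(tf.__version__, saved_version):
        raise RuntimeError(
            "TensorFlow %s is older than %s, which was used to save the model"
            % (tf.__version__, saved_version)
        )
```

With these helpers, `runtime_is_new_enough("2.1.0", "2.3.0")` is false and `runtime_is_new_enough("2.8.0", "2.3.0")` is true, so under the 2.3.0 assumption the LCG_98 setup would still be flagged while the SWAN version would pass.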

Lorenzo

And to use 2.8, what is the source?
thank you again

My friends, thank you very much! My problem was solved :)

Thanks for answering my question! My problem has been resolved