How to get training history using TMVA Keras interface?

Dear TMVA users,

I am training a deep multiclass classifier using the Keras interface and I was wondering if there is an easy way to obtain the training history in order to create monitoring plots of e.g. loss as a function of epochs, something akin to keras.callbacks.History().

From what I can see here:

When Keras’ model.fit() method is called, it should return a History object. However, I’m not sure how to grab it, as I think this is all performed inside TMVA::Factory::TrainAllMethods(), which returns void.

Thanks in advance,

Josh

Hi Josh,

You can use the TensorBoard callback, as shown here.

Here’s a code example you can try with the example root/tutorials/tmva/keras/ClassificationKeras.py.

factory.BookMethod(dataloader, TMVA.Types.kPyKeras, 'PyKeras',
    'H:!V:VarTransform=D,G:FilenameModel=model.h5:NumEpochs=20:BatchSize=32:Tensorboard=./logs')
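
The option string passed to BookMethod is a colon-separated list of flags and key=value pairs. As a minimal sketch, it can be assembled in Python before booking the method. Note that tmva_options is a hypothetical helper written for this illustration, not part of TMVA, and the values are simply the ones from the call above.

```python
def tmva_options(flags, settings):
    """Join TMVA-style booking options into one colon-separated string.

    Hypothetical helper for illustration only; it is not part of TMVA.
    """
    parts = list(flags)
    parts += ["{}={}".format(key, value) for key, value in settings]
    return ":".join(parts)


# Same options as in the BookMethod call above.
options = tmva_options(
    flags=["H", "!V"],
    settings=[
        ("VarTransform", "D,G"),
        ("FilenameModel", "model.h5"),
        ("NumEpochs", 20),
        ("BatchSize", 32),
        ("Tensorboard", "./logs"),
    ],
)
print(options)
```

The resulting string can then be passed as the last argument of factory.BookMethod.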

The output should look like this:

Cheers
Stefan

Hi Stefan,

Thanks for the tip. I’ve updated keras and tensorflow as the PR suggests but it looks like I may need to update my version of ROOT/TMVA as the ‘BookMethod’ method in my current version has no ‘Tensorboard’ option. My current env. is from:

source /cvmfs/sft.cern.ch/lcg/views/LCG_91/x86_64-slc6-gcc62-opt/setup.sh

Cheers,

Josh

Hi Josh,

The LCG_93 stack is currently the most recent and includes this option. Just replace LCG_91 with LCG_93!

Cheers
Stefan

Hi,

I’ve tried this and I get the same error. Sourcing the LCG_93 build brings in ROOT 6.12/06. Is this the correct ROOT version?

Cheers,

Hi,

I was confident that it was in LCG_93 because the feature was merged in February, but you are right that it is not included, sorry for that. The release was tagged just a few weeks before the merge (see here).

Either source a stack from the nightly builds:

source /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-centos7-gcc62-opt/setup.sh

or source a more recent ROOT version following the notes here (or build it from source).

Cheers
Stefan

I tried my setup as before (Python 2.7 and the relevant package updates mentioned in the PR), this time sourcing the nightlies you suggested:

source /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-slc6-gcc62-opt/setup.sh

However I now see the following new error:

<WARNING>                : Failed to run python code: callbacks.append(keras.callbacks.TensorBoard(log_dir='./logs', histogram_freq=0, batch_size=batchSize, write_graph=True, write_grads=False, write_images=False))

as shown by the stack trace beneath:

  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/keras/callbacks.py", line 637, in __init__
    from tensorflow.contrib.tensorboard.plugins import projector
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/__init__.py", line 72, in <module>
    from tensorflow.contrib import tensor_forest
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/__init__.py", line 21, in <module>
    from tensorflow.contrib.tensor_forest.client import *
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/client/__init__.py", line 22, in <module>
    from tensorflow.contrib.tensor_forest.client import random_forest
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/client/random_forest.py", line 28, in <module>
    from tensorflow.contrib.tensor_forest.python import tensor_forest
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/python/__init__.py", line 21, in <module>
    from tensorflow.contrib.tensor_forest.python import tensor_forest
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/python/tensor_forest.py", line 29, in <module>
    from tensorflow.contrib.tensor_forest.python.ops import data_ops
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/python/ops/data_ops.py", line 20, in <module>
    from tensorflow.contrib.tensor_forest.python.ops import tensor_forest_ops
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/ROOT.py", line 461, in _importhook
    return _orig_ihook( name, *args, **kwds )
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/python/ops/tensor_forest_ops.py", line 28, in <module>
    resource_loader.get_path_to_datafile('_tensor_forest_ops.so'))
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/util/loader.py", line 55, in load_op_library
    ret = load_library.load_op_library(path)
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Thu/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in __exit__
    c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: libforestprotos.so: cannot open shared object file: No such file or directory
<FATAL>                         : Failed to setup training callback: TensorBoard
***> abort program execution
Traceback (most recent call last):
  File "train_DNN.py", line 180, in <module>
    main()
  File "train_DNN.py", line 175, in main
    factory.TrainAllMethods()
Exception: void TMVA::Factory::TrainAllMethods() =>
    FATAL error (C++ exception of type runtime_error)

Unfortunately, there is not much we can do at this point. libforestprotos.so is part of tensorflow/contrib/tensor_forest, which is not built with the LCG release so far. I’ve forwarded the problem to the person responsible for the builds.

You can always install user-specific packages with pip install --user PACKAGE. Doing this, pip places libforestprotos.so in $HOME/.local/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest, but it is built against the wrong libc/protobuf version, so we can’t use it. I’ve also tried on CentOS 7 systems (lxplus7.cern.ch), which face the same problem.

Ok, thanks for all your help! Do you have a link to an error report or anything so I can follow the progress?

The person building the LCG releases proposed a solution. The library is actually there, but it is not correctly linked. The issue should be fixed in the nightlies within the next few days. For now, it is enough to add the correct path to LD_LIBRARY_PATH like this:

source /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-slc6-gcc62-opt/setup.sh
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorflow/contrib/tensor_forest/

With this, I could run the example with the TensorBoard callback on lxplus.cern.ch.
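
To check whether the export took effect, a small diagnostic like the following can help. It is only a sketch: find_shared_library is a hypothetical helper, and it looks only at the directories listed in LD_LIBRARY_PATH. The dynamic linker also consults other locations, so an empty result does not prove that the library cannot be loaded.

```python
import os


def find_shared_library(name, search_path):
    """Return the directories of a path-separated search path that contain `name`.

    Hypothetical helper for illustration only.
    """
    hits = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries such as those produced by a trailing separator.
        if directory and os.path.isfile(os.path.join(directory, name)):
            hits.append(directory)
    return hits


hits = find_shared_library("libforestprotos.so",
                           os.environ.get("LD_LIBRARY_PATH", ""))
if hits:
    print("Found in: " + ", ".join(hits))
else:
    print("libforestprotos.so is not in any LD_LIBRARY_PATH directory")
```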

OK, thanks, that worked for me as well. However, when trying to run TensorBoard to visualise the data, it seems some plugins are missing from the nightlies’ Python 2.7:

  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Fri/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorboard/default.py", line 35, in <module>
    from tensorboard.plugins.audio import audio_plugin
  File "/cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/Fri/x86_64-slc6-gcc62-opt/lib/python2.7/site-packages/tensorboard/plugins/audio/audio_plugin.py", line 23, in <module>
    from werkzeug import wrappers
ImportError: No module named werkzeug

It does work for me:

[swunsch@lxplus036 ~]$ source /cvmfs/sft-nightlies.cern.ch/lcg/views/dev3/latest/x86_64-slc6-gcc62-opt/setup.sh
[swunsch@lxplus036 ~]$ python
Python 2.7.13 (default, Dec  5 2017, 19:29:24) 
[GCC 6.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import werkzeug
>>> werkzeug.__version__
'0.14.1'
>>> 

Here is the output retrieved remotely from lxplus (see the URL):

http://ekpwww.etp.kit.edu/~wunsch/misc/root_forum_lxplus_tensorboard.png

We just discovered that werkzeug is actually not in the LCG release. It will be included in the nightlies from tomorrow. In the meantime, you can install it locally with pip install --user werkzeug, which Python picks up automatically (even without setting PYTHONPATH!).
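
A quick way to see which Python modules are missing from an environment before starting TensorBoard is to try importing them. This is a minimal sketch: missing_modules is a hypothetical helper, and the list of names is only an example.

```python
import importlib


def missing_modules(names):
    """Return the subset of `names` that cannot be imported.

    Hypothetical helper for illustration only.
    """
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


# werkzeug is the module reported as missing in the traceback above.
print(missing_modules(["werkzeug"]))
```

Anything reported as missing can then be installed with pip install --user.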

Amazing! Everything is working perfectly and I can retrieve the output remotely via the browser.

Thanks a lot,

Josh

Hi @swunsch ,

Since this is related to this topic, I thought I would simply continue this thread. With the setup above, I get plots in the “scalars” and “graphs” dashboards. However, I was wondering if there is a way to save the histogram data to my event files, as required for plots in the “histogram” dashboard?

Thanks in advance,

Josh

Hi Josh,

This is not supported by the wrapper because it has a lot of implications, such as slowing down the training quite significantly (depending on the parameters). The keras.callbacks.TensorBoard object is therefore set up with histogram_freq=0 (histograms of weights/activations disabled).

Cheers
Stefan

Hi @swunsch,

Could you put me in touch with the person building the LCG releases, please? I have another issue with TensorFlow / LCG on lxplus, but I’m not sure this is a ROOT / TMVA issue anymore.

Thanks a lot,

Josh

Check your CERN Mattermost!