[Fix] TMVA SOFIE ONNX Parser missing in ROOT 6.36 build on Ubuntu 24.04 (WSL2)

SHLOK_SAXENA · December 8, 2025, 5:14pm

Hi everyone,

I wanted to share a solution for a build issue I encountered while compiling ROOT v6.36 (latest source) on Ubuntu 24.04 (WSL2).

The Issue

Even after installing protobuf-compiler and libprotobuf-dev, and enabling -Dtmva-sofie=ON, the build would complete successfully, but the RModelParser_ONNX class was missing from the TMVA::Experimental::SOFIE namespace.

When running the SOFIE ONNX tutorials, this resulted in the compiler error: error: no type named 'RModelParser_ONNX' in namespace 'TMVA::Experimental::SOFIE'

The Cause

It appears that CMake fails to correctly propagate the Protobuf include/library paths to the internal TMVA build flags in this environment. Although CMake logs “Found Protobuf,” the internal preprocessor guards likely fail, causing the ONNX implementation to be silently skipped during the compilation of libTMVA.so.

The Solution:
I successfully fixed this by explicitly passing the library paths to CMake to force detection. If anyone else faces this on a modern Linux distro, here is the working configuration:

cmake ../root_src
-Dtmva-sofie=ON
-DProtobuf_LIBRARY=/usr/lib/x86_64-linux-gnu/libprotobuf.so
-DProtobuf_INCLUDE_DIR=/usr/include

(Verify your specific libprotobuf path using find /usr -name libprotobuf.so if it differs).

After a clean rebuild (make -j4), the ONNX parser is correctly linked and the tutorials run perfectly.

Hope this saves some time for others!

Best, Shlok.

mczurylo · December 9, 2025, 1:59pm

Hi @SHLOK_SAXENA,

thank you for your question, maybe @moneta could help here?

Cheers,

Marta

moneta · December 9, 2025, 5:11pm

Hi Shlok,

Thank you for posting the solution! I don’t know what could be causing cmake not to find protobuf automatically; I would need to see the log file, but it is great you posted the solution which works

Best,

Lorenzo

SHLOK_SAXENA · December 9, 2025, 6:02pm

Hi Lorenzo,

Thank you for the response!

I checked my build logs. Since I am using a newer version of CMake (on Ubuntu 24.04), it generated a CMakeConfigureLog.yaml instead of the traditional output/error logs. I attempted to attach the file, but the forum is currently restricting attachments for my new account.

However, I can confirm that manually passing the paths forced the correct configuration. The CMakeCache now explicitly shows:

Protobuf_INCLUDE_DIR:PATH=/usr/include Protobuf_LIBRARY_RELEASE:FILEPATH=/usr/lib/x86_64-linux-gnu/libprotobuf.so

On a separate note: While exploring the source code to fix this, I noticed that the current ONNX implementation lacks support for Transformer operators (like MultiHeadAttention, GELU, and LayerNormalization). I know these are becoming standard in HEP for tasks like particle tagging (ParT).

I would be very interested in helping to implement these. Do you think adding Transformer support would be a suitable topic for a GSoC project this coming year? I would love to work on it.

Best,

Shlok

moneta · December 9, 2025, 6:10pm

Hi,

I will try to check with the new ububuntu and see if I can reproduce the error.

For the implementation of new operators, we support Transformer models, like the ATLAS_GN2 model or CaloDiT2 for diffusion, but the Transformer block (the Attention layer) is impelmented as a series of several ONNX operators. We could implement directly the MultiHead Attention, if needed.
Normally, when exporting from PyTorch to ONNX you get a series of lower level ONNX operators.
LayerNormalization is supported, but probably in 6.38. GELU can also easly be added if needed.
Yes, sure adding new operators could be a suitable GSOC project, you are welcomed to apply for it.

Thank you for the interest!

Best,

Lorenzo

SHLOK_SAXENA · December 9, 2025, 6:52pm

Hi,

Thank you for the detailed clarification and the encouragement!

That makes perfect sense regarding the current decomposition approach. However, I suspect that implementing a dedicated (fused) MultiHeadAttention operator would offer significant performance benefits over running it as a series of smaller operations (MatMul, Softmax, etc.).

A fused kernel could reduce memory access overhead and generate cleaner C++ code for modern architectures like CaloDiT2.

Since LayerNormalization is already planned for 6.38, I would propose focusing my GSoC project on:

Implementing the fused MultiHeadAttention operator for high-performance inference.
Adding GELU (and potentially other missing activations like Einsum if needed).
Benchmarking the new “Fused” implementation against the current “Decomposed” approach to quantify the speedup.

I will start studying the RModelParser_ONNX source code to familiarize myself with the operator registration system. If there are any specific files or benchmarks you recommend I look at first, I would appreciate the guidance!

Best,

Shlok

system · December 23, 2025, 6:52pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.