Use of a torch model inside a ROOT RDataFrame class-functor

Dear ROOT experts,

I am trying to use a torch model and evaluate the response using the Torch C++ API.

Everything works fine in a simple C++ executable doing:

    #include <torch/script.h>
    #include <iostream>
    #include <vector>

    torch::jit::Module module = torch::jit::load("mymodel.pt");
    std::vector<float> input_data = { .. , .., .., .. };
    torch::Tensor input_tensor = torch::from_blob(
        input_data.data(),
        {1, 4},
        torch::kFloat32
    );

    std::cout << "Input tensor:\n" << input_tensor << std::endl;

    // Execute the model with gradients turned off
    at::AutoGradMode guard(false);
    auto output = module.forward({input_tensor}).toTensor();

    // Output the result
    std::cout << "Output: " << output.squeeze() << std::endl;

Now I am trying to move the module inside an RDataFrame class with an operator() function.

When I do so, I get conflicts: including

#include <torch/torch.h>

#include <torch/script.h>

produces a clash on the ClassDef macro that ROOT_GENERATE_DICTIONARY uses.

Is there a workaround for this?
This is the error I see:

/opt/homebrew/lib/python3.12/site-packages/torch/include/torch/csrc/jit/frontend/tree_views.h:460:53: error: call to non-static member function without an object argument
  explicit ClassDef(const TreeRef& tree) : TreeView(tree) {
                                                    ^~~~
/opt/homebrew/lib/python3.12/site-packages/torch/include/torch/csrc/jit/frontend/tree_views.h:463:35: error: too few arguments provided to function-like macro invocation
  explicit ClassDef(TreeRef&& tree) : TreeView(std::move(tree)) {

Opening the file, it seems the issue is that both ROOT and torch::jit use the ClassDef name.
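
For context, here is a minimal sketch of this kind of collision; the macro body and struct below are simplified stand-ins for illustration, not the actual ROOT/Torch definitions:

// Stand-in for ROOT's function-like macro from Rtypes.h
#define ClassDef(name, id) /* dictionary boilerplate */

// A third-party header declaring a type with the same name then breaks:
struct ClassDef {              // fine: "ClassDef" here is not followed by '('
    explicit ClassDef(int v);  // error: expanded as a macro call with too few arguments
};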

Best,
Renato

Hello @rquaglia,
@vpadulan or @mczurylo can probably help you with this.

Thank you. I did something which I am not sure is safe: I opened the locally installed tree_views.h in the torch include folder and replaced every ClassDef with ClassDefTorch, and my local code can now compile…

I have no idea if what I did is legal within the Torch JIT; I will report back if I see some misbehaviour from this change on a header file installed on the system.

Hi Renato,

Torch+RDF in C++: nice mix.

Thanks for sharing the workaround. Could you also share the piece of C++ code that reproduces the conflict?

Let me try some guesswork: could you include the ROOT headers before the torch ones, and wrap the two torch headers in this construct?

#pragma push_macro("ClassDef")
#undef ClassDef
#include <torch/torch.h>
#include <torch/script.h>
#pragma pop_macro("ClassDef")

I hope this helps. Let us know how your inference with RDF+torch goes!

Cheers,
D

Hi @Danilo, I have not tested it inside the full code yet, but I think a simplification of the code is the following.

I can confirm your trick on ClassDef works and the code compiles successfully. I have not yet tested the functionality, but it should work. Any suggestion to make the variable computation faster is welcome, if any expert on torch in C++ reads this.

Also, there is good documentation in PyTorch on how to “trace” a model and dump it into a script loadable from C++, which I used upfront; it looks like this:

    import numpy as np
    import torch

    example_input = torch.from_numpy(
        np.array([[ .... example input list ]], dtype=np.float32)
    ).float()
    pred_val = model(example_input).squeeze().numpy()
    print(pred_val)
    # trace the model and dump it to a TorchScript file
    traced_model = torch.jit.trace(model, example_input)
    torch.jit.save(traced_model, "mymodel.pt")

Then I coded a class doing something like this:

#pragma once
#include <map>
#include <string>
#include <vector>
#include <ROOT/RDataFrame.hxx>
#include "ROOT/RVec.hxx"
#pragma push_macro("ClassDef")
#undef ClassDef
#include <torch/torch.h>
#include <torch/script.h>
#pragma pop_macro("ClassDef")

class MyModelAttacher {
public:
    MyModelAttacher() = default;
    MyModelAttacher(const std::string& path_model) {
        model = torch::jit::load(path_model);
    }
    double operator()( const .... input columns ) {
        // Run the model with gradients turned off
        at::AutoGradMode guard(false);
        std::vector<float> input_data = { ... from columns };
        torch::Tensor input_tensor = torch::from_blob(
            input_data.data(),
            {1, nVariablesModel}, // has to match the model input size
            torch::kFloat32
        );
        auto output = model.forward({input_tensor}).toTensor();
        double result = output.squeeze().item<double>();
        return result;
    }
private:
    torch::jit::Module model; //! transient: ROOT ignores this member
};
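
For completeness, a minimal sketch of how such a functor plugs into an RDataFrame Define; the tree name, file name, column names and the four-float operator() signature below are hypothetical placeholders, not my actual analysis:

#include <ROOT/RDataFrame.hxx>
// #include "MyModelAttacher.h" // the header above, with e.g.
// double operator()(float x1, float x2, float x3, float x4)

int main() {
    ROOT::RDataFrame df("tree", "input.root");
    MyModelAttacher attacher("mymodel.pt");
    // Evaluate the model once per event on the four input columns
    auto df2 = df.Define("mva_response", attacher, {"x1", "x2", "x3", "x4"});
    auto h = df2.Histo1D("mva_response");
    h->Draw();
    return 0;
}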

I needed some dirty hacks in the CMakeLists.txt of my project on macOS to find the torch library, but basically I followed this link.

I don’t know if that makes sense at all, but having Python-trained MVA models “Define-able” within RDataFrame operations sounds attractive (at least to me), and I don’t have to do nasty gymnastics in Python with numba declarations etc…

Hi Renato,

Happy to see the pragmas worked for you.
I think high-performance inference from C++ is a common problem in HEP, and not limited to RDF. The use case of running a network on inputs read from existing or defined columns in an analysis is more than legitimate and quite interesting - let us know how it goes for you.

Cheers,
D

Hi
To evaluate a model in C++ from RDataFrame you can also use SOFIE; see root/tmva/sofie/README.md at master · root-project/root · GitHub

You can find examples of using SOFIE in the TMVA tutorials that ship with ROOT.

You might need to save your PyTorch model in ONNX format, which is supported by PyTorch. SOFIE also supports native PyTorch input models, but that support is limited, and it is recommended to use an ONNX model as input.
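
To give an idea, here is a rough sketch of the SOFIE workflow along the lines of the README and tutorials; the model and file names are placeholders, and the exact API may differ between ROOT versions:

// Step 1 (run once, e.g. in a ROOT macro): parse the ONNX file
// and generate the inference code plus a weight file
TMVA::Experimental::SOFIE::RModelParser_ONNX parser;
TMVA::Experimental::SOFIE::RModel model = parser.Parse("mymodel.onnx");
model.Generate();
model.OutputGenerated("mymodel.hxx"); // also writes the weights to mymodel.dat

// Step 2 (in your analysis code): include the generated header and infer
// #include "mymodel.hxx"
// TMVA_SOFIE_mymodel::Session session("mymodel.dat");
// std::vector<float> input = { /* flattened input values */ };
// std::vector<float> output = session.infer(input.data());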

Lorenzo