TMVA model evaluation in RDataFrames from RVecs

Dear RDF+TMVA experts,

I am trying to evaluate a TMVA model in an RDataFrame. I have managed to load and evaluate the model for one-dimensional inputs, e.g. as follows:

RDataFrame df(...);
auto df2 = df.Define("f0_0", "f0[0]")
              ...
              Define("f12_0", "f12[0]");

RReader model("...");
auto df3 = df2.Define("model_out", Compute<13, float>(model), {"f0_0", ..., "f12_0"});

In my use case the f features belong to a vectorized input, e.g. a list of muons, and are all of type VecOps::RVec<float>. So in the example above I always just evaluated the first entry (i.e. the first muon), which works fine.

I am now trying to extend this to evaluate my model for the full vector of features, i.e. I would like my model_out branch to be again a VecOps::RVec<float> type. Is there a way to do this?

I also tried doing this with RDF::RNode, but I am not sure how to properly wrap the Compute and RReader calls into that. Here is my failed attempt:

RNode evaluate_mva(RNode df, 
    const std::string &out_name,
    const std::string &model_file,
    const std::string &f0,
    ...
    const std::string &f12) {

    RReader model(model_file);
    auto evaluator = Compute<13, float>(model);
    auto o = df.Define(out_name, 

        [evaluator](floats &f0, ..., floats &f12) {

            floats values;
            for(int i=0; i<f0.size(); ++i)
            {
                float v = evaluator(f0[i], ..., f12[i]);
                values.push_back(v);
            }
            return values;
        }, {f0, ..., f12});
    return o;
}

Thanks already in advance,
Cheers,
Jan

Hi @vdlinden ,

and welcome to the ROOT forum!
Let me make sure I understand the setup: for every row in the TTree dataset you have 13 one-dimensional arrays of equal size N, and you want to perform inference so that you get another array of size N out?

Is the model a neural network, BDT trees, or something else? Is it stored as an XML file?

Cheers,
Enrico

Dear Enrico,

apologies for the sparseness of my initial post.

I have 13 arrays in my TTree dataset, which for each TTree entry share a common length N; this N can vary between entries.

The model I am trying to evaluate is originally an xgboost BDT .bin model that I converted to an XML file using this GitHub gist [*].

In the end I want an output of size N, so yes, your summary was basically correct.

Cheers,
Jan

[*] I only have the trained .bin models available, but as far as I know I can't directly load/evaluate them in TMVA+RDF, so I converted them into XML files. If there is a way around that, that would also be great, but maybe that is an issue for another thread.

Alright, then the code you posted looks ok to me – I could not spot any obvious issues. What error do you get exactly?

Cheers,
Enrico

The error I get during compilation is a bit cryptic to me; it appears to be some problem with the types of the variables passed to the Compute function:

error: no match for call to '(const TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12>, float, TMVA::Experimental::RReader&>) (__gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&, __gnu_cxx::__alloc_traits<ROOT::Detail::VecOps::RAdoptAllocator<float>, float>::value_type&)'
  126 |              float v = evaluator(f0[i], f1[i], f2[i], f3[i], f4[i], f5[i], f6[i], f7[i], f8[i], f9[i], f10[i], f11[i], f12[i]);
      |                                                                                                                              ^

...
[in] /root/6.24.07-bf41b0420bc269850b74e23486e2953a/include/TMVA/RInferenceUtils.hxx:24:9: note: candidate: 'decltype (((TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>*)this)->TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::fFunc.Compute({TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::operator()::args ...})) TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::operator()(TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::AlwaysT<N>...) [with long unsigned int ...N = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12}; T = float; F = TMVA::Experimental::RReader&; decltype (((TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>*)this)->TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::fFunc.Compute({TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, _Idx ...>, T, F>::operator()::args ...})) = std::vector<float, std::allocator<float> >]' (near match)
   24 |    auto operator()(AlwaysT<N>... args) -> decltype(fFunc.Compute({args...})) { return fFunc.Compute({args...}); }
      |         ^~~~~~~~
[in] /root/6.24.07-bf41b0420bc269850b74e23486e2953a/include/TMVA/RInferenceUtils.hxx:24:9: note:   passing 'const TMVA::Experimental::Internal::ComputeHelper<std::integer_sequence<long unsigned int, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12>, float, TMVA::Experimental::RReader&>*' as 'this' argument discards qualifiers

For reference I am using ROOT 6.24/07.
I include the following headers and declarations in my script:

#include "ROOT/RDataFrame.hxx"
#include "ROOT/RVec.hxx"
#include "TChain.h"
#include "Math/Vector4D.h"
using namespace ROOT;
using namespace ROOT::VecOps;
using RNode = ROOT::RDF::RNode;
#include "TMVA/RReader.hxx"
#include "TMVA/RInferenceUtils.hxx"
using namespace TMVA::Experimental;
#include "correction.h"
using correction::CorrectionSet;
#include <iostream>
#include <fstream>
using doubles = ROOT::VecOps::RVec<Double_t>;
using floats = ROOT::VecOps::RVec<float>;
using bools = ROOT::VecOps::RVec<Bool_t>;
using ints = ROOT::VecOps::RVec<Int_t>;
using chars = ROOT::VecOps::RVec<unsigned char>;

And the command to compile is

g++ $(root-config --cflags --ldflags --libs) -lMLP -lMinuit -lTreePlayer -lTMVA -lTMVAGui -lXMLIO  -lMLP -lm

Ah, I see! The compilation error you get is caused by a C++ quirk: [evaluator] captures the variable by value in the lambda, and by default variables captured by value are const. Calling evaluator(f0[i], ...) then errors out because it cannot be called on a const evaluator.

If you declare the lambda as mutable like this:

[evaluator](floats &f0, ..., floats &f12) mutable {

the compilation error should go away.
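
To see the quirk in isolation, here is a minimal standalone sketch (nothing TMVA-specific; Counter is just a made-up struct with a non-const member function, standing in for the evaluator):

#include <iostream>

struct Counter {
   int n = 0;
   int next() { return ++n; } // non-const member function, like the evaluator's call operator
};

int main()
{
   Counter c;
   // auto broken = [c] { return c.next(); };          // error: the by-value copy of c is const
   auto works = [c]() mutable { return c.next(); };    // ok: mutable lets the lambda modify its copy
   std::cout << works() << std::endl;                  // prints 1
   return 0;
}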

I’d like to suggest a possibly simpler approach; I’ll try to type it out tomorrow :slight_smile:
@moneta might also have suggestions.

Cheers,
Enrico

Dear Enrico,

thank you, that seems to at least resolve that compilation issue!
After that another issue popped up with the output type of the evaluator call:

error: cannot convert 'std::vector<float, std::allocator<float> >' to 'float' in initialization
  134 |                 float v = evaluator(f0[i], ...);
      |                           ~~~~~~~~~^~~~~~~~~~~
      |                                    |
      |                                    std::vector<float, std::allocator<float> >

I don't really understand why this is not a simple float type.

I changed the type of v to std::vector<float, std::allocator<float> > for testing, but now I get another error, related to the evaluation in that function:

RDataFrame::Run: event loop was interrupted
terminate called after throwing an instance of 'std::runtime_error'
  what():  Size of input vector is not equal to number of variables.

This is a bit confusing, as the single evaluation via

Define("model_out", Compute<13, float>(model), {"f0_0",...});

still works without problems and also produces a float-typed column in my RDF.

Besides these issues I am of course always open to simpler / more efficient solutions :wink:

Cheers,
Jan

Hi Jan,

I would need a self-contained reproducer to debug this last issue.

As for the simpler solution: for evaluating xgboost models you can use RBDT, as shown in the tmva102_Testing.py tutorial (the input file is created in tmva101_Training.py). If you can share an example input file that contains a trained xgboost model, I can try to cook up an example that pairs RBDT with RDataFrame.
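
Untested, but roughly the pairing could look like the following in C++, assuming the xgboost model was exported with TMVA::Experimental::SaveXGBoost under the key "myBDT" into model.root; the tree name and the two feature branches f0 and f1 are placeholders, and the exact RBDT API can differ between ROOT versions:

#include "ROOT/RDataFrame.hxx"
#include "ROOT/RVec.hxx"
#include "TMVA/RBDT.hxx"

using floats = ROOT::VecOps::RVec<float>;

int main()
{
   // Load the forest exported via SaveXGBoost; key and file name are placeholders.
   TMVA::Experimental::RBDT<> bdt("myBDT", "model.root");

   ROOT::RDataFrame df("tree", "dummy_tree.root");
   auto out = df.Define("model_out",
                        [&bdt](const floats &f0, const floats &f1) {
                           floats scores;
                           scores.reserve(f0.size());
                           for (std::size_t i = 0; i < f0.size(); ++i)
                              // RBDT::Compute takes the features of one candidate and
                              // returns the model outputs; [0] picks the single score.
                              scores.push_back(bdt.Compute({f0[i], f1[i]})[0]);
                           return scores;
                        },
                        {"f0", "f1"});
   out.Snapshot("tree", "mva_out.root");
   return 0;
}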

Cheers,
Enrico

Dear Enrico,

excuse the delayed reply. I prepared a self-contained example here [1]:

  • generate a dummy TTree with python3 make_tree.py; it just contains one vectorized branch f0 and its length n
  • two .cc files, run_rdf.cc and run_rdf_0.cc, which contain dummy examples of the evaluation of the XML model: the first tries to solve the problem at hand, while the _0 version evaluates only the first entry of f0, to cross-check that the model can be evaluated at all
  • I compile these scripts (as mentioned above) via
g++ $(root-config --cflags --ldflags --libs) -lMLP -lMinuit -lTreePlayer -lTMVA -lTMVAGui -lXMLIO  -lMLP -lm
  • running the RDF step then produces an mva_out.root file via
./a.out dummy_tree.root

The model is available as model.weights.xml, which I converted as described above. The original file is model.weights.bin, so if this problem can be solved using only the .bin file, that would be even better.

I use ROOT 6.24/07.

Thank you again for your help,
Cheers,
Jan

[1] CERNBox

Hi @vdlinden ,

sorry for the high latency, busy days! I will take a look as soon as possible.

Cheers,
Enrico

Hi @vdlinden ,

Thank you very much for the reproducer and for your patience.
run_rdf_0.cc compiles and runs fine (as expected, I guess).

The problem with run_rdf.cc is that RReader::Compute (which is invoked by the Compute helper, and whose result the helper returns) in general returns a std::vector. For your specific model you know that this vector has size 1, because the output of the model is a scalar, but the compiler does not, and the std::vector return type is what causes the compilation error:

run_rdf.cc:51:36: error: cannot convert ‘std::vector<float, std::allocator<float> >’ to ‘float’ in initialization
   51 |                 float v = evaluator(f0[i], f1[i], f2[i], f3[i], f4[i], f5[i], f6[i], f7[i], f8[i], f9[i], f10[i], f11[i], f12[i]);
      |                           ~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                    |
      |                                    std::vector<float, std::allocator<float> >

You should change:

float v = evaluator(...);

to e.g.

float v = evaluator(...)[0];

There is also a typo chars &f2 that should be floats &f2 in the signature of the lambda passed to Define.

With these fixed, the only remaining problem is a pesky lifetime issue. In auto evaluator = Compute<13, float>(model), evaluator internally references model, but model goes out of scope at the end of the evaluate_mva function, so when evaluator(...) is invoked it is using a model that has already been deleted, leading to problems. Possible workarounds for this latest issue are to either A. return the model together with the output dataframe from evaluate_mva, or B. move the body of evaluate_mva into main so that model remains in scope for the duration of the event loop.
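
A third variant in the same spirit as A (untested sketch, with only two of the 13 feature columns written out as placeholders, f0 and f12): let the lambda itself own the RReader via a std::shared_ptr and call RReader::Compute directly, so the model stays alive for as long as the computation graph needs it:

#include "ROOT/RDataFrame.hxx"
#include "ROOT/RVec.hxx"
#include "TMVA/RReader.hxx"
#include <memory>
#include <string>

using floats = ROOT::VecOps::RVec<float>;
using RNode = ROOT::RDF::RNode;

RNode evaluate_mva(RNode df, const std::string &out_name, const std::string &model_file,
                   const std::string &f0, const std::string &f12 /* ...other feature columns elided... */)
{
   // The shared_ptr keeps the RReader alive for as long as the lambda (and thus
   // the computation graph) holds on to it, so there is no dangling reference.
   auto model = std::make_shared<TMVA::Experimental::RReader>(model_file);
   return df.Define(out_name,
                    [model](const floats &v0, const floats &v12 /* ... */) {
                       floats out;
                       out.reserve(v0.size());
                       for (std::size_t i = 0; i < v0.size(); ++i)
                          // RReader::Compute takes all input features of one candidate and
                          // returns a std::vector<float>; [0] picks the scalar BDT score.
                          out.push_back(model->Compute({v0[i], v12[i] /* ... */})[0]);
                       return out;
                    },
                    {f0, f12 /* ... */});
}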

This should not be so complicated, and I'm looking into what we can do about it. Stay tuned :slight_smile:

Cheers,
Enrico

Dear Enrico,

thanks a lot for these further investigations.
After adding the [0] to the evaluator call I can successfully compile the script, but I get an error:

RDataFrame::Run: event loop was interrupted
terminate called after throwing an instance of 'std::runtime_error'
  what():  Size of input vector is not equal to number of variables.

For my understanding (as the error message isn't really detailed): is this related to the lifetime issue you mentioned, or to something else?

Cheers,
Jan

Hi,

yes, with a debugger one can see that the cause of that error is evaluator reading a bogus size for the model input layer. And it's reading a bogus size because model has gone out of scope.
