Evaluating MVA within Root Data Frame

ceballos · March 17, 2022, 4:17pm

Hello,

I’ve looked for information on this, but I couldn’t find an actual example. I am trying to evaluate a BDT output within RDataFrame (it could be a BDT or any other MVA method). I’ve already done the training with TMVA and I have the XML weight files.

How can I evaluate the BDT output from them? It’s not clear to me how this could be done.

Thanks, Guillelmo

eguiraud · March 17, 2022, 4:28pm

Hi @ceballos ,
basically you need a function or functor object that takes the column values as input and returns the classification output, let’s call it EvalBDT. Then you can use it as:

auto df_with_bdt_weight = df.Define("weight", EvalBDT, {"x", "y", "z"});

assuming x, y and z are your inference inputs.
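
A minimal sketch of what such a functor could look like, written in plain C++ so that it compiles without ROOT. The `EvalBDT` shown here is a hypothetical stand-in: its weights and the logistic formula are made up for illustration and are not a trained BDT. In a real analysis the body of `operator()` would call the actual model, and the parameter types are expected to match the column types.

```cpp
#include <cmath>

// Hypothetical stand-in for a trained classifier. RDataFrame only needs a
// callable with one parameter per input column that returns the score.
struct EvalBDT {
    // Illustrative weights, NOT taken from any real training.
    float wx = 0.5f, wy = -0.25f, wz = 1.0f;

    // Called once per event with the values of the columns x, y and z.
    float operator()(float x, float y, float z) const {
        const float s = wx * x + wy * y + wz * z;
        return 1.0f / (1.0f + std::exp(-s)); // squash the sum into (0, 1)
    }
};

// With ROOT available, the functor would be attached to a dataframe as:
//   auto df_with_bdt_weight = df.Define("weight", EvalBDT{}, {"x", "y", "z"});
```

Because `operator()` is a regular (non-template) member function, its argument types are visible to `Define`, which is what lets the call above work without spelling out the column types.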

TMVA has experimental interfaces to create such a functor object, probably @moneta can provide an example.

Cheers,
Enrico

moneta · March 17, 2022, 4:48pm

Hi,
yes, we have the tutorial tmva003_RReader.C, which is an example of how to evaluate a BDT using the new TMVA RReader class and RDataFrame.

Cheers

Lorenzo

ceballos · March 18, 2022, 11:02am

Hi Enrico, Lorenzo,

thanks a lot for the pointers, it worked! I am using PyROOT; it would probably be good to have the examples in both C++ and Python, if possible. For the record, let me write below what I did, following the example suggested by Lorenzo (*)

Thanks again, Guillelmo

(*)

This part is different w.r.t. the C++ version

ROOT.gInterpreter.ProcessLine('''
TMVA::Experimental::RReader model("bdt_BDTG_vbfinc_v0.weights.xml");
computeModel = TMVA::Experimental::Compute<13, float>(model);
''')

This is just to make sure which variables we are using

variables = ROOT.model.GetVariableNames()
print(variables)

This is the actual application

.Define("bdt_vbfinc", ROOT.computeModel, ROOT.model.GetVariableNames())

setesami · April 4, 2024, 7:32pm

Hello Enrico,

I have the same issue with evaluating the MVA weight within RDataFrame. My input variables are stored as vectors rather than as floats. In plain ROOT I can pass each element of the vector as a float inside the event loop to evaluate the MVA weight, but here there is no visible loop over events. Is there an example for this case?

Thanks
Mohsen
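
One possible approach, sketched here as an assumption rather than as an answer given in this thread: wrap the scalar model in a second callable that receives the whole vector for each event and forwards the elements it needs. RDataFrame invokes the callable once per event, so the wrapper plays the role of the body of the explicit event loop. `ToyModel`, `FirstThree`, the column name `inputs` and the sentinel value -999 are all hypothetical. The container type is a template parameter of the struct, so the same code is compiled here with `std::vector<float>` and should also work with `ROOT::RVec<float>` columns.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical scalar model standing in for the real MVA evaluation.
struct ToyModel {
    float operator()(float a, float b, float c) const { return a + 2.0f * b - c; }
};

// Adapter: takes one vector-valued column and passes its first three
// elements to a model that expects three separate floats.
template <typename Model, typename Vec>
struct FirstThree {
    Model model{};
    float missing = -999.0f; // returned when the vector is too short

    float operator()(const Vec &v) const {
        if (v.size() < std::size_t{3}) return missing;
        return model(v[0], v[1], v[2]);
    }
};

// With ROOT available, usage could look like this (column name assumed):
//   auto df2 = df.Define("mva", FirstThree<ToyModel, ROOT::RVec<float>>{}, {"inputs"});
```

An alternative is to define scalar columns first with string expressions such as `Define("x0", "inputs[0]")` and then pass those columns to the model unchanged.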