RReader crashes with TMVA::BDT xml file

Dear experts,

I would like to use RDataFrame to evaluate my MVA model. There is an amazing tutorial that shows how to apply models saved in TMVA XML files with the modern interfaces.

The example works perfectly, but when I try to use my own XML weights file, it crashes:

<FATAL>                          : The expression declared to the Reader needs to be checked (name or order are wrong)
***> abort program execution

My XML file is available on CERNBox.

PS. Are there any plans to add a PyROOT interface for the RReader?


ROOT Version: 6.19.01 (master branch)
Platform: MacOS
Compiler: Not Provided


Hi!

I’m happy that you found the new reader interface and that you like it! However, a disclaimer: it’s still in the experimental stage and should become available for production use with ROOT 6.20 (sometime around the end of the year).

But I’m going to look into the issue tomorrow and debug the cause :)

And yes, we plan to add nicer Python bindings that natively understand, for example, NumPy arrays.

Best
Stefan


Hi again!

There we go: https://github.com/root-project/root/pull/4137

You used expressions in the training, and the reader has to be booked with exactly the same expressions; otherwise the setup fails. That is a little strange, since the information about the expressions should not be of any interest to the reader (?). Unfortunately, I hadn’t thought of this in the current implementation.

Unless you compile ROOT yourself, the fix won’t reach you quickly, but the changes are only a few lines (see the PR above).

Or if you prefer the hacky fix, just change the XML config so that the TMVA::Reader does not complain anymore (see the changed Expression and Title fields):

  <Variables NVar="6">
    <Variable VarIndex="0" Expression="var1" Label="lep_0_p4_fast.Pt()" Title="var1" Unit="GeV" Internal="lep_0_p4_fast.Pt__" Type="F" Min="2.10000343e+01" Max="4.20785840e+03"/>
    <Variable VarIndex="1" Expression="var2" Label="lep_0_p4_fast.Eta()" Title="var2" Unit="" Internal="lep_0_p4_fast.Eta__" Type="F" Min="-2.49804258e+00" Max="2.49988317e+00"/>
    <Variable VarIndex="2" Expression="var3" Label="lep_0_p4_fast.Phi()" Title="var3" Unit="" Internal="lep_0_p4_fast.Phi__" Type="F" Min="-3.14159060e+00" Max="3.14158702e+00"/>
    <Variable VarIndex="3" Expression="var4" Label="fabs(lepmet_dphi)" Title="var4" Unit="" Internal="fabs_lepmet_dphi_" Type="F" Min="3.90124296e-06" Max="3.14159274e+00"/>
    <Variable VarIndex="4" Expression="var5" Label="met_reco_p4_fast.Et()" Title="var5" Unit="GeV" Internal="met_reco_p4_fast.Et__" Type="F" Min="4.71710302e-02" Max="4.66591406e+03"/>
    <Variable VarIndex="5" Expression="var6" Label="met_reco_p4_fast.Phi()" Title="var6" Unit="" Internal="met_reco_p4_fast.Phi__" Type="F" Min="-3.14158440e+00" Max="3.14158916e+00"/>
  </Variables>
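
If you would rather not edit the file by hand, the same hacky fix can be scripted. Below is a minimal sketch using only the Python standard library, not anything shipped with ROOT. The helper name `rename_variable_expressions` and the two-variable sample document are made up for illustration, and I am assuming the original `Expression` and `Title` values match the `Label` values shown above. Note that `xml.etree.ElementTree` drops XML comments when parsing, so keep a backup of the original weights file.

```python
import xml.etree.ElementTree as ET


def rename_variable_expressions(root):
    """Set Expression and Title of every <Variables>/<Variable> to var1, var2, ...

    The new name is derived from VarIndex (VarIndex 0 -> "var1"), following
    the pattern of the hand-edited XML above. Returns a dict mapping each
    original expression to its new placeholder name.
    """
    renamed = {}
    for variable in root.findall("Variables/Variable"):
        new_name = "var{}".format(int(variable.get("VarIndex")) + 1)
        renamed[variable.get("Expression")] = new_name
        variable.set("Expression", new_name)
        variable.set("Title", new_name)
    return renamed


# Tiny stand-in for a weights file (hypothetical, two variables only).
SAMPLE = """<MethodSetup Method="BDT::BDT">
  <Variables NVar="2">
    <Variable VarIndex="0" Expression="lep_0_p4_fast.Pt()" Label="lep_0_p4_fast.Pt()" Title="lep_0_p4_fast.Pt()" Unit="GeV" Internal="lep_0_p4_fast.Pt__" Type="F"/>
    <Variable VarIndex="1" Expression="fabs(lepmet_dphi)" Label="fabs(lepmet_dphi)" Title="fabs(lepmet_dphi)" Unit="" Internal="fabs_lepmet_dphi_" Type="F"/>
  </Variables>
</MethodSetup>"""

root = ET.fromstring(SAMPLE)
mapping = rename_variable_expressions(root)
patched_xml = ET.tostring(root, encoding="unicode")
print(mapping)
```

For a real file you would load it with `ET.parse(path)`, call the helper on `tree.getroot()`, and save the result with `tree.write(new_path)`. The returned mapping tells you which placeholder name stands for which original expression.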

Let me know if I can help you further!

Best
Stefan

And I should point out that the old TMVA::Reader is not thread-safe. Therefore, we put guards around the Reader::GetMVAValue calls so that you won’t see wrong results (or a segfault) at runtime. Unfortunately, this means that you cannot use multi-threading effectively.
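
To illustrate what such a guard does, here is a minimal sketch in plain Python. It is not the ROOT implementation: `NotThreadSafeModel` and `GuardedModel` are hypothetical stand-ins, and the lock plays the role of the guard around the evaluation call. Every call has to take the same lock, so the results stay correct, but the evaluations run one after another even when several threads are used.

```python
import threading


class NotThreadSafeModel:
    """Hypothetical stand-in for a reader that keeps shared mutable state."""

    def __init__(self):
        self._buffer = None

    def evaluate(self, inputs):
        # Shared buffer: concurrent callers could overwrite each other's
        # inputs between these two statements if no guard were in place.
        self._buffer = list(inputs)
        return sum(self._buffer)


class GuardedModel:
    """Wraps a model and serializes all evaluate() calls with a lock."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def evaluate(self, inputs):
        with self._lock:  # only one thread evaluates at a time
            return self._model.evaluate(inputs)


guarded = GuardedModel(NotThreadSafeModel())
results = {}


def worker(i):
    results[i] = guarded.evaluate([i, i, i])


threads = [threading.Thread(target=worker, args=(i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(results)
```

The trade-off is the one described above: the guard buys correctness at the price of parallelism in the evaluation step.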
