How to extract optimal cuts value

Dear Kim,

I am using the TMVA BDT method to discriminate between signal and background. I followed the instructions and it seems to be working well. Next, I want to extract the optimal cuts that correspond to the maximum of the BDT score, so I used the commands
factory->GetMethod("dataset", "BDT");
TMVA::MethodBase * method = dynamic_cast<TMVA::MethodBase *>(imethod);

Double_t cut = method->GetSignalReferenceCut();

Then, it gives me an XML file (called TMVAClassification_BDT.weights.xml) that contains around 100 sets of cut intervals. Could you please tell me how to get the set of cuts that corresponds to the maximum of the BDT score? Is there a command that automatically gives only the maximum BDT score and the corresponding cuts?

Also, although I get the output file “TMVAClassification_BDT.weights.xml” (attached: TMVAClassification_BDT.weights.xml.tar.gz, 98.1 KB), something seems to be crashing, since I get the following message at the end of the run.


 *** Break *** segmentation violation

===========================================================
There was a crash.
This is the entire stack trace of all threads:
===========================================================
#0  0x00007f99928c40fa in __GI___waitpid (pid=26803, stat_loc=stat_loc@entry=0x7ffdae8aba40, options=options@entry=0) at ../sysdeps/unix/sysv/linux/waitpid.c:29
#1  0x00007f999283cfcb in do_system (line=<optimized out>) at ../sysdeps/posix/system.c:148
#2  0x00007f99948591d2 in TUnixSystem::Exec (shellcmd=<optimized out>, this=0x9bf3f0) at /home/hilal/Work/root-6.18.02/core/unix/src/TUnixSystem.cxx:2106
#3  TUnixSystem::StackTrace (this=0x9bf3f0) at /home/hilal/Work/root-6.18.02/core/unix/src/TUnixSystem.cxx:2400
#4  0x00007f999485bac3 in TUnixSystem::DispatchSignals (this=0x9bf3f0, sig=kSigSegmentationViolation) at /home/hilal/Work/root-6.18.02/core/unix/src/TUnixSystem.cxx:3631
#5  <signal handler called>
#6  0x00007f9994bcc03e in TMVA::MethodBase::GetSignalReferenceCut() const () from /home/hilal/Work/CouplingProject/HiggsGBD-bbE/source/libHiggsGBD.so
#7  0x00007f9994bcb974 in TMVAClassification::addMethod() () from /home/hilal/Work/CouplingProject/HiggsGBD-bbE/source/libHiggsGBD.so
#8  0x000000000040141d in main ()
===========================================================

The lines below might hint at the cause of the crash.
You may get help by asking at the ROOT forum http://root.cern.ch/forum
Only if you are really convinced it is a bug in ROOT then please submit a
report at http://root.cern.ch/bugs Please post the ENTIRE stack trace
from above as an attachment in addition to anything else
that might help us fixing this issue.
===========================================================
#6  0x00007f9994bcc03e in TMVA::MethodBase::GetSignalReferenceCut() const () from /home/hilal/Work/CouplingProject/HiggsGBD-bbE/source/libHiggsGBD.so
#7  0x00007f9994bcb974 in TMVAClassification::addMethod() () from /home/hilal/Work/CouplingProject/HiggsGBD-bbE/source/libHiggsGBD.so

I would like to know whether GetMethod is the right option to use. If so, where am I making a mistake?

Best regards,

Hilal

@moneta can you help here or perhaps redirect to someone that can help?

I’m not sure @kialbert still answers questions here…

Hi,

I’m still around occasionally, although less than I used to be.

Hilal, the code you posted should work (assuming the missing part amounts to something like the following):

auto imethod = factory->GetMethod("dataset", "BDT");
auto method = dynamic_cast<TMVA::MethodBase *>(imethod);

The following line will then output the cut value of the classifier output where the signal efficiency equals the background rejection.

Double_t cut = method->GetSignalReferenceCut();
std::cout << "Optimal cut: " << cut << std::endl;

The .xml file is not the output you are looking for, rather that is the text form of the entire BDT.

The crash you are experiencing should not happen and could be because (one of) the following lines are missing:

factory->TrainAllMethods();
factory->TestAllMethods();
factory->EvaluateAllMethods();

If all those lines are present and you still experience the crash, I need to see more of the output and your code.

Cheers,
Kim

Dear Kim,

Thank you very much for your prompt reply.

The crash has disappeared now.

The command

Double_t cut = method->GetSignalReferenceCut();

gives the value (-0.017) at which the signal and background curves are equal in the BDT response plot (attached). I am now confused, since I understood that the optimal cut should correspond to the maximum of the signal curve (around 0.05 in the attached BDT response plot). Am I right?

I also want to ask whether there is a command that explicitly gives the cut intervals on the kinematic variables for a given value of the BDT response (whether that value corresponds to the maximum of the signal curve or to the point where signal and background are equal).

Cheers,

Hilal.

Hi,

Sorry, I must have been confused in my previous answer. I rechecked the code just now, and the reference cut corresponds to the point after which the signal probability is higher than background probability, so -0.017 in your attached plot.

This is meant as a quick scan for a reasonable value of the cut.

To find the cut corresponding to the peak of the signal curve, you can extract the TH1 distribution and use GetMaximumBin(). (The reference cut is calculated using this histogram.)

The histogram can be found in the output ROOT file. For a classifier named e.g. “myBDT”, the corresponding histogram will be found in outputfile.root:dataset/Method_BDT/myBDT/MVA_myBDT_S.

Cheers,
Kim