Extract BDT score of training and testing samples

Einsiedler · April 15, 2019, 12:44pm

Dear TMVA users,

I was wondering if there was any simple way to extract the BDT Score of the training and testing samples in TMVAClassification.C to a txt file (for example).

Thanks in advance

moneta · April 15, 2019, 3:11pm

Hi,

The values of the BDT score are stored in a ROOT file (“TMVA.root”). With a few lines of Python (or C++) code you should be able to extract the information and write it to a text file. If you need an example, I can provide one.

Lorenzo

Einsiedler · April 15, 2019, 3:26pm

Thank you Lorenzo, it looks like I can easily extract that information from the .root file resulting from the training. In the meantime, I also managed to apply the .class.C file along with the .xml to the samples I used, calling GetMvaValue, and I found the same values and histograms. Thank you very much for your suggestion.

anprodri · September 10, 2019, 3:36pm

Hi!

I’m really lost about this. Could you show me the script for reading the BDT score? I would really appreciate it.

Kind regards.

kialbert · September 10, 2019, 5:41pm

Hi,

I’ll provide a few simple examples to help you start out. The ROOT examples are for the interactive prompt but can easily be converted to stand-alone scripts.

This assumes you are looking at the tutorial TMVA.root (generated by running root -l TMVAClassification.C in your $ROOTSYS/tutorials/tmva directory). Modify it to fit your use case, e.g. change “BDTG” to the name of your classifier.

If you’re using ROOT directly:

root [0] reader = TTreeReader("dataset/TrainTree", TFile::Open("TMVA.root"));
root [1] reader_bdtg = TTreeReaderValue<Float_t>(reader, "BDTG");
root [2] reader.SetEntry(0);
root [3] *reader_bdtg.Get()
(float) 0.976866f

or,

root [0] reader = TTreeReader("dataset/TrainTree", TFile::Open("TMVA.root"));
root [1] reader_bdtg = TTreeReaderValue<Float_t>(reader, "BDTG");
root [2] while (reader.Next()) {std::cout << *reader_bdtg.Get() << std::endl;}
// Entire tree will be printed :)

If you’re using Python:

import ROOT
f = ROOT.TFile.Open('TMVA.root')
t = f.Get('dataset/TrainTree')

# Print a single entry (TTree uses GetEntry; SetEntry belongs to TTreeReader):
t.GetEntry(0)
print(t.BDTG)

# Print all entries:
for entry in t:
    print(entry.BDTG)
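
Since the original question was about getting the scores into a text file, here is a minimal sketch of how the Python loop above could be extended to do that. It rests on a few assumptions that are not confirmed in this thread: that the file also contains a dataset/TestTree, that both trees carry a classID branch next to the classifier output, and that a tab-separated layout is acceptable. The helper names (write_scores, read_scores, export_all) are made up for illustration.

```python
import csv


def write_scores(rows, out_path):
    """Write (class_id, score) pairs to a tab-separated text file.

    Returns the number of rows written.
    """
    n_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out, delimiter="\t", lineterminator="\n")
        writer.writerow(["classID", "score"])
        for class_id, score in rows:
            writer.writerow([int(class_id), f"{float(score):.6f}"])
            n_written += 1
    return n_written


def read_scores(root_path, tree_name, method="BDTG"):
    """Yield (classID, score) for every entry of a TMVA output tree.

    Assumption: the tree has a 'classID' branch and a branch named after
    the classifier (here 'BDTG').
    """
    import ROOT  # imported lazily so write_scores can be used without ROOT

    root_file = ROOT.TFile.Open(root_path)
    tree = root_file.Get(tree_name)
    for entry in tree:
        yield entry.classID, getattr(entry, method)
    root_file.Close()


def export_all(root_path="TMVA.root", method="BDTG"):
    """Dump the training and testing scores to one text file each."""
    for name in ("TrainTree", "TestTree"):  # TestTree is an assumption
        rows = read_scores(root_path, f"dataset/{name}", method)
        write_scores(rows, f"{name}_{method}.txt")
```

Calling export_all() in the directory that holds TMVA.root should then produce TrainTree_BDTG.txt and TestTree_BDTG.txt, provided the assumptions above hold for your file.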

Further references:

ROOT
- TTreeReader tutorial – check this first, the others on demand.
- TTree tutorials
- TTreeReader class reference
- ROOT Users’ Guide on reading TTrees
Python
- Reading a TTree with pyROOT – check this first, the others on demand.
- pyROOT tutorials
- ROOT Users’ Guide on reading TTrees with pyROOT
General
- ROOT tutorials

Cheers,
Kim

anprodri · September 10, 2019, 7:04pm

Thank you so much! It worked. Now, I was wondering if there’s a way of getting the Kolmogorov-Smirnov test as an output value in Python. Do you have any idea?

Kind regards.

kialbert · September 11, 2019, 2:30pm

TMVA uses TH1::KolmogorovTest to perform the test.

Usage, something like so:

TFile * file = TFile::Open("TMVA.root");
TString base_path = "dataset/Method_BDT/BDTG/";
TH1 * sig_test  = file->Get<TH1>(base_path + "MVA_BDTG_S");
TH1 * sig_train = file->Get<TH1>(base_path + "MVA_BDTG_Train_S");
Double_t kol_sig = sig_test->KolmogorovTest(sig_train, "X");

std::cout << "K-S score: " << kol_sig << std::endl;
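
Since the question asked for Python, the same call can be sketched with PyROOT roughly as follows. This is only a translation of the C++ lines above, assuming the tutorial layout (dataset/Method_BDT/BDTG/...); the helper names hist_paths and ks_signal are hypothetical, and the null-pointer check relies on PyROOT treating a missing object as false.

```python
def hist_paths(method_type="BDT", method_name="BDTG", dataset="dataset"):
    """Build the paths of the test and training signal histograms.

    Mirrors the C++ snippet: <dataset>/Method_<type>/<name>/MVA_<name>_S
    and .../MVA_<name>_Train_S.
    """
    base = f"{dataset}/Method_{method_type}/{method_name}"
    return (f"{base}/MVA_{method_name}_S",
            f"{base}/MVA_{method_name}_Train_S")


def ks_signal(root_path="TMVA.root", option="X", **path_kwargs):
    """Return TH1::KolmogorovTest for the signal test vs. training histograms."""
    import ROOT  # imported lazily; requires a working PyROOT installation

    test_path, train_path = hist_paths(**path_kwargs)
    root_file = ROOT.TFile.Open(root_path)
    sig_test = root_file.Get(test_path)
    sig_train = root_file.Get(train_path)
    if not sig_test or not sig_train:
        # Assumption: a missing histogram comes back as a null (false) object
        raise KeyError(f"could not find {test_path} or {train_path}")
    value = sig_test.KolmogorovTest(sig_train, option)
    root_file.Close()
    return value
```

With the tutorial file in the current directory, print("K-S score:", ks_signal()) should give the same kind of number as the C++ version.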

Cheers,
Kim

anprodri · September 13, 2019, 5:45pm

Thank you so much! It was really helpful.

Have a nice weekend!

anprodri · September 24, 2019, 12:59pm

Hi again! I have another question.

When I run the K-S test using:
sig_test->KolmogorovTest(sig_train, "X")
I assume this is the unbinned case, and considering this assumption, if the returned value is close to zero, would that mean that the distributions are similar?

kialbert · September 24, 2019, 5:01pm

Hi,

Just an FYI: We would prefer that you create a new topic when there is a solution provided in a previous one.

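Now: With the “X” option the function returns a probability between 0 and 1. A value close to 1 means the two distributions are compatible; a value close to 0 means they differ. Note that TH1::KolmogorovTest always works on the binned histograms; “X” makes it run pseudo-experiments, it does not switch to an unbinned test.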

You can test this yourself like so:

root [0] a = TH1F{"a", "a", 20, -10., 10.};
root [1] b = TH1F{"b", "b", 20, -10., 10.};
root [2] c = TH1F{"c", "c", 20, -10., 10.};
root [3] a.FillRandom("gaus", 10000);
root [4] b.FillRandom("gaus", 10000);
root [5] c.FillRandom("expo", 10000);
root [6] a.KolmogorovTest(&a, "X")
(double) 1.0000000
root [7] a.KolmogorovTest(&b, "X")
(double) 0.93600000
root [8] a.KolmogorovTest(&c, "X")
(double) 0.00000000
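
To get a feeling for what is being compared, the quantity underlying a Kolmogorov test is the largest gap between the two cumulative distributions. The pure-Python sketch below computes that gap for two lists of bin contents with identical binning. It is only an illustration of the idea, not a reimplementation of TH1::KolmogorovTest, which additionally converts the distance into a probability; the function name ks_distance is made up for this example.

```python
from itertools import accumulate


def ks_distance(counts_a, counts_b):
    """Largest absolute difference between two cumulative distributions.

    Both arguments are sequences of bin contents with the same binning.
    A distance of 0 means the normalised shapes agree bin by bin;
    a distance of 1 means they do not overlap at all.
    """
    if len(counts_a) != len(counts_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(counts_a), sum(counts_b)
    if total_a == 0 or total_b == 0:
        raise ValueError("histograms must not be empty")
    cdf_a = [c / total_a for c in accumulate(counts_a)]
    cdf_b = [c / total_b for c in accumulate(counts_b)]
    return max(abs(x - y) for x, y in zip(cdf_a, cdf_b))
```

A small distance corresponds to a returned probability near 1 (as for a versus b above), a large distance to a probability near 0 (as for a versus c).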

Cheers,
Kim

anprodri · September 26, 2019, 3:24pm

I understand now. Thanks a lot!

Have a nice week.