
Cross Validation results returning zero

Hey all,

I have been running TMVA through PyROOT for the past few months with no issues. However, when I run a cross validation, the results returned by TMVA.CrossValidation’s GetResults() function show zeros for the ROC curve integrals. Example code and output are shown below:

    # Add trees to dataloader
    dataloader.AddSignalTree(signal,1.0)
    dataloader.AddBackgroundTree(background,1.0)

    dataloader.PrepareTrainingAndTestTree(
        ROOT.TCut(''),
        "nTest_Signal=1:nTest_Background=1:SplitMode=Random:NormMode=EqualNumEvents")  # CV

    dataloader.SetWeightExpression( "(xSection*lumi*weight_mc)/runningWeightSum" )

    # cross validation
    cv = ROOT.TMVA.CrossValidation('CrossValidation',dataloader,'')


    cv.BookMethod(ROOT.TMVA.Types.kPyGTB, 'GTB','')
    cv.SetNumFolds(2)
    cv.Evaluate()
    cv_results = cv.GetResults()
    for result in cv_results:
        result.Print()

Output:

GTB                      : [dataset_pymva] : Loop over test events and fill histograms with classifier response...
                         : 
                         : 
                         : Evaluation results ranked by best signal efficiency and purity (area)
                         : -------------------------------------------------------------------------------------------------------------------
                         : DataSet       MVA                       
                         : Name:         Method:          ROC-integ
                         : dataset_pymva GTB            : 0.643
                         : -------------------------------------------------------------------------------------------------------------------
                         : 
                         : Testing efficiency compared to training efficiency (overtraining check)
                         : -------------------------------------------------------------------------------------------------------------------
                         : DataSet              MVA              Signal efficiency: from test sample (from training sample) 
                         : Name:                Method:          @B=0.01             @B=0.10            @B=0.30   
                         : -------------------------------------------------------------------------------------------------------------------
                         : dataset_pymva        GTB            : 0.028 (0.028)       0.232 (0.232)      0.494 (0.494)
                         : -------------------------------------------------------------------------------------------------------------------
                         : 
Factory                  : Thank you for using TMVA!
                         : For citation information, please visit: http://tmva.sf.net/citeTMVA.html
                         : Evaluation done.

CrossValidation          :  ==== Results ====
                         : Fold  0 ROC-Int : 0.0000
                         : Fold  1 ROC-Int : 0.0000
                         : ------------------------
                         : Average ROC-Int : 0.0000
                         : Std-Dev ROC-Int : 0.0000

Regards,

Jake

@swunsch can you take a look please?

Thank you,
Oksana.

Hi,

Sorry for the late reply. Unfortunately I cannot spot the issue, and @moneta, who is the CV expert, is on vacation. If he cannot spot it either, we can only help with a reproducible script!

Best
Stefan

Hi

Can you please post your full macro and your data so I can reproduce the issue? Looking just at the code, it seems fine.

Thank you

Lorenzo
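
Until a full macro is available, one way to narrow this down might be to cross-check the ROC integral independently of TMVA, since the Factory summary above reports 0.643 while the per-fold CrossValidation results report 0.0000. The sketch below is plain Python and not part of TMVA: `roc_integral` is a hypothetical helper name, and it assumes you have already extracted the classifier response for signal and background events of one fold into two lists. It ignores event weights, so it will not match TMVA's number exactly when a weight expression is set, but it should show whether the responses themselves carry any separation.

```python
from bisect import bisect_left, bisect_right


def roc_integral(signal_scores, background_scores):
    """Unweighted ROC integral (area under the ROC curve).

    Computed as the probability that a randomly chosen signal event
    scores higher than a randomly chosen background event, with ties
    counted as one half. Hypothetical helper, not a TMVA function.
    """
    n_sig = len(signal_scores)
    n_bkg = len(background_scores)
    if n_sig == 0 or n_bkg == 0:
        raise ValueError("need at least one signal and one background score")

    bkg_sorted = sorted(background_scores)
    wins = 0.0
    for score in signal_scores:
        # Number of background events scoring strictly below this signal event
        below = bisect_left(bkg_sorted, score)
        # Number of background events with exactly the same score
        ties = bisect_right(bkg_sorted, score) - below
        wins += below + 0.5 * ties
    return wins / (n_sig * n_bkg)


if __name__ == "__main__":
    # Toy responses, made up purely for illustration
    toy_signal = [0.9, 0.8, 0.4]
    toy_background = [0.1, 0.5, 0.3]
    print(f"ROC integral: {roc_integral(toy_signal, toy_background):.4f}")
```

A value near 0.5 would mean the responses do not separate signal from background at all, while a value clearly above 0.5 for a fold that TMVA reports as 0.0000 would point at how the fold results are filled rather than at the training itself.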