Dataloader GetCorrelationMatrix

Hi,

I'm trying to call

h2D = dataloader.GetCorrelationMatrix("Signal")

to use the TMVA functionality just to draw a correlation matrix.

At the moment this call returns a null pointer for me. Here is the relevant part of my script:

dataloader = TMVA.DataLoader(weightPath+"HighMass/dataset_"+weightfilename+"")
for (varname, vartype) in trained_variables:
    dataloader.AddVariable(varname, vartype)
# load signal
dataloader.AddSignalTree(signal["NominalFixed"], 1.0)  # 2nd argument is a weight
# load backgrounds
for bkg in backgrounds:
    dataloader.AddBackgroundTree(backgrounds[bkg]["NominalFixed"], 1.0)
analysiscuts = ROOT.TCut(cuts)
dataloader.SetBackgroundWeightExpression(AnalysisWeights)
dataloader.SetSignalWeightExpression(AnalysisWeights)
h2D = dataloader.GetCorrelationMatrix("Signal")
print("-----------------------------")
print(h2D)
print("-----------------------------")
from PlotMaker import PlotMaker
savepath = Driver.savepath
pltm = PlotMaker()
pltm.setSavePath(savepath+"/"+loaded_samples.name+"/2D/")
pltm.setupPlot("signal")
pltm.drawTopPad(lambda x: h2D.Draw("colz text"))
pltm.update()
pltm.saveAs([".pdf"])
outputFile.Close()

If anybody has any advice, I would be very grateful.

I've attached my full code: TMVAhelp.py (9.6 KB)


I guess @moneta or @kialbert can help you.

Hi, this could be a sign that the dataset has not been generated yet. The dataset is generated lazily, on request, but apparently not all code paths trigger it. You therefore need to trigger the generation explicitly before requesting the correlation matrix.

Add these lines before your call to GetCorrelationMatrix (d is the DataLoader object, i.e. dataloader in your script):

d.PrepareTrainingAndTestTree(ROOT.TCut("<global sig cut>"),
                             ROOT.TCut("<global bkg cut>"),
                             "<additional options (can be empty)>")
d.GetDataSetInfo().GetDataSet()

The first of these two statements defines the splitting into training and test sets (see the TMVA tutorial for more info). The second statement then triggers the generation of the dataset.
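In case it helps, here is a minimal sketch of how the two statements and the GetCorrelationMatrix call could be wrapped together. The helper name get_correlation_matrix and its arguments are hypothetical (they are not part of TMVA); it only uses the three DataLoader calls already mentioned in this thread. It also assumes that a null pointer evaluates as false in PyROOT, which is the usual behaviour, so the failure shows up as an exception rather than a crash later in Draw.

```python
def get_correlation_matrix(dataloader, sig_cut, bkg_cut,
                           split_options="", class_name="Signal"):
    """Hypothetical helper: generate the dataset, then fetch the matrix.

    dataloader is an already configured TMVA.DataLoader (variables, trees
    and weight expressions set); sig_cut and bkg_cut are ROOT.TCut objects.
    """
    # Step 1: define the splitting into training and test sets.
    dataloader.PrepareTrainingAndTestTree(sig_cut, bkg_cut, split_options)
    # Step 2: explicitly trigger the (otherwise lazy) dataset generation.
    dataloader.GetDataSetInfo().GetDataSet()
    # Step 3: request the matrix; assumption: a null pointer is falsy.
    h2d = dataloader.GetCorrelationMatrix(class_name)
    if not h2d:
        raise RuntimeError(
            "GetCorrelationMatrix(%r) returned a null pointer" % class_name)
    return h2d
```

With the script above, the call could look like h2D = get_correlation_matrix(dataloader, analysiscuts, analysiscuts), assuming the same cut is meant to apply to both signal and background.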

Cheers,
Kim