RooFit createNLL gives result that is too small

Dear All,

I am performing a simultaneous fit for the signal strength mu. When I fit using the NLL from RooAbsPdf::createNLL(), the fit often blows up, with mu becoming very large and negative. An example is shown in the pre-fit and post-fit plots below, and there is a standalone example at the bottom. Could anyone help explain what might cause this? Thanks very much.

Before the fit:
slice-vbfLoose-prefit

After the fit:
slice-vbfLoose-postfit

I have attached a standalone workspace containing the PDF and RooDataHist that I used to make these plots.

Root file with workspace:

workspace.root (35.9 KB)

And a standalone script similar to how I am creating the NLL:

from __future__ import division
import sys
sys.argv.append('-b')  # run ROOT in batch mode
import ROOT
from ROOT import TFile, RooFit

################################################################################
# standalone script for nll fit
################################################################################


# workspace
f = TFile.Open("workspace.root")
w = f.Get("w") # workspace
x = w.var("x")   # main variable (stored inside the workspace)
mu = w.var("mu") # floating signal strength (stored inside the workspace)

# get rdh and pdf from workspace
# this is a RooDataHist with ONE "category" entry for simplicity
rdh = w.obj("simRdh")
# this is a RooSimultaneous with ONE category, and has ONE extended pdf added
pdf = w.obj("simPdf")

# create nll
nll = pdf.createNLL(rdh,RooFit.NumCPU(4))
ROOT.RooMinuit(nll).migrad()

Without knowing what your PDF actually is, it’s hard to tell whether what RooFit does makes sense or not.

Hi Giaccaria,

Thank you for the reply. The PDF is in the attached workspace and can be viewed in a ROOT session via:

w->obj("simPdf")->Print()
RooSimultaneous::simPdf[ indexCat=channelName nonCentralHighPt=addPdfNonCentralHighPt vbfTight=addPdfVbfTight vbfLoose=addPdfVbfLoose CentralLowPt=addPdfCentralLowPt CentralMidPt=addPdfCentralMidPt nonCentralLowPt=addPdfNonCentralLowPt CentralHighPt=addPdfCentralHighPt nonCentralMidPt=addPdfNonCentralMidPt ]

Each of the index PDFs is a sum of mu*signal + background components, something like:

w->obj("addPdfNonCentralHighPt")->Print()
RooAddPdf::addPdfNonCentralHighPt[ nonCentralHighPt_nBkg * bkgPdfNonCentralHighPt + nSigCoefNonCentralHighPt * sigPdfNonCentralHighPt ] = 0.00846833

The signal model is a Crystal Ball plus a Gaussian; the background model is a Voigtian plus an exponential times a power law. The part of the code where I make the models is here:

            # signal model
            self.addStr("BW_mean_sig{0}[125,121,129]".format(firstUpper(chan)))
            self.addStr("BW_mean_sig_gaus{0}[125,123,127]".format(firstUpper(chan)))
            self.addStr("RooCBShape::sigCB{0}(x, BW_mean_sig{0}, CB_sigma_sig{0}[2,1,4], CB_a_sig{0}[1.75,1,2], CB_n_sig{0}[1.])".format(firstUpper(chan)))
            self.addStr("RooGaussian::sigGF{0}(x, BW_mean_sig_gaus{0}, BW_sigma_sig{0}[5,1,6])".format(firstUpper(chan)))
            self.addStr("SUM::{0}Pdf(sigCB{1}, frac_sig{1}[0.3,0, 0.4]* sigGF{1})".format(sigPdfName,firstUpper(chan)))
            # extend and fix range
            sigPdf = RooExtendPdf(sigPdfName,sigPdfName,self.get(sigPdfName+"Pdf"),self.get(chan+"_nSig"),"bkgFitAll")
            sigPdf.fixAddCoefNormalization(RooArgSet(x),True)
            sigPdf.fixAddCoefRange("fullRange",True)
            # background model
            self.addStr("RooVoigtian::BW_bg{0} (x, mBW{0}[91.2], ZWidth{0}[2.49], sigma{0}[2])".format(firstUpper(chan)))
            self.addStr("EXPR::exp_bg{0}('exp(a2_bg{0}*(x/1))*(1./pow((x/1), a3_bg{0}))',   x, a2_bg{0}[0,-1,1],a3_bg{0}[2,0,25])".format(firstUpper(chan)))
            self.addStr("SUM::{0}Pdf( BW_bg{1} , frac_bg{1}[0.1,0,1] * exp_bg{1} )".format(bkgPdfName,firstUpper(chan)))
            # extend and fix range
            bkgPdf = RooExtendPdf(bkgPdfName,bkgPdfName,self.get(bkgPdfName+"Pdf"),self.get(chan+"_nBkg"),"bkgFitAll")
            bkgPdf.fixAddCoefNormalization(RooArgSet(x),True)
            bkgPdf.fixAddCoefRange("fullRange",True)

This issue is still there. If anyone has an idea what causes it, any input would be helpful.
Thanks

Hi @aaronsw,

it’s hard to say after having a first look. To better understand the PDF, I used
simPDF->graphVizTree("graph.dot")
giving this:

Is this the PDF you wanted?

Since it’s a simPDF, I would suggest taking out categories and fitting them one by one. You might find that one of the categories has a bizarre feature that pulls mu down, completely dominating the fits in the other categories.

Hi @StephanH,

Thanks very much for the reply. Yes, this is the PDF that I want to fit.

The example of the fits from the top post are actually made with a single category. It’s set up in the simPDF, but there’s only one addPdf and category, so all the data used in the fit is visible in those plots.

After having a second look, I wonder if the PDF does the desired thing: Isn’t there a degeneracy between mu and vbfLoose_nSig?
You extend the signal PDF with nSig, but you also multiply with the product mu * nSig, yielding mu * nSig^2. That should create a degeneracy because the coefficients are 100% anti-correlated.
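
To make the degeneracy concrete, here is a toy sketch in plain Python (not RooFit code; the single counting bin and all numbers are made up for illustration). It assumes the signal yield enters the extended likelihood as mu * nSig^2, as described above, and shows that very different (mu, nSig) pairs give exactly the same NLL, so the minimiser has no preferred point along that direction.

```python
import math

def extended_nll(n_obs, mu, n_sig, n_bkg):
    """Poisson (extended) NLL term for one counting bin, up to a constant.

    Assumption for this toy: the signal yield is mu * n_sig**2, so mu and
    n_sig enter the likelihood only through that single combination.
    """
    expected = mu * n_sig ** 2 + n_bkg
    return expected - n_obs * math.log(expected)

n_obs, n_bkg = 130, 100.0  # hypothetical toy numbers

# Four different (mu, n_sig) pairs that all give mu * n_sig**2 = 36
pairs = [(1.0, 6.0), (4.0, 3.0), (0.25, 12.0), (9.0, 2.0)]
nll_values = [extended_nll(n_obs, mu, n_sig, n_bkg) for mu, n_sig in pairs]

for (mu, n_sig), nll in zip(pairs, nll_values):
    print(f"mu={mu:5.2f}  n_sig={n_sig:5.1f}  NLL={nll:.6f}")
```

Every line prints the same NLL value, which is the flat direction the fit can run along.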

What about doing the following:
Don’t extend the PDFs because the RooAddPdf can do that for you. See e.g.
https://root.cern.ch/doc/master/classRooAddPdf.html#acafcca576f7839c046bea7c9edf31c22

That is, replace the extended PDFs in the third level of the tree by the RooAddPDFs from the fourth, and put the coefficients directly into the constructor as shown in the documentation.

Next, reparametrise the coefficient of the signal PDF as
c = mu * N_sig
and set N_sig constant. In this way, the fitter actually measures mu. Alternatively, measure N_sig directly.
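
As a rough illustration of why fixing N_sig helps, here is a toy scan in plain Python (again not RooFit; one counting bin, with made-up values for N_SIG, N_BKG and the observed count). It assumes the signal coefficient is the expected yield mu * N_SIG with N_SIG held constant, and the background yield is also fixed to keep the sketch short. With only mu free, the NLL has a single well-defined minimum.

```python
import math

N_SIG = 50.0   # expected signal yield, held constant (hypothetical value)
N_BKG = 400.0  # background yield, fixed here for simplicity (hypothetical)

def nll(mu, n_obs):
    """Extended-likelihood term for one bin with yield mu * N_SIG + N_BKG."""
    expected = mu * N_SIG + N_BKG
    if expected <= 0:
        return float("inf")  # non-physical total yield
    return expected - n_obs * math.log(expected)

n_obs = 475  # toy observation: an excess of 75 events over N_BKG

# Scan mu from -2.00 to 5.00 in steps of 0.01 and pick the minimum
grid = [i / 100 for i in range(-200, 501)]
mu_hat = min(grid, key=lambda mu: nll(mu, n_obs))
print(f"best-fit mu = {mu_hat:.2f}")
```

The scan lands at mu = 1.50, i.e. the excess of 75 events divided by the fixed N_SIG of 50. In a real fit the minimiser would replace the grid scan, but the point is the same: with N_sig constant, mu is the only handle on the signal yield.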
