RooFit External/Internal Constraints Logic

Hello all,

Following this tutorial here: ROOT: tutorials/roofit/rf604_constraints.C Source File

It looks like we can define constraints in two ways, an external constraint when fitting:

RooFitResult* r3 = model.fitTo(*d,ExternalConstraints(fconstext),Save()) ;

Or an internal constraint multiplying the pdf and then providing Constraint to fitTo method.

RooFitResult* r2 = modelc.fitTo(*d,Constrain(f),Save()) ;

My question is, does this work when trying to fold in the systematic uncertainties to the model and
what does providing Constraint do if we have a separate Gaussian PDF defined already and we multiply it with the model pdf.

If I define my gaussian to include systematics like so:

### Define gaussian constraint used to include systematics on yields 
w.factory("Gaussian::norm_constant_constraint(1, norm_constant[1, 0, 2], norm_constant_sigma[0.1]), ") # Sigma here will be fixed to syst uncertainties 
w.var("norm_constant_sigma").setVal(pcmodel[args.pc]['systematic'])
w.var("norm_constant_sigma").setConstant()

And then my fit model like:

w.Import(ROOT.RooRealVar("frac", "frac", 0, 0, 0.1)) # signal yield variable
w = buildmodel(w = w, components = [f"SUM::cmodel(bkg,frac*signal)"])
w = buildmodel(w = w, components = [f"PROD::model(cmodel, norm_constant_constraint)"]  )        
w.pdf("model").fitTo(w.data("data"), ROOT.RooFit.Constrain(ROOT.RooArgSet(w.var("norm_constant"))), ROOT.RooFit.AsymptoticError(ROOT.kTRUE), ROOT.RooFit.Save())

Am I doing it correctly? Also, I don’t understand why I need to call Constrain method again if I already am multiplying the PDFs.

This constraint should introduce extra uncertainty in the full model (and unlike other situations it shouldn’t be a function of any of the existing parameters but an extra nuisance parameter)

So my logic is I modify the likelihood itself by some term which is like a factor accounting for the combined systematic uncertainties.

Hi @Mindaugas_Sarpis,

What you do is correct! It’s equivalent to add your constraint as a factor to any RooProdPdf/PROD, or to pass it as ExternalConstraints().

You also don’t need the Constrain() argument, it is redundant and I would even like to discourage people from using it in the documentation at some point. Any parameters for which there is a pdf that doesn’t depend on the observed dataset is by definition a constrained parameter.

The only thing you need to still consider in addition to what you do is to pass the GlobalObservables() command argument. It represents the set of observables to normalize over when evaluating your constraint PDFs. By default, this normalization set are the constrained parameters, which are figured out automatically as I said before. For your Gaussian constraint it usually doesn’t matter what you normalize over, since the integral over x or mu is the same, but in your case it actually makes a difference! But in general, e.g. for Poisson/Gamma constraints, it can make a difference. So it’s a good habit to explicitly define the global observables, like it’s done also internally in HistFactory for example. In your case, it could look like this:

Gaussian::norm_constant_constraint(norm_constant_nominal[1,0,2],
                                   norm_constant[1, 0, 2],
                                   norm_constant_sigma[0.1]))

norm_constant_nominal needs to be set constant, and then you pass it together with the other eventual global observables to GlobalObservables() in fitTo().

Here is another nice forum post that explains what “global observables” mean:

Cheers,
Jonas

PS: I wrote more about this topic also in this forum post:

Sounds good! Thank you for the explanation.

A follow up:

If I define my external constraint as RooProdPDF but I want to add a constraint not
on the entire pdf but on a fit fraction f in my case. Is it correct to multiply the fit fraction
by the floating mean from this constraint I introduced?

Like so:

        w.Import(ROOT.RooRealVar("frac", "frac", 0, 0, 0.1)) # signal yield variable
        w.Import(ROOT.RooFormulaVar("pcfrac", "frac*norm_constant", ROOT.RooArgList(w.var("frac"), w.var("norm_constant"))))
        w = buildmodel(w = w, components = [f"SUM::noconstmodel(bkg,pcfrac*signal)"])
        w = buildmodel(w = w, components = [f"PROD::model(noconstmodel, norm_constant_constraint)"]  )        
        w.pdf("model").fitTo(w.data("data"), ROOT.RooFit.Constrain(ROOT.RooArgSet(w.var("norm_constant"))), ROOT.RooFit.AsymptoticError(ROOT.kTRUE), ROOT.RooFit.Save())

Well, in general you can apply the constraint to any parameter!

But be careful there, because in your snipped you are having two degenerate parameters frac and norm_constant, of which only norm_constant is constrained. So your constrant has no effect here, because there is another floating parameter frac that can model the same scaling but doesn’t suffer the penalty from a constraint.

If frac and norm_constant are also appearing in different parts of a model in different ways such that they are not 100 % correlated, it would make sense though!

Cheers,
Jonas

I was testing this by changing the sigma parameter on the gaussian which has this norm_constant as a floating mean. I get what you are saying that yield itself (before it’s multiplied by the constant) is floating. But this is the result I get by applying this. I increase systematic uncertainty (sigma) from 0 to 1 (so up to a 100%) and I get what I think is expected behavior.

Likelihood_ECCpi0_Pc4440_kde+voigt_yield.pdf (20.2 KB)

Ah, also, frac was supposed to be unconstrained. But the value returned by the fit will be different depending on whether this norm_constant is present or not. So yes in fact they are correlated but
this is exactly how I thought statistical uncertainty would work. If the freely floating norm_constant
increases, I fit a smaller value for frac if it decreases - larger.

Yes, nice, if you are interested about the uncertainties then the correlations are taken into account and all is good!

I should have been more precise, what I wanted to warn about is that since you have degenerate parameters and there is only the constraint to “break the symmetry”, the nominal values of your results will not necessarily be what you are expecting in the usual case. For example if you look at the pulls for norm_constant, it will not be distributed according to the usual Gaussian but just a delta function, because due to the degeneracy the minimization can always bring the norm_constant to the nominal value of 1.0. But probably I was overthinking this, sorry for the unnecessary worry :smiley:

Cheers,
Jonas

Another follow-up:

I have a model which mathematically looks like this (in RooFit we only have one f_{sig} and the background fraction is already deduced from that. \textcolor{red}{\mu_\sigma} is a nuisance
parameter I introduced which carries the uncertainties described before. But these uncertainties are computed on the f_{sig} fit fraction. I want to set upper limits on this but also on \mathcal{R} which is basically taking f_{sig} and multiplying it by branching fraction of a separately measured result. And I want to account for uncertainties on this branching fraction. If I add another constraint and another factor (this factor would multiply the entire PDF, not just f_{sig}) wouldn’t it just scale my pdf and be normalized out? Also, If I say that my total number of events can change with some uncertainty how do I address this in CLs toy generation step? (it’s probably not ok using the same fixed number of candidates per toy which I have in my data if this is varied with some uncertainty).

\mathcal{L}(\theta) = \displaystyle{\prod_{i=1}^n} ([1-\textcolor{red}{\mu_\sigma}\cdot f_{sig}] \cdot \mathcal{P}_{BKG_{KDE}} + \textcolor{red}{\mu_\sigma} \cdot f_{sig} \cdot \mathcal{P}_{Voigt}(\xi ; \mu,\Gamma,\sigma_{res}) ) \cdot \mathcal{G}(1;\textcolor{red}{\mu_\sigma},\sigma_{syst})

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.