Upper Limit calculation

avenkate · June 12, 2019, 9:04pm

Dear experts,

I am attempting to use RooStats to set an upper limit on a parameter, R (which is a ratio of efficiency corrected yields).

I have built a fit model using RooFit, and the fit is parametrized in terms of R.

I have expressed my systematic uncertainties in the form of Gaussian constraints on parameters. For example, an efficiency that has been evaluated from MC gets a Gaussian constraint.

I am very confused as to which calculator I should use to find the upper limit. There are so many options like the Asymptotic Calculator, the Feldman Cousins calculator, Profile Likelihood Calculator etc. Further, different calculators give me very different answers.

There are also lots of choices for the test statistic and I am not sure which one I should use to set an upper limit. I find myself not knowing much about the statistics underlying these choices.

I am concerned about whether these calculators properly take into account the systematic uncertainties in my analysis. Naively, I would expect that the upper limit result would increase if I increase the systematics by hand, but I find that to not be the case when I try to use the asymptotic calculator.

I have tried to use the source code of these different calculators to work out what is happening behind the hood, but I always end up getting lost.

Here are my questions:

How does one choose a calculator and test statistic?
What is the proper way to inform the calculator of my systematic uncertainties?

I’m afraid that I am currently unable to provide a reproducible form of my code because it is very long and messy, needing input from multiple large ROOT files.

I would appreciate any advice regarding upper limit calculation using RooStats, even if it isn’t directly related to what I have asked. I have looked through a lot of tutorials and slides to find my answers, but I am still confused.

Thank you,
Arvind.

StephanH · June 13, 2019, 8:36am

Hi @avenkate,

Calculators

Here is some guidance:

ProfileLikelihoodCalculator
This calculator is the most simple thing you can do to retrieve the (1,2,3,…) sigma uncertainties. It scans the profile likelihood like MINOS in MINUIT does. If you are just interested in 1-sigma uncertainties, you can simply run MINOS.
HypoTestInverter
You need this for limits.
HypoTestInverter + AsymptoticCalculator
If you want limits, with CLs prescription or not, this combination is the best default.
HypoTestInverter + FrequentistCalculator:
If you have low statistics, the AsymptoticCalculator won’t work. As the name says, it’s asymptotically correct. In case of low statistics, you therefore have to fall back to the much slower FrequentistCalculator.
FeldmanCousins for intervals:
If you want intervals instead of limits, FeldmanCousins is the best bet.
FeldmanCousins for two parameters.
In this case, the AsymptoticCalculator doesn’t work either, as its formulae only cover the one-POI case.
BayesianCalculator for Bayesian credible intervals

TestStatistic

I think the profile likelihood is the only reasonable choice. What other test statistics do you have in mind?

Taking into account systematic uncertainties

In principle, these calculators are agnostic of systematic uncertainties. These have to be implemented in your likelihood by adding parameters that influence the likelihood model. The calculators will see this as changes of the likelihood. Further,

If you constrain the nuisance parameters, you are actually reducing the systematic uncertainties (because you tell the model that the parameter can only be in a specific range).
What makes you think that the systematic uncertainties are missing?
How did you implement the constraints?
You can actually test for the impact of systematics in a simple way. Set the nuisance parameters constant and rerun. You will get more optimistic limits. You can also make the constraints more aggressive (optimistic limits) or relax them (more pessimistic) if you want to play a bit.

I think the only thing you need to do is to tell the calculators about

Parameter of interest
Nuisance parameters
Which observables are global observables. These are the ones you use for the constraints such as obs_alpha1 in
Gauss( obs_alpha1 | alpha1, sigma1),
where alpha1 is the nuisance parameter that changes something in the likelihood, while obs_alpha1 is found nowhere in the likelihood but in the constraint term.

avenkate · June 13, 2019, 12:48pm

HI @StephanH,

Thanks a ton for your reply!.

When I artificially increase the systematic uncertainty (by increasing the width of my Gaussian Constraint), I naively expect the upper limit to increase, but I don’t see that happening with Asymptotic Calculator, which is the only calculator I’ve tried, apart from ProfileLikelihood. The other calculators are taking WAY too long to run

Say I want to constrain an efficiency, for which I have external input from MC, with a central value and an error.

w.factory(Form(“Eff[%f,0.,1.]”,eff_MC));
w.factory(Form(“Gaussian::eff_constraint(gEff[%f,0,1],Eff,%f)”,eff_MC,eff_Err_MC));
w.var(“gEff”)->setConstant();
w.extendSet(“nuisParams”,“Eff”); //add Eff to nuisance parameters
w.extendSet(“globObs”,“gEff”); //add gEff to global observables

The above lines of code show how I implement the constraint. Am I correct in setting the global observable to be constant? I picked that up from a ROOT tutorial code, but am unsure of why the global observable needs to be constant.

I did try this, turning off all systematics. But Asymptotic Calculator did not like that. It gave me errors like

ERROR:Generation – AsymptoticCalculator::MakeAsimovData:constraint term eff_constraint has multiple floating params - cannot generate - skip it

And my upper limit only reduced by a very small amount. I expected my systematic uncertainties to inflate my upper limits more that Asymptotic Calculator would have me believe. This is my biggest cause of concern now. I’m not sure if that upper limit calculation is properly taking the systematics into account.

I would appreciate any pointers on this.

Thanks again,
Arvind.

StephanH · June 13, 2019, 1:46pm

It could be that the systematic uncertainties can be completely profiled away. In this case, the constraint doesn’t do anything.
The other calculators are taking so long because they do actually compute the distribution of the test statistic, instead of assuming that its distribution is (asymptotically) known.

avenkate:

w.factory(Form(“Eff[%f,0.,1.]”,eff_MC));
w.factory(Form(“Gaussian::eff_constraint(gEff[%f,0,1],Eff,%f)”,eff_MC,eff_Err_MC));
w.var(“gEff”)->setConstant();
w.extendSet(“nuisParams”,“Eff”); //add Eff to nuisance parameters
w.extendSet(“globObs”,“gEff”); //add gEff to global observables

A quick tip: enclose code into three ‘`’, so it gets formatted as in the quote above. That makes it a bit easier to read.
Try to dump one of the constraint terms. Retrieve eff_constraint from the workspace and do eff_constraint.Print("t"). The global observable needs to be constant because you don’t want to the fitter to touch it.
However, you also have to register eff_MC as the global observable. It needs to be clear that this is the “target value” for the constraint.
More interesting is the question whether you are actually using the constraints. Are you
– either multiplying the likelihood and the constraints, and using Constrain(eff) when fitting
– or using ExternalConstraints(eff_constraint)?
If you don’t do either, they will simply be ignored.
Have a look at ROOT: tutorials/roofit/rf604_constraints.C File Reference
Further, are you registering eff_MC as a global observable?

Hmm, this might be caused also by not setting the global observable.

avenkate · June 13, 2019, 2:40pm

Hi @StephanH,

Thank you so much for your reply.

This is the output I get when I print out the constraint. The first RooConstVar is the Gaussian sigma.

  0x943eb2d0 RooGaussian::eff_constraint = 1 [Auto,Dirty]
  0x94410400/V- RooConstVar::0.026501 = 0.026501
  0x9445b6f0/V- RooRealVar::gEff = 0.13016
  0x94459eb0/V- RooRealVar::Eff = 0.13016

I think you’re saying that Eff should be a global observable too, right? eff_MC is simply a Double_t that I use to hold the number. Eff is the RooRealVar. And no, I didn’t have Eff in my global observables, only gEff. When I run again including both Eff and gEff in my global observables, Asymptotic Calculator complains

ERROR:Generation -- AsymptoticCalculator::MakeAsimovData: constraint term  eff_constraint has multiple global observables -cannot generate - skip it

But my upper limit result is now slightly lower.

I do use the constraint, but not in the manner you have suggested. Rather, I multiply my unconstrained fit model with the constraint pdf’s like

w.factory("PROD::model_const(model,eff_constraint)");

I then use model_const to perform the fit, and set this as the pdf in the ModelConfig.

I hope I’ve understood you correctly by including Eff in the global observables and not eff_MC? Asymptotic calculator doesn’t seem to like having more than one global observable for a constraint…

Thank you,
Arvind.

StephanH · June 13, 2019, 3:30pm

The constraint term looks correct. Would you just also “.Print("")” gEff and Eff to check that their ranges and const-ness are configured correctly? Eff should be allowed to vary while gEff should be constant.

The parameter of the likelihood model should not be a global observable. An observable is something that has been measured using data or MC (the target efficiency), and a parameter is something that changes the likelihood model. Therefore, gEff = globObs, Eff = parameter.
It is important to not mix up those two because the fitter touches the parameters, but it leaves the global observables untouched.

avenkate:

I do use the constraint, but not in the manner you have suggested. Rather, I multiply my unconstrained fit model with the constraint pdf’s like
w.factory("PROD::model_const(model,eff_constraint)");
I then use model_const to perform the fit, and set this as the pdf in the ModelConfig.

That’s the problem. It’s correct to multiply the likelihood with the constraint terms, but you also need to tell RooFit to actually apply the constraints. Check the tutorial I posted above to see this working.
[If you use a ModelConfig (I’m not sure you do) and set global observables and parameters correctly, the constraints will be switched on automatically.]

Probably that’s not correct. It’s a bit hard to decode what’s what from the factory syntax, but I guess it’s this:

eff_MC is a double. RooFit cannot interpret doubles
gEff is global observable that should be constant, and is has the value of the monte carlo efficiency
Eff is a parameter that changes the likelihood model, and it is constrained to be close to the value of gEff by means of the constraint term.

avenkate · June 13, 2019, 4:21pm

Hello @StephanH,

0x93eb9c00 RooGaussian::eff_constraint = 1 [Auto,Dirty]
  0x93ea7c70/V- RooConstVar::0.026501 = 0.026501
  0x93ef55f0/V- RooRealVar::gEff = 0.13016
  0x93ef3f60/V- RooRealVar::Eff = 0.13016
RooRealVar::gEff = 0.13016 C  L(0 - 1)
RooRealVar::Eff = 0.13016  L(0 - 1)

OK. So I think this is what I was doing initially. Only gEff is included in the global observables, not Eff or eff_MC.

OK. So now I include the Constrain() argument in my fitTo command, where I include all the parameters that should be constrained. My fit result doesn’t seem to change when I do this, and my Upper Limit from Asymptotic Calculator is also the same as before. And I do use a ModelConfig that I create after the fit

RooStats::ModelConfig mc("ModelConfig",&w);
mc.SetPdf(*(w.pdf("simPdf")));
mc.SetParametersOfInterest(*w.set("poi"));
mc.SetObservables(*w.set("obs"));
mc.SetGlobalObservables(*w.set("globObs"));
mc.SetNuisanceParameters(*w.set("nuisParams"));
// import model in the workspace
w.import(mc);

When I run Asymptotic Calculator, I feed it my workspace which contains the ModelConfig. And it tells me that it is including constraint terms in the minimization

[#1] INFO:Minization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#1] INFO:Minization -- createNLL picked up cached consraints from workspace with 14 entries
[#1] INFO:Minization --  Including the following contraint terms in minimization: (eff_constraint,....)

Can I conclude based on this that my systematics are being taken into account in the Upper Limit calculation by Asymptotic Calculator?

I can’t thank you enough for your help.
Arvind.

PS: My case is definitely one of low statistics. I will explore using the Frequentist Calculator as well.

StephanH · June 14, 2019, 7:03am

I guess you can conclude that. Let’s make a simple test:
Force the constraint to something unreasonable, a MC efficiency of e.g. 1%. This should destroy the fit, and completely change the limits.
If that works, you are good to go.

avenkate · June 14, 2019, 3:25pm

Yes, I see the Upper limit respond to change in the constraint on the efficiency. I guess we can close the issue.

Thanks for all your help!,
Arvind.

system · June 28, 2019, 3:25pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.