# ProfileLikelihoodCalculator::GetHypoTest -> Wrong Significance?

Hi all,

I have simplified a hypothesis-testing problem to a very simple case, which I'm running in ROOT v5.30.06 with RooFit/RooStats. The attached file runs my example, which is:

Problem: There is a known background of b=noffObs events, and nonObs=s+b events are observed.
Test the null hypothesis s=0 against the alternative hypothesis s>0 using the profile likelihood ratio.
-> One parameter (s)
-> Likelihood function for the null hypothesis: L0=TMath::Poisson(nonObs,b)
-> Likelihood function for the alternative hypothesis: L1=TMath::Poisson(nonObs,b+s), which is maximal at s=nonObs-noffObs.
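
For reference, the likelihood ratio for this model has a closed form. The derivation below is only a sketch under the assumptions stated above (a single Poisson count, b known exactly). The symbols n (for nonObs), \hat{s}, \lambda and Z are notation introduced here, and Z = sqrt(-2 ln \lambda) is the usual asymptotic approximation, not something taken from the RooStats source.

```latex
\begin{align}
  L(s) &= \frac{(s+b)^{n}\, e^{-(s+b)}}{n!}, \qquad \hat{s} = n - b, \\
  -2\ln\lambda(0) &= -2\ln\frac{L(0)}{L(\hat{s})}
                   = 2\left[\, n\ln\frac{n}{b} - (n - b) \right], \\
  Z &\approx \sqrt{-2\ln\lambda(0)} \qquad (n > b).
\end{align}
```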

For the concrete example of nonObs=150. and noffObs=100. this means
-> Null hypothesis: -log(L0)=-log(TMath::Poisson(150.,100.)) ~ 14.24
-> Alternative hypothesis: -log(L1)=-log(TMath::Poisson(150.,150.)) ~ 3.42

The problem is that the GetHypoTest() function (implemented in ProfileLikelihoodCalculator.cxx) calculates the following values:

Variables in ProfileLikelihoodCalculator::GetHypoTest():
Double_t NLLatMLE = fFitResult->minNll();
This variable should hold -log(L1), since it is the negative log-likelihood at the best fit to the data with all variables floating.
However, what I get is ~18.56 instead of the expected 3.42.

For the variable
Double_t NLLatCondMLE = fit2->minNll();
which holds -log(L0) for the null hypothesis,
I get 40.4 in GetHypoTest() instead of the expected 14.24.

What is going on here? Am I misunderstanding the fit->minNll() result (perhaps due to some normalization)? Or am I running my test program incorrectly?
Thanks a lot!
rooStatTest.C (3.85 KB)

Hi again,

I produced a significance distribution for the problem stated above (i.e. a known background b and a Poisson-distributed number of measured events with mean b).

I compared my simple model for the likelihood ratio (explained above and in the attached script used to produce the significance distribution) with the output of ProfileLikelihoodCalculator::GetHypoTest()->Significance(). The comparison shows:

The width of the significance distribution from the RooStats routine is not compatible with a standard normal distribution, in contrast to my simple model. However, as far as I understand, the model generated in RooStats and its output should be exactly the same as my analytically stated likelihood ratio model (up to possible numerical issues, which cannot account for such large discrepancies).

I can see two possible reasons for this:

1. I'm using RooStats incorrectly: it would help enormously if an expert could check my RooStats implementation of the model for the problem stated.

2. The GetHypoTest() function of the ProfileLikelihoodCalculator is not using the right variables for the extrema of the (log-)likelihood functions.

Maybe some expert could have a look at this problem …
SignifDistribution.C (7.45 KB)

Hi,

There are a couple of problems in your code which cause the effect you are seeing.
First of all, you should not add "b" to the data set. It will be considered an observable, and it will be normalized in the likelihood. So, when creating the data set to pass to the ProfileLikelihoodCalculator, just do:

```
RooArgSet* ArgSet = new RooArgSet("args");

RooDataSet *data = new RooDataSet("modelData", "modelData", *ArgSet);
```

(I have commented out the line using b, which you should remove)

The second problem, which appears when running your second macro, is in the fitting. In your second example (the distribution of significances), there can be cases where the fit in the ProfileLikelihoodCalculator gives wrong results for very small s. This happens because you are starting with a value of s that is too large and too far away from the minimum. By default, the starting value is half of the RooRealVar range (e.g. 200 in your case).
Before running the ProfileLikelihoodCalculator, I would do something like:

```
RooRealVar *xs=wspace->var("s");
xs->setVal(nonObs-noffObs);
```

or set it to some other value closer to the minimum.

After these changes your macro works fine for me. I have attached it.

Best Regards

Lorenzo
SignifDistribution.C (7.47 KB)