# RooStats expected limit

Hi there,

I hope someone more experienced than I am knows the answer to this. I couldn’t find anything helpful in the tutorials so far.

I am currently setting limits with unfolded data compared to MC at truth level. This means that all my systematic variations are applied to the observables, which also carry a non-Poissonian statistical uncertainty. The parameter of interest, on the other hand, scales the expected distribution. Thus, both the observed and the expected values are non-constant.

Unfortunately, the procedure of calculating the expected limit seems to require one of the parameters in a given PDF to be constant, at least for the AsymptoticCalculator. If that is not the case, the process fails with “AsymptoticCalculator::SetObsExpected( … ) : Has two non-const arguments” as defined in

https://root.cern.ch/doc/master/AsymptoticCalculator_8cxx_source.html#l00952

Do I somehow have to treat my unfolded data as constant, although it is affected by systematics and has some statistical uncertainty?

Falk

Okay, I have now solved this problem by moving my systematic variations onto the number of expected events. As long as I am using a Gaussian, this is not a problem.

However, for a Gaussian the same error occurs if both the number of expected events and sigma are non-constant. How are you supposed to use a Gaussian if both the number of expected events and sigma depend on the signal strength, as should usually be the case?

Cheers,
Falk

Hi,

I don’t understand what you mean by having systematic variations in the observables.
The problem you are having with the AsymptoticCalculator is that, in order to compute the Asimov data set, it is defined to work only for certain types of PDF. This applies to models where the systematics are expressed as Gaussian, Poisson or log-normal constraint functions. For these constraints it is clear that there should be only one non-const parameter. For example, in a Gaussian the sigma expressing the constraint should be fixed, otherwise we cannot generate the corresponding Asimov data set.
Usually this is the case. If it is not for you, can you please attach your model, so I can understand it better?
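
As a rough illustration of the rule above, here is a pure-Python sketch of the kind of check that leads to the "two non-const arguments" error. It is not the ROOT implementation: the `Arg` class and the `set_obs_to_expected` function are hypothetical stand-ins, and the logic (one observable, exactly one floating argument, observable set to that argument's value) is a simplified reading of the behaviour described in this thread.

```python
from dataclasses import dataclass


@dataclass
class Arg:
    """Hypothetical stand-in for a pdf argument (not a ROOT class)."""
    name: str
    value: float
    constant: bool = False


def set_obs_to_expected(args, observables):
    """Set the single observable to the value of the single floating argument.

    Raises ValueError if there is more than one observable or more than one
    non-constant, non-observable argument, because the Asimov value of the
    observable would then be ambiguous.
    """
    obs = None
    expected = None
    for arg in args:
        if arg.name in observables:
            if obs is not None:
                raise ValueError("has two observables")
            obs = arg
        elif not arg.constant:
            if expected is not None:
                raise ValueError("has two non-const arguments")
            expected = arg
    if obs is None or expected is None:
        raise ValueError("missing observable or expected value")
    obs.value = expected.value
    return obs.value


# Gaussian constraint with a fixed sigma: the Asimov value is unambiguous.
glob = Arg("glob_b1", 0.0, constant=True)
b1 = Arg("b1", 0.3)
width = Arg("sigma_b1", 1.0, constant=True)
asimov_value = set_obs_to_expected([glob, b1, width], {"glob_b1"})

# Floating mean and floating sigma: the check cannot decide and fails.
try:
    set_obs_to_expected(
        [Arg("nobs", 10.0, constant=True), Arg("nexp", 12.0), Arg("sigma", 3.0)],
        {"nobs"},
    )
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

In the first call the global observable is moved to the current value of `b1`; in the second call both `nexp` and `sigma` float, which is the situation described in the question.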

Best Regards

Lorenzo

Hi Lorenzo,

Let me upload my workspace (name “w”) here:
workspace_mH=500.root (36.4 KB)

It contains observations in several bins (here called bin1 to bin6) which have been unfolded and are now compared to MC truth. This means the experimental systematics (all called NUI_… in my workspace) are on my data, not on my expected events. Fortunately I am using Gaussians for p(obs|exp(µ)), so it does not matter whether I put them on the obs or the mean side. A Poissonian would not represent my likelihood, since obs was corrected for efficiencies and bin migrations in the unfolding, so I am using a Gaussian. But the total sigma depends on both my observed statistical uncertainty and my MC statistical uncertainty, and if I want to add them, the sum depends on my parameter of interest. In most cases I have large MC statistics, so the MC uncertainty is negligible, but I do not see how to proceed if that is not the case.
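
To illustrate the last point, here is a minimal sketch. It assumes that the expected yield in a bin is `mu * n_mc`, so that the MC statistical uncertainty scales linearly with mu, and that the two uncertainties are independent and added in quadrature. Both are assumptions made for this example, not something taken from the workspace, and the function and variable names are hypothetical.

```python
import math


def total_sigma(mu, sigma_obs, sigma_mc):
    """Total Gaussian width for one bin.

    Assumes nexp = mu * n_mc, so the MC statistical uncertainty scales
    with mu, and adds it in quadrature to the data uncertainty.
    """
    return math.sqrt(sigma_obs**2 + (mu * sigma_mc) ** 2)


mu_values = (0.0, 1.0, 2.0)

# Large MC statistics: sigma barely changes with mu.
small_mc = [total_sigma(mu, sigma_obs=5.0, sigma_mc=0.1) for mu in mu_values]

# Limited MC statistics: sigma clearly depends on mu.
large_mc = [total_sigma(mu, sigma_obs=5.0, sigma_mc=3.0) for mu in mu_values]
```

With a small `sigma_mc` the width is effectively a constant, which is the easy case; with a large `sigma_mc` the width becomes a function of the parameter of interest.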

I hope this is somewhat understandable.

Cheers,
Falk

Hi,

It is still not clear to me what you mean by saying that the systematics are on the data. You have some observables x that can be modelled with a given pdf. It is clear that in the case of unfolding you cannot model this with a Poisson; you need a Gaussian or something else. Actually, I think it is more complicated, because the unfolding procedure correlates the bins, so you would need a multivariate Gaussian.

However, neglecting the correlation you should be able to model a bin like

Gaussian( nobs | nexp( mu, b1,…bn) , sigma ( mu, b1,…bn) ) * Gaus(b1) *… * Gaus(bn)

where mu is the parameter of interest and b1,…bn are the systematics. Each Gaussian constraint term must have only a single non-const parameter (the systematic), a fixed sigma, and a global observable (which must be set constant in the workspace).
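
For reference, the single-bin model above can be written out as a negative log-likelihood in plain Python. This is only a sketch under assumptions: `nexp` and `sigma` below are made-up response functions chosen for illustration, each constraint is a unit-width Gaussian with its global observable fixed at 0, and none of the names correspond to RooFit objects.

```python
import math


def gauss_logpdf(x, mean, sigma):
    """Log of a normalized Gaussian density."""
    return -0.5 * ((x - mean) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))


def nexp(mu, b):
    # Hypothetical response: nominal yield 100 scaled by mu,
    # with a 5% shift per unit of each nuisance parameter.
    return 100.0 * mu * (1.0 + 0.05 * sum(b))


def sigma(mu, b):
    # Hypothetical width: fixed data uncertainty plus a mu-dependent MC term.
    # In this toy it does not depend on b.
    return math.sqrt(8.0**2 + (2.0 * mu) ** 2)


def nll(nobs, mu, b, glob=None):
    """-log[ Gaussian(nobs | nexp, sigma) * prod_i Gaus(glob_i | b_i, 1) ]."""
    glob = [0.0] * len(b) if glob is None else glob
    main = gauss_logpdf(nobs, nexp(mu, b), sigma(mu, b))
    # Each constraint has one floating parameter (b_i); width and
    # global observable are fixed.
    constraints = sum(gauss_logpdf(g, bi, 1.0) for g, bi in zip(glob, b))
    return -(main + constraints)


nll_nominal = nll(nobs=100.0, mu=1.0, b=[0.0, 0.0])
nll_pulled = nll(nobs=100.0, mu=1.0, b=[1.0, 0.0])
```

Pulling `b1` away from its global observable costs a penalty from the constraint term, in addition to the change in the main Gaussian.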

Lorenzo

Hi Lorenzo,

Thanks again for your time. You are right, ideally I would use a multivariate Gaussian, but since its evaluation seemed very slow, I neglect my (small) correlations. What I meant by systematics on the observed number of events was that all my nuisance parameters influence the unfolded result nobs and not the expected number of events, so initially it looks like this:

Gaussian( nobs( b1, …bn ) | nexp( mu ) , sigma ( mu, b1,…bn ) ) * Gaus(b1) *… * Gaus(bn)

But since you can easily shift the variations from the observed to the expected side in a Gaussian, it can also be written like this:

Gaussian( nobs | nexp( mu, b1, … bn ) , sigma ( mu, b1,…bn ) ) * Gaus(b1) *… * Gaus(bn)
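
The equivalence of the two forms is easy to check numerically. This sketch assumes the variations enter additively as a shift `delta` on nobs; moving them to the expected side then flips the sign of the shift, since only the difference nobs - nexp enters the Gaussian. All numbers and names are made up for illustration.

```python
import math


def gauss_pdf(x, mean, sigma):
    """Normalized Gaussian density."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


nobs, nexp, sigma, delta = 95.0, 100.0, 8.0, 3.5

# Variation applied to the observed side ...
on_observed = gauss_pdf(nobs + delta, nexp, sigma)
# ... or, with opposite sign, to the expected side.
on_expected = gauss_pdf(nobs, nexp - delta, sigma)
```

Both values agree as long as sigma is the same in the two forms. This relies on the Gaussian depending only on the difference between its observable and its mean, so the same trick would not work for a Poisson term.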

My original problem was that now both nexp and sigma depend on mu in the first Gaussian. Isn’t this what makes the AsymptoticCalculator crash, since two parameters are non-constant?

Cheers,
Falk

I will check this in a simple model. It could be that the calculator cannot generate the Asimov data set in this case.

Cheers

Lorenzo
