I have a comprehension question concerning the Asimov dataset used in the asymptotic formulae (Eur. Phys. J. C 71:1554 (2011)).
I am currently doing a hypothesis test and I’m amazed how much faster the AsymptoticCalculator is than the conventional MC-based tests.
I read the paper carefully and tried to understand the source code, but I wasn’t able to figure out the point of generating the Asimov dataset. As I understand it, the nuisance parameters are set to their expectation values, which are simply their best-fit values for the null and alt hypotheses (in the case of a hypothesis test for discovery, this would correspond to the conditional maximum likelihood for mu=0 and the maximized likelihood for mu=mu’). Is that right?
Wouldn’t I then be able to compute the non-centrality parameter from the equations (29) and (32) without using Asimov data? I.e. take the observed test statistic instead of the one obtained from the Asimov dataset.
You cannot compute the non-centrality parameter, Lambda, without the Asimov data set, because you don’t know the value of sigma. As described just before eq. (29) in the paper, the sigma obtained from the second derivatives is not correct, because it also depends on mu’.
You cannot use the observed likelihood value to obtain sigma (for example from equation (17)), because the sigma in (17) might be different, given this mu dependence.
So, basically, the Asimov data set is a trick which allows you to estimate the non-centrality parameter using
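As a rough sketch of that relation (my reading of the paper around eqs. (29) to (32), not code taken from RooStats): the Asimov value of the test statistic, q_mu,A, fixes sigma through sigma_A^2 = (mu - mu')^2 / q_mu,A, and the non-centrality parameter is then Lambda = (mu - mu')^2 / sigma^2. The function names below are hypothetical and the numbers are made up for illustration.

```python
import math

def sigma_from_asimov(mu, mu_prime, q_mu_asimov):
    """sigma_A = |mu - mu'| / sqrt(q_mu,A), with q_mu,A evaluated on the Asimov data."""
    return abs(mu - mu_prime) / math.sqrt(q_mu_asimov)

def noncentrality(mu, mu_prime, sigma):
    """Lambda = (mu - mu')^2 / sigma^2."""
    return (mu - mu_prime) ** 2 / sigma ** 2

# Illustrative numbers only: testing mu = 0 against mu' = 1 with q_0,A = 4
sigma_a = sigma_from_asimov(0.0, 1.0, 4.0)
print(sigma_a, noncentrality(0.0, 1.0, sigma_a))
```

By construction, plugging sigma_A back in returns Lambda = q_mu,A, which is why the Asimov likelihood ratio is all that is needed.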
Thank you for the clarification. I get that point now, but I’m still puzzled about the properties and also the definition of the Asimov dataset. I’m worried that I’m missing a very fundamental point here.
For example, I’m wondering why the Asimov profile likelihood ratio returns a different value from the usual one?
Is the condition in eq. (23) the main point to it? I.e. do the best fit values of the nuisance parameters to the data not fulfil this condition?
The Asimov profiled likelihood ratio returns a different value because it is evaluated not on the observed data, but on the Asimov expected data, the one fulfilling eq (23).
It is true that eq (23) is used to find the ML estimate. But here we use it not to find the best-fit values of the parameters but to find the values of the observables (n) which satisfy that equation, given the nuisance parameter values.
When making the Asimov data set, we first fix the value of mu, then we fit (on the data) the nuisance parameter values conditionally on that mu value.
Using these nuisance parameter values, we can find the observables as those values satisfying equation (23) for the given parameter point.
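A minimal sketch of this construction for a toy binned Poisson model, assuming expected counts nu_i = mu * s_i + b_i and assuming the background yields b_i have already been set to their conditional best-fit values (that fit is not shown). All names and numbers are hypothetical. Setting each bin content to its expectation makes the derivative of the log-likelihood vanish at mu', which is the condition discussed above.

```python
def asimov_counts(mu_prime, signal, background):
    """Asimov data: each bin content equals its expectation mu' * s_i + b_i."""
    return [mu_prime * s + b for s, b in zip(signal, background)]

def score_mu(mu, counts, signal, background):
    """d ln L / d mu for independent Poisson bins: sum_i (n_i / nu_i - 1) * s_i."""
    return sum((n / (mu * s + b) - 1.0) * s
               for n, s, b in zip(counts, signal, background))

# Hypothetical toy inputs (not taken from the thread)
signal = [2.0, 10.0, 3.0]
background = [50.0, 52.0, 49.0]
mu_prime = 1.0

n_asimov = asimov_counts(mu_prime, signal, background)
print(n_asimov)
# The score vanishes at mu = mu', so the ML estimate on Asimov data is mu' itself
print(score_mu(mu_prime, n_asimov, signal, background))
```

Note that the Asimov counts are not required to be integers; they are expectation values, not a sampled dataset.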
Thanks a lot, now I think I got it and I can call it a day with a clear conscience.
I now tried to reproduce Fig. 9 from the mentioned paper using the StandardHypoTestDemo on my own dataset.
For the attached plot I used the frequentist calculator with a one-sided profile likelihood test statistic and compared this to the results from the asymptotic calculator with the q_0 test statistic (defined in eq. (12) in the paper). For the red, green and blue curves I used formulas (47) and (48). The difference between the green and the blue curve is the non-centrality parameter, which I set to the Asimov-evaluated q_0,A (green) and to the observed q_0 from the frequentist evaluation (blue).
Is it true that the half chi-square distribution doesn’t depend on any information from the data, but solely on q_0? If so, why is the MC-generated data so different from the half chi-square?
For the Alt distribution I was quite surprised that using the observed q_0 as the non-centrality parameter reflects the MC data to some extent, but I think that this is just a coincidence.
I suppose that I used the “wrong” test statistic for the plot, i.e. there are differences between f(q_0|muhat) in an asymptotic and a frequentist framework. Also, the observed q_0 from the frequentist and asymptotic calculators differ significantly (6.88 vs. 13.77). Is this due to the different treatment of the nuisance parameters?
Basically, the answer to the following question should solve the others I had as well: how do I reproduce fig. 9 in the paper?
Best regards and thanks in advance,
HypoTestPlotFrequentist_4kTMC_os.pdf (20.3 KB)
I think there is a factor of 2 missing. The values of the test statistic displayed by the frequentist calculators are not q_0 but q_0/2 (i.e. just the negative log of the profile likelihood ratio instead of -2 log lambda). You are getting 6.88 as the observed value, which is 1/2 of 13.77.
You should take that into account when making figure 9. We should maybe apply this factor of 2 in RooStats to avoid this confusion.
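A small sketch of that conversion, using the two numbers quoted in this thread. The function names are hypothetical, and the significance formulas are the standard asymptotic ones for q_0 (Z = sqrt(q_0), p = 1 - Phi(Z)); nothing here calls RooStats.

```python
import math

def q0_from_nll_difference(delta_nll):
    """q0 = -2 ln lambda(0), i.e. twice the negative log of the likelihood ratio."""
    return 2.0 * delta_nll

def z_from_q0(q0):
    """Asymptotic discovery significance Z = sqrt(q0)."""
    return math.sqrt(max(q0, 0.0))

def p_value_from_q0(q0):
    """One-sided tail probability of a standard normal, 1 - Phi(sqrt(q0))."""
    return 0.5 * math.erfc(z_from_q0(q0) / math.sqrt(2.0))

# 6.88 is the value displayed by the frequentist calculator in this thread
q0 = q0_from_nll_difference(6.88)
print(q0, z_from_q0(q0), p_value_from_q0(q0))
```

The doubled value agrees with the 13.77 reported by the asymptotic calculator up to the rounding of 6.88.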
I should have noticed that factor of 2 in there, thank you!
Now everything looks very nice!
But I’m afraid I have another question, concerning the verbose output of the asymptotic calculator. It reads:
[#0] PROGRESS:Eval – poi = 0 qmu = 13.7736 qmu_A = 4.17273 sigma = 0 …
What is the meaning of qmu_A in there? I attached the plot with the correct q_0, where I used these values of qmu (blue) and qmu_A (green).
So I would conclude that the non-centrality parameter is qmu, which means that the PLR of the Asimov data returns the same value as the PLR evaluated on the data. Could this be due to the large dataset I used (approx. 50 k events)?
HypoTestPlotFrequentist_4kTMC_os_2q0.pdf (37.5 KB)
The output of the Asymptotic calculator should print the correct value of qmu_Asimov.
But looking at the plot, the observed value of qmu sits at the median of the alternate distribution, which means that it should be equal to qmu_Asimov. So I guess you are running the Asymptotic calculator with a different model configuration, or treating the systematics differently from the frequentist one, or there is a problem in making the Asimov dataset with your model.
I presume you will get very different expected p values between asymptotic and the frequentist calculators.
If you want, just send me your workspace as a ROOT file (with ModelConfig and dataset) so I can understand this better.
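For reference, the size of the mismatch can be read off the verbose line quoted earlier. The sketch below parses that line with a regular expression (the parsing helper is hypothetical, not part of RooStats) and compares the observed and expected significances, assuming the asymptotic relation Z = sqrt(q).

```python
import math
import re

# Verbose line as quoted in this thread
line = "[#0] PROGRESS:Eval – poi = 0 qmu = 13.7736 qmu_A = 4.17273 sigma = 0 …"

def parse_key_values(text):
    """Collect 'name = number' pairs from a log line into a dict of floats."""
    pattern = r"(\w+)\s*=\s*([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)"
    return {name: float(value) for name, value in re.findall(pattern, text)}

values = parse_key_values(line)
z_observed = math.sqrt(values["qmu"])    # from the test statistic on the data
z_expected = math.sqrt(values["qmu_A"])  # from the test statistic on the Asimov data
print(values)
print(z_observed, z_expected)
```

An observed significance this far above the expected one, for data that sit at the median of the alternate distribution, points at the Asimov dataset rather than at the data.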
I found the problem with your Asimov data set. The default number of bins used in your model (100) is too low.
By setting a higher number of bins (e.g. 200) you now get a correct expected significance (in agreement with the observed one).
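To illustrate why the binning matters, here is a toy calculation, not the model from this thread: a narrow Gaussian signal on a flat background, with no nuisance parameters, where the Asimov q_0 is computed from the binned expectations as q_0,A = 2 * sum_i [(s_i + b_i) ln(1 + s_i/b_i) - s_i]. All yields, the range, and the peak shape are assumptions chosen so that the peak is narrower than one bin at 100 bins.

```python
import math

def gauss_cdf(x, mean, width):
    """Cumulative distribution of a Gaussian with the given mean and width."""
    return 0.5 * (1.0 + math.erf((x - mean) / (width * math.sqrt(2.0))))

def q0_asimov(n_bins, n_sig=100.0, n_bkg=50000.0, lo=0.0, hi=100.0,
              mean=50.3, width=0.3):
    """Asimov q0 for mu' = 1: bin contents are set to s_i + b_i exactly."""
    step = (hi - lo) / n_bins
    b = n_bkg / n_bins  # flat background, same expectation in every bin
    q0 = 0.0
    for i in range(n_bins):
        left, right = lo + i * step, lo + (i + 1) * step
        s = n_sig * (gauss_cdf(right, mean, width) - gauss_cdf(left, mean, width))
        q0 += 2.0 * ((s + b) * math.log(1.0 + s / b) - s)
    return q0

for n_bins in (100, 200, 400):
    print(n_bins, round(q0_asimov(n_bins), 2))
```

In this toy, finer binning resolves the peak better and the Asimov q_0 grows, so a binning that is too coarse underestimates the expected significance.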
I have added to the trunk a way to change the number of bins for the Asimov dataset in the Asymptotic calculator. However, you can also do it by setting it directly in the workspace. Just do for example
Thank you very much for all your support! Now everything works fine.