Specifically, to build the likelihood I am using the Poisson likelihood chi-square on page 3 of the following reference:
where I add penalty terms to account for the systematic deviations of the parameters. The predicted number of events in each bin is computed by weighting the MC templates with the parameters.
I do not know if there is a safer way to implement the likelihood.
The procedure is fine. I hope you compute the log directly, i.e. the sum of mu - n*log(mu) for each bin, where mu is the number of expected events and n the number of observed ones.
When summing I would recommend using Kahan summation, which minimises the numerical error in the sum. We have a class for doing it in ROOT,
see https://root.cern.ch/doc/master/classROOT_1_1Math_1_1KahanSum.html
This might help in computing the likelihood with more precision.
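For reference, compensated summation is simple enough to sketch inline. Below is a minimal, illustrative version of the technique that ROOT::Math::KahanSum implements; the class name and interface here are made up for the example, not ROOT's:

```cpp
#include <cmath>

// Minimal sketch of compensated (Kahan) summation, the technique behind
// ROOT::Math::KahanSum. The accumulator name is illustrative only.
struct KahanAccumulator {
    double sum = 0.0;  // running sum
    double c   = 0.0;  // compensation for lost low-order bits

    void Add(double x) {
        double y = x - c;    // correct the input by the carried error
        double t = sum + y;  // low-order bits of y may be lost here
        c = (t - sum) - y;   // recover what was lost
        sum = t;
    }
    double Result() const { return sum; }
};
```

For instance, adding 10000 ones to 1e16 with a plain double leaves the sum unchanged, since 1.0 is below the spacing of doubles at that magnitude, while the compensated sum recovers all of them.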
yes, I compute the terms for each bin, but I sum them within a for loop. I also perform other sums in for loops, for instance to get the predicted events for each bin.
Moreover, I directly multiply a matrix M (TMatrixD) and a vector v (TVectorD) to get the penalty term (something like v*M*v). I will replace this operation and the summations within the for loops with the Kahan summation and post the result of the test.
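One way to fold the quadratic-form penalty into a compensated sum is to accumulate the v*M*v products term by term instead of using the TMatrixD/TVectorD operator products. A sketch with plain std::vector standing in for the ROOT types (the function name is my own):

```cpp
#include <vector>
#include <cstddef>

// Accumulate the penalty term v^T M v with Kahan-compensated summation,
// one (i, j) product at a time, rather than via matrix operators.
double penaltyQuadForm(const std::vector<std::vector<double>>& M,
                       const std::vector<double>& v) {
    double sum = 0.0, c = 0.0;            // running sum + compensation
    for (std::size_t i = 0; i < v.size(); ++i)
        for (std::size_t j = 0; j < v.size(); ++j) {
            double y = v[i] * M[i][j] * v[j] - c;
            double t = sum + y;
            c = (t - sum) - y;            // error lost in sum + y
            sum = t;
        }
    return sum;
}
```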
To simplify, I just consider the Poisson terms (switching off the penalty contributions), so I am performing the Kahan sum of the mu - n + n*log(n/mu) terms (mu is the number of expected events and n the number of observed ones).
The output in this case is the one I reported below, and the profile likelihood looks like this around the minimum:
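As a concrete sketch, the per-bin accumulation described above can look like the following; the function name and the choice to skip the log term for empty bins are my own assumptions, not the original code:

```cpp
#include <cmath>
#include <vector>
#include <cstddef>

// Kahan-compensated sum of the per-bin terms  mu - n + n*log(n/mu),
// where mu is the expected and n the observed number of events.
double poissonLikelihoodSum(const std::vector<double>& mu,
                            const std::vector<double>& n) {
    double sum = 0.0, c = 0.0;                    // running sum + compensation
    for (std::size_t i = 0; i < mu.size(); ++i) {
        double term = mu[i] - n[i];
        if (n[i] > 0.0)                           // n*log(n/mu) -> 0 as n -> 0
            term += n[i] * std::log(n[i] / mu[i]);
        double y = term - c;
        double t = sum + y;
        c = (t - sum) - y;
        sum = t;
    }
    return sum;
}
```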
@moneta or any other expert that might help: do you think there is something else I could try to make the fit work with such large bin contents? Or should I change method, e.g. Bayesian with marginalization of the nuisances? But I am not sure this wouldn’t reproduce the same problem…
I have also tried a sort of internal rebinning (internal to the FCN, splitting the content of each bin so as to reduce the statistics), but this does not seem to help.
Can you try the summation of mu - n*log(mu)? It should be better than doing mu - n + n*log(n/mu), where you are adding a useless constant term (i.e. not depending on mu).
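The dropped constant is n*log(n) - n, which does not depend on mu, so both forms share the same minimum; a quick numeric check of that claim:

```cpp
#include <cmath>

// Two equivalent per-bin objectives: they differ only by n*log(n) - n,
// a constant in mu, so minimizing either gives the same result.
double termShort(double mu, double n) { return mu - n * std::log(mu); }
double termFull(double mu, double n)  { return mu - n + n * std::log(n / mu); }
```

For any fixed n, termFull(mu, n) - termShort(mu, n) evaluates to n*log(n) - n regardless of mu.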
yes @moneta, sorry, I forgot to mention: I tried both mu - n*log(mu) and also a simple chi2, (n - mu)**2/n (since I thought this should be the case in the end, given the high statistics). Unfortunately, in both cases the problem persists.
With the simple chi2 it “seems” that the fit converges (see below), but when I perform the profiling I get a shifted minimum (it should be b0 = 0.6 from the fit; the same thing happens for b1):
I am not sure I have understood exactly what is happening here. Are you computing the profile likelihood as a function of b0, minimising all other parameters as is done within Minos?
Have you tried computing the Minos errors?
If you can post the program reproducing the problem it would be useful
I am not sure I have understood exactly what is happening here. Are you computing the profile likelihood as a function of b0, minimising all other parameters as is done within Minos?
yes, right. I am building the profile as a function of the parameter by fixing it and minimizing w.r.t. all other parameters. This is just for a cross-check and to get a feeling for the shape around the minimum.
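For illustration, the procedure on a toy function; the quadratic “likelihood” below is made up, and in the real fit the inner minimization is a full Migrad run over all remaining parameters:

```cpp
#include <cmath>

// Toy correlated quadratic with minimum at (b0, b1) = (0.6, 0.9).
double fcn(double b0, double b1) {
    double u = b0 - 0.6, v = b1 - 0.9;
    return u * u + v * v + 0.5 * u * v;
}

// Profile over b0: fix b0 and minimize fcn over b1. For this toy the
// inner minimum is analytic (d fcn / d b1 = 0  =>  v = -u/4); in the
// real fit this inner step is done by Migrad.
double profile(double b0) {
    double b1 = 0.9 - 0.25 * (b0 - 0.6);
    return fcn(b0, b1);
}
```

Scanning profile(b0) over a grid of fixed values traces out the parabola, whose minimum falls back at b0 = 0.6.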
Have you tried computing the Minos errors?
yes, I also compute Minos errors, and they look unrealistically small:
1-sigma minos errors:
par b0: 0.606954 -7.66074e-08 7.66074e-08
par b1: 0.896297 -4.00856e-08 4.00856e-08
If you can post the program reproducing the problem it would be useful
Sure, I will prepare a minimal working example reproducing the problem and post it. It is probable that I am missing something.
Then when running Minos, it would be interesting to see the log.
But how is the scan plot above, showing a minimum for b0 around 0.85, produced?
I think such small errors are expected given the large statistics. These Minos errors look to me consistent with the ones obtained from the Hessian matrix.
so, I uploaded a txt file with the log from Minos.
I think such small errors are expected given the large statistics. These Minos errors look to me consistent with the ones obtained from the Hessian matrix.
but then the values of the poi from the fit would be too far away from the true values, I mean, not compatible. The minimum from the “manual” scan looks more reasonable. Maybe I am getting confused…
When I run Hesse, I get something that I interpreted as not very reliable, since the covariance matrix status is 2. The log is in OUTPUT-1.
But how is the scan plot above, showing a minimum for b0 around 0.85, produced?
This is produced by running Migrad for each point: the starting point is near the minimum found by the initial minimization (for b0 the value is 0.6), and then I iterate over b0 values that move away from the minimum.
Thank you for the log. I see that Minos failed for both b0 and b1 (lower error). I am not sure whether the upper error determination works.
The scan figure of chi2 vs b0 cannot be right. The minimum should be at b0 ~ 0.6, and if the errors are so small the parabola should be much narrower around the minimum.
I think I would need to run your code to see what is really happening.
I managed to set up a test code that reproduces what I am doing.
Within fitTest, the fN parameter allows one to set the normalization coefficient to reproduce the expected statistics for data. There are two values, one for low statistics, where the fit is working (the likelihood profiles are as I would expect them), and one for high statistics, where the fit shows the behaviour I discussed.
In Chi2FCN there is the possibility to use a simple chi2 or the mu - n + n*log(n/mu) form (I did not manage to make mu - n*log(mu) converge).
I see your code performs several fits, and many of them fail because of precision issues. It is difficult for me to find the problem you mentioned before with Minos.
Can you please simplify the code, so I can exactly reproduce the Minos problem?
Thank you
I see your code performs several fits, and many of them fail because of precision issues. It is difficult for me to find the problem you mentioned before with Minos.
yes, right. The code I provided also includes the likelihood profiling, which performs several fits;
Can you please simplify the code, so I can exactly reproduce the Minos problem?
yes, I am attaching a new version in which only the fit and Minos analysis are performed;
please note that @moneta will be on vacation for the next two weeks. If you check in once a week by writing a short message, the topic won’t get closed.
If you have a CERN account, you can also create a JIRA ticket on https://root.cern (button with the bug at the very bottom), and add a link to this thread.