Unexpected behaviour when turning the binned likelihood fit on/off in the context of reordering samples

Dear experts,

we are running profile likelihood fits using HistFactory-based machinery, and we see strange results when we simply change the order of the samples while creating the workspace. Tracking the problem down, we identified that the “BinnedLikelihood” option makes a big difference to the obtained values of the correlation matrix. We attach a simple macro that reads two workspaces (they differ only in the order of the samples), where we see different correlation-matrix elements depending on the sample order. The differences are much bigger when BinnedLikelihood is turned on than when it is turned off. In the attached slides we provide the values obtained by running the macro with different minimisation strategies.

Is this behaviour expected? When is it safe to use the BinnedLikelihood option? Please note that the differences in the correlation-matrix elements are large, sometimes even in the first significant digit.

We ran the tests with ROOT 6.18.04.

Cheers,
Philipp

Slides_Root_Forum.pdf (80.5 KB)
simpleFit.cxx (2.8 KB)
WmunuselMuon_Pt_eta1_min_cut_7_BF_combined_WmunuselMuon_Pt_eta1_min_cut_7_BF_model.root (57.0 KB)
WmunuselMuon_Pt_eta1_min_cut_7_SF_combined_WmunuselMuon_Pt_eta1_min_cut_7_SF_model.root (55.4 KB)

Hi @philtk74,

that’s a nice observation. At the moment, I can only state the obvious, i.e. that reordering floating-point computations (here, reordering samples) will not necessarily yield the same result. However, I wouldn’t expect such a large difference.
The BinnedLikelihood option, however, reorders the computations quite a bit, so it might have a larger effect. Are the central values and uncertainties comparable?

Hi,

we discussed this a bit, and the best explanation we can come up with is the reordering of floating-point computations. After all, most of the differences are compatible with such uncertainties. The big difference only shows up for a single parameter. This could be related to catastrophic cancellation, e.g. if one sample is small compared to the others. Can it be that the signal is very small compared to the backgrounds?

The reason that BinnedLikelihood might make a bigger difference is that it computes
binWidth * sum(samples)
whereas the classic workflow computes
sum(binWidth * sample)

To ensure that it’s not something else, could you post what Minuit prints at PrintLevel 1 or 2, that is:

  • Post-fit parameters with errors
  • EDM
  • NLL
  • Covariance matrix

Hi Stephan,

thank you very much for looking into this. I extended the slides and added the requested printouts. Let me know if you need more!

Cheers,
Philipp

Slides_Root_Forum.pdf (107.6 KB)

Hey,

we don’t understand these matrices. They are not the covariance matrices, so what are they? Could you try to get the covariance matrix from Minuit2?
Something like:

auto result = extmodel.fitTo(*data, RooFit::Range("LEFT"), RooFit::PrintLevel(2), RooFit::Save());
result->covarianceMatrix().Print();

If you have the time, could you run this with Minuit as well? There might be a bug with strategy 0 that produces this weird matrix (whatever it is).

Hey,

this is the correlation matrix, sorry! What type is “extmodel” in your code snippet? I also uploaded the script I use in my first post.

Cheers,
Philipp

Oh, that’s from a random tutorial. It’s anything that derives from RooAbsPdf, and you can ignore the Range parameter.

Hey,

this should hopefully include everything you asked for.

Cheers, Philipp

Slides_Root_Forum_V3.pdf (145.6 KB)

I had a look at the covariance matrices, and I would just like to confirm that the parameters in question (the ones with the largest differences) are the ones in columns 0 and 1.

Hi Stephan,
it’s not about specific parameters, but more about the significant differences themselves. This is just an example; we have also seen this behaviour with other inputs.
Cheers, Philipp

OK, we estimated that the differences are only significant for those two parameters. It’s strange, though, that they occur in this way, since the function values passed to Minuit are very similar. There is a global normalisation difference, but the offsetting of the NLL removes that at the beginning. This will need more investigation …
@moneta is looking into it.

Hi,
I have looked at your slides and your example, and I do not see big differences in the covariance matrix values. The small differences that are seen, especially in the off-diagonal terms, are in my opinion easily caused by numerical error, since the order in which the likelihood is computed is different.

I see you use strategy 0. I would recommend not using it unless you are only interested in the function minimum without errors. As you can see, with strategy 0 the errors are very different. This is expected, since those errors are computed before calling Hesse; after calling Hesse the errors are compatible.

Lorenzo

Hi Lorenzo,

thank you very much for looking into this. However, the main point is that when we look at the correlation matrix, not the covariance matrix, we see rather large differences depending on the order of the samples, as shown in the PDF from Philipp. From the values in the slides the differences are not small; they are of the order of a few % in the correlations. This can potentially become problematic when we use these values further down the line.

Furthermore, when the BinnedLikelihood optimisation is turned off, the best agreement in the correlations between the two sample orderings is obtained with strategy 0, which is very surprising to us. Using strategy 1 leads to larger differences.

Cheers,
Tomas

Hi Tomas,
the correlation matrix is just computed from the covariance matrix by normalising it.
Now I see a difference of around 10% in the errors between binned/not binned, which might then result in a ~10% (absolute) difference in the correlation values. I think this is not too bad.

If you need more precise correlation values, you might need a more precise model evaluation. One possibility is to redefine all parameters with a new scale such that their errors are close to 1; then the Hessian matrix is easier to invert.
Another thing you might try is to decrease the tolerance until Minuit tells you that it cannot reach such small values and is limited by the numerical precision.

Concerning your question about strategy 0, that looks to me more like a coincidence.

Lorenzo

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.