Histogram "W" and "WW" fit options return meaningless errors of parameters

Wile_E_Coyote · March 9, 2020, 11:26am

When fitting a histogram with the “W” or “WW” fit options, the returned parameter errors are often (perhaps usually) far smaller than they should be; in the example below they are too small by roughly a factor of ten.
I don’t really know how to get them right (or “correct” them somehow).

This problem does not appear when one fits a TGraph or TGraphErrors with the “W” option.

It can clearly be seen with the attached small test macro.
It creates and fills a simple “h” histogram and then several fit options are tried. Then a corresponding TGraphErrors “g” is created and again fitting with and without “W” is tried.
Looking just at the very first parameter (the “area” of the peak), one gets (“p” is either the histogram “h” or the corresponding TGraphErrors “g”):

p option: chi2          : area +/- error
h B+    : 13.5638       : 493.391 +/- 15.7093
h WB+   : 839.915       : 493.94 +/- 1.63508 <- much too small
h WWB+  : 839.915       : 493.94 +/- 1.63508 <- much too small
h LB+   : 14.9345       : 500 +/- 15.8114
h WLB+  : 14.9345       : 500 +/- 15.8114
g B+    : 13.5638       : 493.391 +/- 15.7093
g WB+   : 839.947       : 493.938 +/- 11.4928 <- perfectly fine

gauss_area_errors.cxx (2.7 KB)

moneta · March 9, 2020, 12:26pm

Hi,

The options “W” and “WW” set all the bin errors to 1. If you have a histogram representing counts, this is not correct, so it makes sense that the resulting parameter errors are not correct either. This option should probably be removed, because it does not make much sense to me.
It has been there since the beginning and I personally don’t know why it is there.
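
To illustrate what setting every bin error to 1 does to the chi-square, here is a minimal standalone C++ sketch. It is not ROOT code: the function name `chiSquare` and the choice to skip empty bins in the Poisson case are assumptions made only for this example. It evaluates the same residuals once with unit errors and once with sqrt(n) errors.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Chi-square of bin contents against model predictions.
// unitErrors == true mimics "all bin errors set to 1";
// otherwise each bin uses the Poisson estimate sigma_i = sqrt(n_i).
double chiSquare(const std::vector<double>& counts,
                 const std::vector<double>& model,
                 bool unitErrors)
{
    assert(counts.size() == model.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // Assumption of this sketch: empty bins are skipped when
        // Poisson errors are used, because sqrt(0) gives no usable error.
        if (!unitErrors && counts[i] <= 0.0) continue;
        const double sigma2 = unitErrors ? 1.0 : counts[i];
        const double residual = counts[i] - model[i];
        chi2 += residual * residual / sigma2;
    }
    return chi2;
}
```

For bin contents {100, 400} and model values {90, 420}, the unit-error sum is 500 while the Poisson-error sum is 2, so the chi-square scale (and with it the raw parameter errors) changes drastically once the real errors are dropped.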

In the case of TGraph fitting, errors may be absent (when fitting a TGraph rather than a TGraphErrors). One can then assume an equal weight for each point (e.g. W=1). The difference, however, is that in this case the fit parameter errors are re-normalized using the obtained chi2 value.
This explains why the graph errors come out almost correct: the value is very close to what you would obtain using the correct error for each data point (bin).
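
The re-normalization idea can be sketched outside ROOT with an unweighted straight-line fit. The names `LineFit` and `fitLineUnweighted` are hypothetical, and the scale factor sqrt(chi2/ndf) is the conventional textbook choice, assumed here rather than taken from the ROOT sources. The raw errors come from the normal equations with every sigma set to 1; the scaled errors multiply them by the scale factor.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Result of an unweighted least-squares fit of y = a + b*x.
struct LineFit {
    double a = 0.0, b = 0.0;              // intercept and slope
    double rawErrA = 0.0, rawErrB = 0.0;  // errors assuming sigma_i = 1
    double chi2 = 0.0;                    // sum of squared residuals
    int ndf = 0;                          // points minus fitted parameters
    double scaledErrA = 0.0, scaledErrB = 0.0; // raw errors * sqrt(chi2/ndf)
};

LineFit fitLineUnweighted(const std::vector<double>& x,
                          const std::vector<double>& y)
{
    assert(x.size() == y.size() && x.size() > 2);
    const double n = static_cast<double>(x.size());
    double sx = 0.0, sxx = 0.0, sy = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sx += x[i];
        sxx += x[i] * x[i];
        sy += y[i];
        sxy += x[i] * y[i];
    }
    const double det = n * sxx - sx * sx;
    assert(det != 0.0);

    LineFit f;
    f.a = (sxx * sy - sx * sxy) / det;
    f.b = (n * sxy - sx * sy) / det;
    // Diagonal of the covariance matrix for unit errors.
    f.rawErrA = std::sqrt(sxx / det);
    f.rawErrB = std::sqrt(n / det);

    for (std::size_t i = 0; i < x.size(); ++i) {
        const double r = y[i] - (f.a + f.b * x[i]);
        f.chi2 += r * r;
    }
    f.ndf = static_cast<int>(x.size()) - 2;

    // Re-normalize: the scatter of the points replaces the missing errors.
    const double scale = std::sqrt(f.chi2 / f.ndf);
    f.scaledErrA = f.rawErrA * scale;
    f.scaledErrB = f.rawErrB * scale;
    return f;
}
```

In the table quoted in the first post, the histogram “WB+” error (1.63508) and the graph “WB+” error (11.4928) differ by a factor of about 7.03, which is the size of correction such a scale factor would have to supply.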

Lorenzo

Wile_E_Coyote · March 9, 2020, 12:29pm

So, could you also, please, “re-normalize” the errors when histograms are fitted with the “W” and / or “WW” options?

moneta · March 9, 2020, 2:37pm

Yes, we could do that, although as I said before, option W should not be used for a histogram.
A TGraph is different: it is a set of points, and in the case of a TGraph (not a TGraphErrors) there are no errors.

Lorenzo

Wile_E_Coyote · March 9, 2020, 3:15pm

The “W” and “WW” options are extremely useful when histograms have bad errors (for any reason) and / or when one just wants the chi^2 fit without considering errors. Please do not remove them.

moneta · March 9, 2020, 5:25pm

I have opened a PR, https://github.com/root-project/root/pull/5114, to correct the errors also for TH1::Fit with option “W” (or “WW”).

Lorenzo

Wile_E_Coyote · March 9, 2020, 6:16pm

I did not check how it behaves for graphs, but I assume one needs to “re-normalize” the whole covariance matrix, not just the errors.

moneta · March 10, 2020, 9:26am

Yes, the covariance matrix is also corrected using the same factor, chi2/(ndf-1).
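
A minimal standalone sketch of what scaling the whole covariance matrix by one factor looks like. The helper name `rescaleCovariance` is hypothetical, and the divisor is deliberately left as a parameter: the post above quotes chi2/(ndf-1), whereas the reduced chi-square is conventionally chi2/ndf, and this sketch does not settle which one ROOT applies.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Multiply every covariance element by chi2 / divisor.
// The parameter errors (square roots of the diagonal) then scale by
// sqrt(chi2 / divisor), while the correlation coefficients are unchanged.
Matrix rescaleCovariance(const Matrix& cov, double chi2, double divisor)
{
    assert(divisor > 0.0 && chi2 >= 0.0);
    const double factor = chi2 / divisor;
    Matrix scaled = cov;
    for (std::size_t i = 0; i < scaled.size(); ++i) {
        assert(scaled[i].size() == scaled.size()); // matrix must be square
        for (std::size_t j = 0; j < scaled[i].size(); ++j) {
            scaled[i][j] *= factor;
        }
    }
    return scaled;
}
```

Because the diagonal and off-diagonal elements receive the same factor, the correlation between two parameters, cov(i,j)/sqrt(cov(i,i)*cov(j,j)), is the same before and after the correction.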

Lorenzo
