Likelihood option of a Gaussian fit to unbinned data

Hi, Rooters,

My data is unbinned, with low statistics, and follows a Gaussian distribution.
The x-axis is the charge (measured directly, without known errors) and the y-axis is the number of events, typically fewer than 10.

I’ve tried both a chi-square and a likelihood Gaussian fit; the corresponding options are “R” and “LR”, where “R” means a range of roughly ±2 sigma.
The likelihood fit resulted in a smaller sigma, while the chi-square fit gave a bigger one.

In principle, I think these two fit options should give the same results; see, for instance, the section “The method of least squares” in the statistics chapter of the PDG (reviewed by G. Cowan).

However, the section “Fitter settings” of the ROOT Users Guide mentions explicitly: “The Binned Likelihood is recommended for bins with low statistics”.

My questions are:
(1) In my case, where low-statistics data follow a Gaussian distribution, the two fit options gave different results; the likelihood fit should be the better one, right?
(2) Why should the likelihood fit be better? (This seems inconsistent with the statistics chapter of the PDG.)

Thanks !

Best,
Junhui

ChiSquareVSBinnedLikeloodGaussianFit.pdf (22.7 KB)

Hi,

I’ll try to answer the question myself.
If anyone has comments, please feel free to post them here.

My understanding is: for a Gaussian fit with high statistics, chi-square and likelihood differ very little, while with low statistics the two methods differ significantly.
If the data have low statistics, their histogram usually has empty bins. This is exactly where the difference arises: the chi-square method only includes the bins with non-zero content, while the binned likelihood includes all of the bins, whether they are empty or not.

However, for a low-statistics histogram following a Gaussian distribution, it’s reasonable to assume that some bins really are empty.
Therefore, personally, I think it probably makes more sense to state it like this: for a Gaussian fit, if a histogram has empty bins, the binned likelihood option is encouraged (“low statistics” is somewhat vague language, because the binning also affects whether a bin is empty or not), while if a histogram has no empty bins, there is no difference between the two methods (chi-square or binned likelihood) for a Gaussian fit.

The attached PDF file is what my statement relies on.
The upper two histograms are identical Gaussians; fitted with the two methods, the results show no big difference. The lower two are likewise identical to each other: 20 events randomly selected from the upper ones.
Please pay special attention to the “ndf” number for the lower two histograms.
For the lower-left plot it is 13 = 16 (non-empty bins) - 3 (number of fit parameters); for the lower-right plot it is 97 = 100 (all bins) - 3 (number of fit parameters).

Best,
Junhui

Hi,

  1. What counts is not how the x values are distributed, but, given a bin, what the statistical distribution of the bin content is. If, given the expected number of events per bin (which will depend on the x distribution), the distribution of the observed events is a Poisson, then the likelihood method (option “L”) is the correct one.

  2. Again, low statistics refers to the bin contents. If the observed count per bin is large, the Poisson distribution can be approximated by a Gaussian distribution. In that case a likelihood method with a Gaussian distribution for the bin content is exactly equivalent to a least-squares method, provided you use the expected errors for the bin content (option “P”). Moreover, for high bin statistics the observed error (the square root of the bin content) becomes very similar to the expected error.

So, in conclusion, if you have bins with low statistics (fewer than 5 entries) and you are fitting counts, always use option “L”.
If you still have all the original x values, before filling the histogram, you could also do an unbinned likelihood fit.

For more information, apart from the statistics chapter in the PDG, I also recommend looking at chapter 2 of this book:
amazon.com/Data-Analysis-Hig … 3527410589

Best Regards

Lorenzo

Hi, Lorenzo,

Thanks for your detailed reply !
I appreciate it !

Before reading the book you recommended, I have two comments :
(1) In my second post, I was referring to unbinned data.
With binned data, the binning itself will create or destroy some empty bins.
In that case the empty bins in the histogram are not real empty bins but “fake” ones, meaning that not only do the bin positions in x shift, but the number of empty bins changes as well.
So I think the binning technique introduces additional errors.

(2) However, as I understand it, one of the drawbacks of the likelihood option is that the chi-square/ndf is not as meaningful as with the chi-square option. So people tend to use the chi-square option to get a hint of the fit quality by judging the value of chi-square/ndf.
Do you know how to evaluate the goodness of fit in the case of the likelihood option?

Thanks !

Best,
Junhui

Hi,

  1. If your data consist of a number of events measured per charge value, they represent a frequency distribution of the charge (i.e. they form a histogram). This is what is meant by binned data. Unbinned data would be just the set of individual charge measurements.

  2. When using a likelihood method to fit an histogram, you can do two things to evaluate the goodness of fit:

  • compute a chi2/ndf with the obtained best-fit function
  • if using option “L” in ROOT, use as chi2 twice the log-likelihood value returned from the fit. The log-likelihood is computed as described in Baker and Cousins, NIM 221 (1984) 437-442, and follows a chi2 distribution, so it can be used to judge the fit quality.

Instead, if you are doing an unbinned likelihood fit, you cannot use the likelihood value for the quality of the fit. You should either bin the data and do a chi2 test, or use the Kolmogorov-Smirnov or Anderson-Darling tests.

Best

Lorenzo

Hi, Lorenzo,

Thanks for your second post !

One thing needs to be corrected.
Previously I mentioned that my data is unbinned; that turns out to be wrong.
I have tried giving my histogram a very large number of bins, but there is no way to make it truly unbinned: some entries always end up sharing a bin.
So for my data, making it totally unbinned seems impossible.
Meanwhile, having large enough bins definitely benefits my analysis.

Another thing I need to understand more clearly is the goodness-of-fit method with the likelihood option.
From the paper you mentioned (thanks for sharing), the correct way to evaluate it is the so-called “chi-square_{lambda,p}” for the case of Poisson-distributed bins.
Combining this with your previous post, it looks like one can obtain “chi-square_{lambda,p}” by a calculation based on the fit results (with option “L”).
Unfortunately, I’m not sure I correctly understand this part of your previous post: “use as chi2 twice the log-likelihood value returned from the fit”.
Could you please clarify how to get “chi-square_{lambda,p}” from the fit results of option “L”?

Thanks !

Best,
Junhui

Hi,

In the case of a likelihood fit you can use, as a chi2 for the goodness of fit, twice the negative log-likelihood value. For example, you can do:

TFitResultPtr r = h1->Fit("gaus", "L S");
double chi2 = 2 * r->MinFcnValue();

Cheers

Lorenzo

Hi, Lorenzo,

Thanks for your recent reply !

I have two questions and one extra comment.
1) The meaning of option “S”.
In your code, there is the option “L S”.
I guess “L” is likelihood, but what does “S” stand for? (I can’t find it in the Users Guide.)

2) “L” and “LL”.
From the ROOT Users Guide, there are two fit options, “L” and “LL”.
For “LL”, it says: “An improved Log Likelihood fit in case of very low statistics and when bin contents are not integers”.
Can I interpret it like this: option “LL” corresponds to a fit using equation (2.89) in chapter 2 of the wonderful book you recommended, while option “L” uses a general one like (2.5)?

3) A comment on chapter 2 of the book.
I haven’t finished the whole book yet, but chapter 2 is the best treatment of this topic I have ever read, bar none :slight_smile:.
I also have a suggestion: if I understood correctly, option “LL” should correspond to (2.89) in section 2.5.3. If so, it probably makes sense to add a footnote on that page indicating explicitly how to apply this equation with ROOT.
Often, readers want to know not only the principle of a method but also how to use it, and I guess most readers of the book use ROOT.

Best,
Junhui

Hi,

  1. The option “S” is documented in the updated version of the Users Guide, which unfortunately is not yet online. You can also find it documented here:
    root.cern.ch/root/html/TH1.html#TH1:Fit@1

  2. Option “L” corresponds to eq. 2.89 in the book. The “LL” case is unfortunately not documented in the book; it is the case of a weighted histogram. Weighted likelihood fits are documented in another book: F. James, “Statistical Methods in Experimental Physics”, second edition, paragraph 8.5.1.

  3. Thanks for the nice comments about the book!

Best Regards

Lorenzo

Hi, Lorenzo,

Thanks for your reply again !

Still, I have one more question; thanks in advance for your patience :slight_smile:.
If, as you mentioned, option “L” corresponds to eq. 2.89 in the book, then which ROOT options correspond to eqs. 2.5 and 2.80?

Or let me put it this way: given a histogram to be fit, with the option fixed to “L”, how does ROOT pick the appropriate equation, (2.5), (2.89), or (2.80), to do the fit?

Perhaps each histogram carries flags describing how it was binned, and a switch-like mechanism then dispatches to the equation that should be used? However, on this page, root.cern.ch/root/html/TH1.html, I can’t find such flags or functions. Could you please give me a hint on this?

Thanks !

Best,
Junhui