About the fit method: Maximum Likelihood or Least Squares?

When I fit a histogram, I may do it like this: h->Fit("gaus").
Then I get a result like this:
FCN=224.62 FROM MIGRAD STATUS=CONVERGED

What is MIGRAD? I don't know how it works, or which method it uses: Maximum Likelihood or Least Squares.

Can anyone help me?

Thanks in advance.

You can find a good description of the Maximum Likelihood and Least Squares methods for fitting histograms in the PDG statistics chapter

pdg.lbl.gov/2004/reviews/statrpp.pdf

or in any statistics book for nuclear and particle physicists.

MIGRAD is the main minimization algorithm of Minuit; it is used to minimize either the chi-square function or the negative log-likelihood function.
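
For example, here is a minimal sketch of a ROOT macro (the histogram name and the random filling are just illustrative) showing the default chi-square fit and the binned log-likelihood fit selected with the "L" option of TH1::Fit:

[code]
// Minimal illustrative macro: chi-square vs. log-likelihood fit in ROOT
void fit_example() {
   // Example histogram filled with Gaussian random numbers
   TH1F *h = new TH1F("h", "example", 100, -5., 5.);
   h->FillRandom("gaus", 10000);

   // Default: least-squares (chi-square) fit; MIGRAD minimizes the chi-square
   h->Fit("gaus");

   // Option "L": binned likelihood fit; MIGRAD minimizes the negative log-likelihood
   h->Fit("gaus", "L");
}
[/code]

In both cases, the FCN value printed in the fit output is the value of the minimized function at the minimum found by MIGRAD.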

You can also find more information in the TMinuit documentation

root.cern.ch/root/html/TMinuit.h … escription

Best Regards,

Lorenzo

A little remark about the suggested paper (if I may):

those methods are particular cases of the general Bayesian framework,
so the criticism about the "subjectivity" of the Bayesian approach is not
entirely fair…

These methods are general; in principle they apply both to frequentist and Bayesian statistics.
The two differ in the interpretation. For example, take the maximum likelihood method: to estimate the unknown parameters p, you maximize the likelihood function L(p).

  • for a frequentist, the likelihood is not the p.d.f. of the parameters, but the joint p.d.f. of the data, evaluated with the data obtained in the experiment and regarded as a function of the parameters

  • in Bayesian statistics one can use the likelihood to get the p.d.f. for p, once the prior for the parameters p is known (via Bayes' theorem)

We generally apply this method within frequentist statistics, reporting at the end a parameter estimate and an error interval obtained from the fit, based on the likelihood function extracted from the data and without using any prior (or subjective input) for the parameters.
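
To make the distinction concrete, here is a sketch in standard notation (not taken from the original post): for independent measurements x_1, …, x_n with p.d.f. f(x; p), the same likelihood function is read in two ways.

[code]
% Likelihood from n independent measurements x_1,...,x_n with p.d.f. f(x;p)
L(p) = \prod_{i=1}^{n} f(x_i; p)

% Frequentist MLE: the value of p that maximizes L(p)
\hat{p} = \arg\max_p \, L(p)

% Bayesian reading: posterior p.d.f. for p via Bayes' theorem, given a prior \pi(p)
P(p \mid x_1,\dots,x_n) = \frac{L(p)\,\pi(p)}{\int L(p')\,\pi(p')\,dp'}
[/code]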

Cheers

Lorenzo

[quote="moneta"]These methods are general; in principle they apply both to frequentist and Bayesian statistics.
The two differ in the interpretation. For example, take the maximum likelihood method: to estimate the unknown parameters p, you maximize the likelihood function L(p).

  • for a frequentist, the likelihood is not the p.d.f. of the parameters, but the joint p.d.f. of the data, evaluated with the data obtained in the experiment and regarded as a function of the parameters

  • in Bayesian statistics one can use the likelihood to get the p.d.f. for p, once the prior for the parameters p is known (via Bayes' theorem)

We generally apply this method within frequentist statistics, reporting at the end a parameter estimate and an error interval obtained from the fit, based on the likelihood function extracted from the data and without using any prior (or subjective input) for the parameters.

Cheers

Lorenzo[/quote]

Sorry to contradict you, but (speaking e.g. about the Maximum Likelihood
Estimate, hereafter MLE) the answers given by the two approaches agree only
when the prior probability is sufficiently flat in the region where the
likelihood has its maximum. For this reason the criticism about the
"subjectivity" of priors is unfair: it is well established that looking only at
the likelihood amounts, in the Bayesian approach, to always choosing a flat
prior; but there are cases when this is not appropriate. So a method that takes
this into account is less subjective than one that denies priors but turns out
to be a special case.
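
To spell out the agreement condition being referred to (again just a sketch in standard notation): the posterior is proportional to the likelihood times the prior, so the posterior mode coincides with the MLE only when the prior is approximately flat where L(p) is appreciable.

[code]
% Bayes' theorem (up to normalization):
P(p \mid \mathrm{data}) \propto L(p)\,\pi(p)

% If \pi(p) \approx \mathrm{const} in the region where L(p) is appreciable,
% the posterior mode equals the maximum likelihood estimate:
\arg\max_p \, P(p \mid \mathrm{data}) = \arg\max_p \, L(p) = \hat{p}_{\mathrm{MLE}}
[/code]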

Second, the sentence "once the prior of the parameters p is known" makes no sense. The prior is a state of knowledge about the parameter to be inferred. The
parameter usually has a given "unknown" value, so what does it mean to say
that we know its prior?

Third, the likelihood has no probability meaning, as is always pointed out;
sometimes it is written that it "is proportional" to a probability, but such a
statement presupposes Bayes' theorem and a flat prior, otherwise
it is nonsense.
Anyway, as said before, the two methods agree on the point estimate
(namely, they give the same "best" value) when the prior is sufficiently vague;
but it is also well known that, because the likelihood is not a probability
distribution, interval estimation in the likelihood approach suffers from some
pathologies (see the "Optional Stopping" or "Stopping Rule
Paradox"), which are absent in the Bayesian approach.
Not to mention the beauty of the automatic Ockham's razor, among other things.

Sorry to have taken this thread off-topic, but the cited paper
was not only about MLE or Least Squares; it also covered the Bayesian
framework, and I felt the need to stress this point.

That said, everyone is free to choose whatever they please.

Greets, Germano

By "subjective" I was referring to the fact that you need to supply a prior in Bayesian statistics. A discussion of this, I think, is more philosophical and off-topic.

I cited the paper more for the definitions of MLE and Least Squares than as a reference on Bayesian methods. Anyway, if you have further criticisms of the paper, I would recommend contacting the author directly.

Greetings,

Lorenzo