Fitting histogram with a Binomial Distribution

Massimo_Girola · November 20, 2020, 11:25pm

I am trying to fit an histogram which is supposed to follow a Binomial distribution.
I know the number of trials (i.e. the n parameter of the Binomial distribution) and I have a very good initial value for the probability (parameter p of the Binomial distribution).
In particular I have: n = 1000, p = 0.994 (Notice that gaussian approximation is not good in this case).

I am using pyROOT

I have tried with the following code:

import ROOT
c2 = ROOT.TCanvas("c2","c2",600,400)
values = [995.0, 998.0, 997.0, 987.0, 997.0, 993.0, 994.0, 992.0, 993.0, 992.0, 994.0, 986.0, 993.0, 991.0, 995.0, 995.0, 999.0, 990.0, 995.0, 993.0, 993.0, 998.0, 995.0, 992.0, 998.0, 992.0, 994.0, 988.0, 994.0, 993.0, 994.0, 993.0, 990.0, 988.0, 998.0, 991.0, 994.0]
histo = ROOT.TH1F("Binomial","Binomial",1001-986,986-0.5,1000+0.5)
for val in values:
    histo.Fill(val) 
histo.Scale(1./histo.GetEntries())
c2.cd()
histo.Draw()
f1 = ROOT.TF1("Bin","TMath::Binomial(1000, x)*TMath::Power([0], x)*TMath::Power(1-[0], 1000-x)",0,1000)
f1.SetParameter(0,0.994)
histo.Fit("Bin")

but, even if the fit converges and the f1.GetProb() is good, the output plotted function looks not ok: it has a series of peaks over each histogram bin.
I can’t figure out how to solve it.
Thank you in advance.

_ROOT Version: 6.18/00
_Platform: Ubuntu 20.10
python, pyROOT

Massimo_Girola · November 22, 2020, 9:40am

Ok, I think I spotted the problem: x in the TF1 function is a continuous variable which does not make very much sense for a discrete binomial distribution.
So I solved the problem using:

TMath::Nint(x) //instead of x

so that it rounds up to integer value.

The corrected TF1 becomes:

f1 = ROOT.TF1("Bin","TMath::Binomial(1000, TMath::Nint(x))*TMath::Power([0], TMath::Nint(x))*TMath::Power(1-[0], 1000-TMath::Nint(x))",0,1000)

now the plot looks better than before.
I am still not sure if the fitting algorithm is working properly though.

Thank you all.

dastudillo · November 22, 2020, 10:37am

You probably want to normalise by the integral, not by entries:

histo.Scale(1./histo.Integral())

Massimo_Girola · November 22, 2020, 10:44am

Oh yeah you are right,
thanks a lot.

But now I have encountered another problem: I think that in this case the sqrt(N) errors are inappropriate and I would like to fill the histogram with:

err_val = math.sqrt(n*p*(1.-p)) #where n is #of trials and p is estimated from data themselves
histo.Fill(val,err_val)

but it seems like it does not take into account the specified weights: whatever value I write the error bars doesn’t change.
Am I doing something wrong?
Thanks again.

dastudillo · November 22, 2020, 11:02am

Search for Sumw2 in the forum or documentation. Maybe this one can help.

Massimo_Girola · November 22, 2020, 11:12am

Thanks a lot
I have solved using:

    err_Nt = math.sqrt(1000*p_expected*(1.-p_expected))
    histo.Sumw2()
    for Nt in values:
        histo.Fill(Nt)
        histo.SetBinError(Bern.FindBin(Nt), err_Nt)

Now everything seems to work properly.
Next step will be to try to do the same thing but using asymmetric errors, but that is another story.

Thanks a lot again.

system · December 6, 2020, 11:16am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.