Chi-square definition in ROOT

{
  double Bin_edges[] =
    {0., 0.05, 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.45, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 1.};
  double Bin_heights[] =
    {0, 0, 0, 0, 0, 1, 3, 16, 17, 14, 18, 18, 7, 5, 1, 0, 0, 0, 0, 0};
  TH1D *h = new TH1D("h", "h", sizeof(Bin_edges) / sizeof(double) - 1, Bin_edges);
  for (int i = 0; i < sizeof(Bin_heights) / sizeof(double); i++)
    h->SetBinContent(i + 1, Bin_heights[i]);
  h->Sumw2(kFALSE); h->ResetStats(); // "statistics" from bin content
  gStyle->SetOptFit();
  TF1 *f = (TF1*)gROOT->GetFunction("gaus");
  h->Fit(f, ""); // "" = Neyman chi2, "P" = Pearson chi2, "L" = log-likelihood
  // note: below it is assumed that "bin error" = sqrt("bin content")
  double chi2_N = 0.;
  for (int i = 1; i <= h->GetNbinsX(); i++) {
    double v = h->GetBinContent(i);
    if (v) chi2_N += TMath::Sq(f->Eval(h->GetBinCenter(i)) - v) / TMath::Abs(v);
  }
  std::cout << "hand calculated Neyman chi2 = " << chi2_N << "\n";
  // note: below it is assumed that "bin error" = sqrt( f("bin center") )
  double chi2_P = 0.;
  for (int i = 1; i <= h->GetNbinsX(); i++) {
    double v = f->Eval(h->GetBinCenter(i));
    if (v) chi2_P += TMath::Sq(v - h->GetBinContent(i)) / TMath::Abs(v);
  }
  std::cout << "hand calculated Pearson chi2 = " << chi2_P << "\n";
  // note: below Baker-Cousins binned log-likelihood is used (Poisson-distributed "bin content" is assumed)
  double nll = 0.;
  for (int i = 1; i <= h->GetNbinsX(); i++) {
    double vh = h->GetBinContent(i);
    double vf = f->Eval(h->GetBinCenter(i));
    if (vf) {
      nll += vf - vh;
      if (vh) nll += vh * TMath::Log(TMath::Abs(vh / vf));
    }
  }
  std::cout << "hand calculated negative log-likelihood nll = " << nll << "\n";
  std::cout << "hand calculated Baker-Cousins likelihood chi2 = " << 2. * nll << "\n";
}

Run the above macro three times, changing the fit option ("" = Neyman chi2, "P" = Pearson chi2, "L" = log-likelihood).
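
For reference, the three quantities the macro computes by hand can be written out as follows. This is only a transcription of the loops above, with the assumed notation $n_i$ for the content of bin $i$ and $f_i$ for the fitted function evaluated at the center of bin $i$ (the macro also takes absolute values in the denominators and inside the logarithm, which makes no difference for non-negative contents and function values):

```latex
\chi^2_{\mathrm{Neyman}} = \sum_{i:\, n_i \neq 0} \frac{(f_i - n_i)^2}{n_i},
\qquad
\chi^2_{\mathrm{Pearson}} = \sum_{i:\, f_i \neq 0} \frac{(f_i - n_i)^2}{f_i}

\mathrm{nll} = \sum_{i:\, f_i \neq 0} \left[ f_i - n_i + n_i \ln\frac{n_i}{f_i} \right],
\qquad
\chi^2_{\mathrm{Baker\text{-}Cousins}} = 2\,\mathrm{nll}
```

In the nll sum, the logarithmic term is dropped for bins with $n_i = 0$, as in the macro.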

See also: TH1::Fit

@yamabuki-shan Always try to post a macro that “reproduces” your problem.

@moneta It seems that the “chi2” is wrongly calculated when log-likelihood is used.
Try my macro with “h->Fit(f, "L");”.
It returns “FCN=4.95327” which agrees with the “hand calculated negative log-likelihood nll”.
Then the “hand calculated Baker-Cousins likelihood chi2 = 9.90654” agrees with “h->Chisquare(f, "L")” (the same as “2 * FCN”).
However, “f->GetChisquare()” returns “8.2006239” (the value that also appears in the drawn stat box).
This is neither the “Neyman chi2 = 7.94587” (same as “h->Chisquare(f, "")”) nor the “Pearson chi2 = 9.55313”.
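
To check such numbers without ROOT, here is a minimal sketch of the three definitions as plain C++ functions. The function names and the vector-based interface are hypothetical (they are not ROOT API); the formulas are the ones from the hand calculations in the macro, with `n[i]` standing for the bin content and `mu[i]` for the function value at the bin center.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helpers, not part of ROOT.
// n[i]  = observed content of bin i
// mu[i] = model (fit function) value at the center of bin i

// Neyman chi2: the variance is estimated by the observed content,
// so empty bins (n == 0) are skipped.
double neyman_chi2(const std::vector<double>& n, const std::vector<double>& mu) {
  assert(n.size() == mu.size());
  double chi2 = 0.;
  for (std::size_t i = 0; i < n.size(); ++i) {
    if (n[i] != 0.) chi2 += (mu[i] - n[i]) * (mu[i] - n[i]) / std::fabs(n[i]);
  }
  return chi2;
}

// Pearson chi2: the variance is estimated by the model value,
// so bins where the model is exactly zero are skipped.
double pearson_chi2(const std::vector<double>& n, const std::vector<double>& mu) {
  assert(n.size() == mu.size());
  double chi2 = 0.;
  for (std::size_t i = 0; i < n.size(); ++i) {
    if (mu[i] != 0.) chi2 += (mu[i] - n[i]) * (mu[i] - n[i]) / std::fabs(mu[i]);
  }
  return chi2;
}

// Baker-Cousins likelihood chi2: twice the Poisson negative log-likelihood
// ratio; the logarithmic term is dropped for empty bins.
double baker_cousins_chi2(const std::vector<double>& n, const std::vector<double>& mu) {
  assert(n.size() == mu.size());
  double nll = 0.;
  for (std::size_t i = 0; i < n.size(); ++i) {
    if (mu[i] != 0.) {
      nll += mu[i] - n[i];
      if (n[i] != 0.) nll += n[i] * std::log(std::fabs(n[i] / mu[i]));
    }
  }
  return 2. * nll;
}
```

For a single bin with content 4 and model value 6, these give 1 (Neyman), about 0.667 (Pearson) and about 0.756 (Baker-Cousins), so the three definitions disagree even on identical inputs. An empty bin with model value 2 contributes 0, 2 and 4 respectively.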
