Unrealistic chisq values in minimisation

kaur · December 13, 2020, 2:48am

Hi
I am fitting a function for validation of a data set. The function has two fit parameters and one normalisation constant.
The fit results are as shown:

root [0]
Processing combine_NBD_epH1.c…
FCN=0.52839 FROM MINOS STATUS=SUCCESSFUL 20 CALLS 251 TOTAL
EDM=8.3977e-07 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 c 1.00068e+02 4.05848e+00 -1.43970e-01 5.74629e-04
2 k 3.22198e+00 5.35564e-01 9.82407e-03 4.29025e-03
3 <n> 2.45923e+00 8.90937e-02 8.90937e-02 2.63939e-04
Info in TCanvas::Print: ps file has been created

The question is:
The fit parameters look fine and match the published results, BUT the chisq/ndf value is 0.52839/11, which is extremely low and looks unrealistic. The published value is ~24/13.
I fail to find the reason.
Any help is appreciated.
Thanks

Dilicus · December 14, 2020, 7:36am

Hi,
can you provide a reproducer and a link to the published results? That way it is easier to help you.

For now I notice that you have 11 degrees of freedom instead of 13. What is the reason?
Maybe you have two missing points and they affect the chi square?

Cheers,
Stefano

kaur · December 15, 2020, 5:43am

Thanks a lot for addressing my problem. I am attaching the data file and the published result, along with the paper reference (https://arxiv.org/abs/hep-ex/9608011v1). The data are given in Table 3 and the fit results in Table 5 of the paper.
I sent the results for the first data set. I am also not sure how they get ndf =13 ?

1 < η* < 2

15.79 ± 0.52 ± 2.18
22.55 ± 0.50 ± 1.92
20.62 ± 0.46 ± 1.24
15.96 ± 0.43 ± 2.29
10.21 ± 0.37 ± 1.06
6.07 ± 0.30 ± 1.54
3.85 ± 0.26 ± 0.48
2.24 ± 0.16 ± 0.62
1.15 ± 0.12 ± 0.42
0.68 ± 0.10 ± 0.31
0.49 ± 0.12 ± 0.24
0.19 ± 0.06 ± 0.26
0.07 ± 0.04 ± 0.13
0.05 ± 0.03 ± 0.06

Thanks and regards
Kaur

kaur · December 15, 2020, 5:45am

Sorry I forgot to add the fit results from the paper.
<n> = 2.52 ± 0.10
1/K = 0.285 ± 0.080
chisq/ndf = 29.4/13

You can find these in the paper.

kaur · December 15, 2020, 7:02pm

Here is my program, together with a data file in which the systematic and statistical errors are added in quadrature.

#include "TCanvas.h"
#include "TROOT.h"
#include "TStyle.h"
#include "TMath.h"
#include "TGraphErrors.h"
#include "TF1.h"
#include "TLegend.h"
#include "TArrow.h"
#include "TLatex.h"
#include <cstdio>
#include <iostream>

using namespace std;
void Validate_NBD_epH1()
{

gStyle->SetOptFit(1);
// data files
const char *file1[4] = {"t1-epH1", "t2-epH1", "t3-epH1", "t4-epH1"};

const double file2[4] = {0,0,0,0}; // min. value of x
const double file3[4] = {13,17,18,18}; // max. value of x

// Initialising the parameter values: normalisation, k and <n>

const double file4[4] = {90.0,90.0,90.0,90.0};
const double file5[4] = {3.0,2.0, 3.0, 4.0};
const double file6[4] = {2.0, 2.0,2.0,2.0};

const char *file7[4] = {"t1-epH1", "t2-epH1", "t3-epH1", "t4-epH1"}; // output fnames

char name1[200]; char name2[200]; char name3[200];

for(int i=0;i<4;i++){

sprintf(name1, "%s.txt", file1[i]);
TGraphErrors graph(name1);

graph.SetTitle("Charged multiplicity distribution for ep-H1 (NBD)");
graph.SetMarkerStyle(kOpenCircle);
graph.SetMarkerColor(kBlue);
graph.SetLineColor(kBlue);
//graph.GetYaxis()->SetMaximum(0.15);

// Function NBD
TF1 f("f", "[0]*((TMath::Gamma(x+[1])*TMath::Power(([2]/[1]),x))/(TMath::Gamma(x+1)*TMath::Gamma([1])*TMath::Power((1+([2]/[1])),x+[1])))", file2[i], file3[i]);

f.SetParNames("c","k","<n>");

// f.SetParameter(0, file4[i]); // c (normalization constant)
// f.SetParameter(1, file5[i]); // k
// f.SetParameter(2, file6[i]); // <n>

  f.SetParameter(0, 90.0); // c (normalization constant)
  f.SetParameter(1, 2.0); // k
  f.SetParameter(2, 2.0); // <n>


graph.Fit(&f, "ME");

TCanvas* c1 = new TCanvas();

// sprintf(name2, "PNG/%s.png", file7[i]);
sprintf(name3, "PDF/%s.pdf", file7[i]);
graph.DrawClone("APE");
f.DrawClone("Same");

// c1->SaveAs(name2); // disabled: name2 is never filled while the PNG line above is commented out
c1->SaveAs(name3);

cout<<""<<endl;
cout<<""<<endl;

}
}

Data File

// N PROB ERX ERY
0.00 15.7900 0.00 2.2412
1.00 22.5500 0.00 1.9840
2.00 20.6200 0.00 1.3226
3.00 15.9600 0.00 2.3300
4.00 10.2100 0.00 1.1227
5.00 6.0700 0.00 1.5689
6.00 3.8500 0.00 0.5459
7.00 2.2400 0.00 0.6403
8.00 1.1500 0.00 0.4368
9.00 0.6800 0.00 0.3257
10.00 0.4900 0.00 0.2683
11.00 0.1900 0.00 0.2668
12.00 0.0700 0.00 0.1360
13.00 0.0500 0.00 0.0671
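
The ERY column above can be cross-checked against the two uncertainties quoted per point in the table posted earlier. The sketch below is plain C++ (no ROOT) and assumes that, in each row, the first uncertainty is the statistical one and the second the systematic one; the function name `combineInQuadrature` is only illustrative and not part of the macro.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Combine two independent uncertainties point by point in quadrature:
// total[i] = sqrt(stat[i]^2 + syst[i]^2).
std::vector<double> combineInQuadrature(const std::vector<double>& stat,
                                        const std::vector<double>& syst)
{
    assert(stat.size() == syst.size());
    std::vector<double> total(stat.size());
    for (std::size_t i = 0; i < stat.size(); ++i) {
        total[i] = std::hypot(stat[i], syst[i]);  // sqrt(a*a + b*b)
    }
    return total;
}
```

For the first point, 0.52 and 2.18 combine to about 2.2412, which is the ERY value in the first row of the data file. Since the second uncertainty dominates in almost every row, the combined errors are several times larger than the first one alone.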

Dilicus · December 15, 2020, 11:17pm

Sorry for the late reply, it was a busy day.

In your NBD you approximate p as k/<n>,
So the plus in this formula should be a minus, am I right?

TF1 f("f", "[0]*((TMath::Gamma(x+[1])*TMath::Power(([2]/[1]),x))/(TMath::Gamma(x+1)*TMath::Gamma([1])*TMath::Power((1 + ([2]/[1])),x+[1])))", file2[i], file3[i]);

In any case I suggest you have a look at the PDFs defined in ROOT. The NBD is among them, and your TF1 becomes simply

TF1 f("f","[0]*ROOT::Math::negative_binomial_pdf(    x,[2]/[1],x+[1])",0,13);

Also in this case the chisq of the fit is very different from the one shown in the publication. And I really cannot figure out how they count their NDF.
Given that for 1<eta*<2 and 150< W<185 they have 15 points and also 15 NDFs, I suspect they remove some points for the fit or in the chisq calculation.

Stefano
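
To compare the two ways of writing the NBD discussed here, the following is a minimal sketch in plain C++ (no ROOT). It assumes the (<n>, k) form from the macro and the standard (p, n) form P(x) = Γ(x+n)/(Γ(x+1)Γ(n)) p^n (1-p)^x. The function names and the helper `pFromMeanK` are hypothetical. Algebraically the two forms coincide for p = k/(k + <n>) and n = k.

```cpp
#include <cassert>
#include <cmath>

// NBD in the (mean, k) form used in the macro:
// P(x) = Gamma(x+k) / (Gamma(x+1) Gamma(k)) * (mean/k)^x / (1 + mean/k)^(x+k)
double nbdMeanK(double x, double mean, double k)
{
    assert(x >= 0.0 && mean > 0.0 && k > 0.0);
    const double r = mean / k;
    // Work with logarithms so large Gamma values cannot overflow.
    const double logP = std::lgamma(x + k) - std::lgamma(x + 1.0) - std::lgamma(k)
                      + x * std::log(r) - (x + k) * std::log1p(r);
    return std::exp(logP);
}

// NBD in the standard (p, n) form:
// P(x) = Gamma(x+n) / (Gamma(x+1) Gamma(n)) * p^n * (1-p)^x
double nbdStandard(double x, double p, double n)
{
    assert(x >= 0.0 && p > 0.0 && p < 1.0 && n > 0.0);
    const double logP = std::lgamma(x + n) - std::lgamma(x + 1.0) - std::lgamma(n)
                      + n * std::log(p) + x * std::log1p(-p);
    return std::exp(logP);
}

// Hypothetical helper: the p that makes the two forms agree when n = k.
double pFromMeanK(double mean, double k)
{
    return k / (k + mean);
}
```

As far as I know, ROOT::Math::negative_binomial_pdf(k, p, n) follows the same (p, n) convention, but that is worth checking in the ROOT reference. Under that assumption, the arguments matching the macro's parameters would be p = [1]/([1]+[2]) and n = [1].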

kaur · December 16, 2020, 5:17am

Thank you very much for debugging the program. But even with the ROOT-defined NBD, the chisq value remains nearly what we got earlier. So I should trust the value given by ROOT! Some referees do not accept such low values, and that is why I wanted to validate it. Maybe I should check with data from another experiment.
Thanks again and best regards
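
As an independent cross-check of the number ROOT reports, the chisq can be recomputed by hand from the posted data file and the fitted parameters. The sketch below is plain C++ (no ROOT) and assumes the NBD form from the macro, x = 0, 1, 2, ... for successive rows, and the ERY column as the only error. All names (`scaledNbd`, `chiSquare`, `degreesOfFreedom`, `kProb`, `kErr`) are illustrative.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// PROB and ERY columns of the posted data file (rows N = 0 ... 13).
const std::vector<double> kProb = {15.79, 22.55, 20.62, 15.96, 10.21, 6.07, 3.85,
                                   2.24,  1.15,  0.68,  0.49,  0.19,  0.07, 0.05};
const std::vector<double> kErr = {2.2412, 1.9840, 1.3226, 2.3300, 1.1227, 1.5689, 0.5459,
                                  0.6403, 0.4368, 0.3257, 0.2683, 0.2668, 0.1360, 0.0671};

// c * NBD(x; mean, k), evaluated through logarithms of the Gamma functions.
double scaledNbd(double x, double c, double k, double mean)
{
    const double r = mean / k;
    const double logP = std::lgamma(x + k) - std::lgamma(x + 1.0) - std::lgamma(k)
                      + x * std::log(r) - (x + k) * std::log1p(r);
    return c * std::exp(logP);
}

// Sum over points of ((data - model) / error)^2, with x equal to the row index.
double chiSquare(const std::vector<double>& y, const std::vector<double>& ey,
                 double c, double k, double mean)
{
    assert(y.size() == ey.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double pull = (y[i] - scaledNbd(static_cast<double>(i), c, k, mean)) / ey[i];
        chi2 += pull * pull;
    }
    return chi2;
}

// Number of degrees of freedom: fitted points minus free parameters.
int degreesOfFreedom(std::size_t nPoints, std::size_t nFreeParams)
{
    return static_cast<int>(nPoints) - static_cast<int>(nFreeParams);
}
```

Calling `chiSquare(kProb, kErr, 100.068, 3.22198, 2.45923)` with the parameters from the fit output gives about 0.53, in line with FCN=0.52839, and `degreesOfFreedom(kProb.size(), 3)` gives 11. So the low value is what this data file and these errors produce; it does not look like a minimiser problem.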

Dilicus · December 16, 2020, 6:47am

Hi,
Maybe the NBD used in the paper is slightly different from the one used in your macro. This can have an impact on the chisq, but I did not try this function.
As a last remark, removing the point at 0 gives a chisq more similar to the one in the paper.

Best regards

kaur · December 18, 2020, 11:56pm

Thanks a lot for your time and advice. The chisq values for some of the data sets agree well with the published values, but certainly not for many other sets. There are 16 data sets altogether. I implemented the ROOT-defined NBD; thanks for the suggestion. However, I also checked with the NBD function from the published paper. The results are the same.
So maybe the published data set is different from the one used for fitting!
Thanks and regards
