Hello all,
I am trying to use the Chi2Test function to compare two histograms. Before using it in production I experimented with the function, but I am stuck when I compare two histograms that both represent a Gaussian distribution.
I ran the following test:
TH2D FGHist("First guassian hist","First guassian hist",10,-2.0,2.0,10,-2.0,2.0) ;
TH2D SGHist("Second guassian hist","Second guassian hist",10,-2.0,2.0,10,-2.0,2.0) ;
double Error = 0.05 ;
bool res = false ;
FGHist.FillRandom("gaus",100000) ;
SGHist.FillRandom("gaus",100000) ;
res = CompareHisto(FGHist,SGHist,Error) ;
CompareHisto simply calls the Chi2Test function with the following parameters:
Histo1.Chi2Test(&Histo2,"UU P",pRes) ;
and I get the following result:
Chi2 = 97.601921, Prob = 0.520884, NDF = 99, igood = 0
I am very surprised: since both histograms are filled from the same distribution, I was expecting a p-value much closer to 1.
I get a similar result with the Kolmogorov test:
Kolmo Prob h1 = First guassian hist, sum1=100000
Kolmo Prob h2 = Second guassian hist, sum2=100000
Kolmo Probabil = 0.507416, Max Dist = 0.00368
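To make clear what I think the chi-square comparison above is computing, here is a minimal sketch in plain C++ without ROOT. It rests on my assumption that, for two histograms of raw counts, the statistic is (1/(N*M)) * sum_i (M*n_i - N*m_i)^2 / (n_i + m_i), with NDF equal to the number of non-empty bins minus one. The helpers chi2Unweighted and fillGaussian2D are hypothetical names of mine, and the filling is only a stand-in for FillRandom (independent standard normal x and y, retried until they fall inside the range), not the same generator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Chi-square statistic for two histograms of raw (unweighted) counts with
// identical binning. Bins that are empty in both histograms are skipped.
// ASSUMPTION: this is my reading of the "UU" case, not ROOT's actual code.
double chi2Unweighted(const std::vector<double>& n,
                      const std::vector<double>& m, int& ndf)
{
    assert(n.size() == m.size());
    double N = 0.0, M = 0.0;  // total number of entries in each histogram
    for (std::size_t i = 0; i < n.size(); ++i) { N += n[i]; M += m[i]; }
    assert(N > 0.0 && M > 0.0);

    double sum = 0.0;
    ndf = -1;  // one degree of freedom is lost to the normalisation
    for (std::size_t i = 0; i < n.size(); ++i) {
        const double tot = n[i] + m[i];
        if (tot <= 0.0) continue;  // nothing to compare in this bin
        const double d = M * n[i] - N * m[i];
        sum += d * d / tot;
        ++ndf;
    }
    return sum / (N * M);
}

// Hypothetical stand-in for FillRandom: fills a flattened nbins x nbins
// histogram on [lo,hi) x [lo,hi) with `entries` in-range (x, y) pairs drawn
// from independent standard normal distributions.
std::vector<double> fillGaussian2D(int nbins, double lo, double hi,
                                   int entries, unsigned seed)
{
    std::mt19937 gen(seed);
    std::normal_distribution<double> gaus(0.0, 1.0);
    std::vector<double> h(static_cast<std::size_t>(nbins) * nbins, 0.0);
    const double width = (hi - lo) / nbins;
    int filled = 0;
    while (filled < entries) {
        const double x = gaus(gen);
        const double y = gaus(gen);
        if (x < lo || x >= hi || y < lo || y >= hi) continue;  // retry
        // Clamp to guard against rounding at the upper edge.
        const int ix = std::min(static_cast<int>((x - lo) / width), nbins - 1);
        const int iy = std::min(static_cast<int>((y - lo) / width), nbins - 1);
        h[static_cast<std::size_t>(ix) * nbins + iy] += 1.0;
        ++filled;
    }
    return h;
}
```

Calling fillGaussian2D(10, -2.0, 2.0, 100000, seed) twice with different seeds and passing both results to chi2Unweighted mirrors the test above: 100 bins, so NDF = 99 as long as no bin is empty in both histograms.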
I also tried to set the bin content errors (as in the example at https://root.cern.ch/root/html/tutorials/math/chi2test.C.html ) but I got the same result.
So, one of my coworkers told me to normalize the histograms by the number of events:
TH2D FGHist("First guassian hist","First guassian hist",10,-2.0,2.0,10,-2.0,2.0) ;
TH2D SGHist("Second guassian hist","Second guassian hist",10,-2.0,2.0,10,-2.0,2.0) ;
double Error = 0.05 ;
bool res = false ;
FGHist.FillRandom("gaus",100000) ;
SGHist.FillRandom("gaus",100000) ;
FGHist.Scale(double(1)/double(100000));
SGHist.Scale(double(1)/double(100000));
for (unsigned int i = 0 ; i < FGHist.GetNbinsX() ; i++) {
    for (unsigned int j = 0 ; j < SGHist.GetNbinsY() ; j++) {
        FGHist.SetBinError(i,j,sqrt(FGHist.GetBinContent(i,j))) ;
        SGHist.SetBinError(i,j,sqrt(SGHist.GetBinContent(i,j))) ;
    }
}
res = CompareHisto(FGHist,SGHist,Error) ;
Using this method I got the result I expected for two histograms with the same distribution.
Then I tested this method on two histograms with different distributions, but Chi2Test tells me that my histograms are still similar. For example:
TH2D FGHist("guassian hist","guassian hist",10,-1.0,1.0,10,-1.0,1.0) ;
TH2D SGHist("uniform hist","uniform hist",10,-1.0,1.0,10,-1.0,1.0) ;
double Error = 0.05 ;
bool res = false ;
FGHist.FillRandom("gaus",10000) ;
SGHist.FillRandom("pol0",10000) ;
FGHist.Scale(double(1)/double(10000));
SGHist.Scale(1.0/SGHist.Integral());
for (unsigned int i = 0 ; i < FGHist.GetNbinsX() ; i++) {
    for (unsigned int j = 0 ; j < SGHist.GetNbinsY() ; j++) {
        FGHist.SetBinError(i,j,sqrt(FGHist.GetBinContent(i,j))) ;
        SGHist.SetBinError(i,j,sqrt(SGHist.GetBinContent(i,j))) ;
    }
}
res = CompareHisto(FGHist,SGHist,Error) ;
I get the following output:
Chi2 = 0.021500, Prob = 1, NDF = 99, igood = 3
I think this is really strange because the two histograms have different distributions.
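One thing I wonder is whether the scaling itself changes the statistic. Below is a minimal sketch in plain C++ (no ROOT) of how that could be checked. It assumes the raw-count formula (1/(N*M)) * sum_i (M*n_i - N*m_i)^2 / (n_i + m_i), which is my guess at what the "UU" option evaluates; chi2Counts and scaled are hypothetical helpers, and the bin contents in the usage note are made-up toy numbers, not taken from my histograms.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Chi-square statistic for two histograms with identical binning, treating
// the bin contents as raw counts.
// ASSUMPTION: my guess at the "UU" formula, not ROOT's implementation.
double chi2Counts(const std::vector<double>& n, const std::vector<double>& m)
{
    assert(n.size() == m.size());
    double N = 0.0, M = 0.0;  // sums of bin contents
    for (std::size_t i = 0; i < n.size(); ++i) { N += n[i]; M += m[i]; }
    assert(N > 0.0 && M > 0.0);

    double sum = 0.0;
    for (std::size_t i = 0; i < n.size(); ++i) {
        const double tot = n[i] + m[i];
        if (tot <= 0.0) continue;  // skip bins empty in both histograms
        const double d = M * n[i] - N * m[i];
        sum += d * d / tot;
    }
    return sum / (N * M);
}

// Returns a copy of h with every bin multiplied by `factor`, which is what
// I understand Scale(factor) to do to the bin contents.
std::vector<double> scaled(std::vector<double> h, double factor)
{
    for (double& bin : h) bin *= factor;
    return h;
}
```

For example, with toy contents n = {120, 95, 110, 75} and m = {100, 100, 100, 100}, comparing chi2Counts(n, m) with chi2Counts(scaled(n, 1.0/400.0), scaled(m, 1.0/400.0)) shows the second value smaller by a factor of 400. In this formula a common factor c applied to both histograms multiplies the statistic by c, so dividing by 10000 would shrink it by the same factor. I do not know whether Chi2Test behaves the same way internally.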
Could you please help me understand what is wrong with my test?
Thank you very much for your help.
Regards,
clemr