Problem with the Kolmogorov-Smirnov probability distribution

Hi

I have read in the ROOT documentation that, for unbinned data, the returned KS probability is uniformly distributed between 0 and 1. I have written a small macro to test this, but the probability distribution I get is not uniform at all.
Am I doing something wrong?

Thanks

Berta

#include <iostream>
#include <algorithm>

#include "TF1.h"
#include "TH1F.h"
#include "TCanvas.h"
#include "TMath.h"

using namespace std;

void RootKSTest() {

   const Int_t n = 1000;

   TF1* func = new TF1("func", "exp(-0.5*((x)/2)^2)", -4, 4);
   Double_t a[1000];
   Double_t b[1000];
   Double_t ks_distance;
   Double_t ks_probability;
   TH1F* kSP = new TH1F("kSP", "KS probability distribution", 50, 0, 1);
   TH1F* kSD = new TH1F("kSD", "KS distance distribution", 50, 0, 0.1);

   // Reference sample a[]: generated once and kept fixed for all pseudo-experiments
   for (int i = 0; i < n; i++) {
      a[i] = func->GetRandom();
   }
   sort(a, a + n);

   // Pseudo-experiments: compare the fixed a[] against a freshly generated b[]
   for (int j = 0; j < 10000; j++) {

      for (int i = 0; i < n; i++) {
         b[i] = func->GetRandom();
      }
      sort(b, b + n);

      ks_distance    = TMath::KolmogorovTest(n, a, n, b, "M"); // "M": maximum KS distance
      ks_probability = TMath::KolmogorovTest(n, a, n, b, "");  // default: KS probability
      kSD->Fill(ks_distance);
      kSP->Fill(ks_probability);
   }

   TCanvas* c1 = new TCanvas();
   c1->Divide(1, 2);
   c1->cd(1);
   kSD->Draw();
   c1->cd(2);
   kSP->Draw();

}

Hello,

in your example you keep the a[] data fixed and vary only b[]. By doing this you introduce a bias, and therefore the probabilities you obtain are not uniformly distributed between 0 and 1.
If you also re-generate the a[] data for every j, you will get a uniform probability, as sketched below. However, bear in mind that the method used to determine the probability is rather poor for p-values larger than 0.1, so you will still see deviations from uniformity larger than expected.
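
As an illustration, a minimal rewrite of the pseudo-experiment loop in your macro could look like this (same n, func, kSD and kSP as in your code, just regenerating both samples each iteration):

   // Re-generate BOTH samples in every pseudo-experiment, so each KS comparison
   // uses two independent samples drawn from the same parent TF1.
   for (int j = 0; j < 10000; j++) {

      for (int i = 0; i < n; i++) {
         a[i] = func->GetRandom();
         b[i] = func->GetRandom();
      }
      sort(a, a + n);   // TMath::KolmogorovTest expects sorted input arrays
      sort(b, b + n);

      kSD->Fill(TMath::KolmogorovTest(n, a, n, b, "M")); // maximum KS distance
      kSP->Fill(TMath::KolmogorovTest(n, a, n, b, ""));  // KS probability
   }

With this change the entries of kSP should be approximately flat between 0 and 1, apart from the deviations at large p mentioned above.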

Best Regards

Lorenzo