Randomness Test of random numbers following the Landau distribution

federica · September 21, 2019, 4:15pm

Hi all,

I made a simple c++ code generating some random numbers following a Landau distribution. Now, I want to test the randomness of my generator, but I’m having some doubts about how to proceed.

I thought I might generate two samples, from two different generators created with two different seeds and then compare them with some tests, like the chi-square test or the Kolmogorov-Smirnov test: the randomness test should succeed if I get a p-value which is uniformly distributed between 0 and 1. Does it make sense?

Here’s my code with a few comments:

// c++ -o  es1 es1.cpp `root-config --cflags --glibs`

#include<iostream>
#include<fstream>
#include<cstring>
#include<math.h>
#include<cmath>
#include<TMath.h>
#include<TCanvas.h>
#include<TGraph.h>
#include<TH2.h>
#include<TF1.h>
#include<TApplication.h>
#include<TGraphErrors.h>
#include<TMultiGraph.h>
#include<TLegend.h>
#include<TStyle.h>
#include<random>
#include <algorithm>

using namespace std;

int main(int argc, char ** argv){

	gStyle -> SetOptFit(1111);

	TApplication *gr = new TApplication("grafica", 0, NULL);
 	TCanvas *c = new TCanvas("histo", "histo");

	//Range of the Landau distribution
	double x_min = -5;
	double x_max = 20;

	TH1F * h = new TH1F("histo", "", 100, 0., 1.);
	TH1F * firstHisto = new TH1F("first sample", "", 1000, x_min, x_max);
	TH1F * secondHisto = new TH1F("second sample", "", 1000, x_min, x_max);

	//random_service will be used to obtain a seed
	random_device rd;  

    	mt19937 gen(rd()); 
	mt19937 gen2(rd()+1); 

	//Random numbers between 0 and 1 
    	uniform_real_distribution<> dis(0., 1.);

	//Maximum of the Landau distribution, which is at  x = -0.22278 for the default parmeters 
        //(mu = 0 and c = 1)
	double func_max = TMath::Landau(-0.22278, 0, 1);

	vector<double> firstSample;
	vector<double> secondSample;

	//Number of random numbers
	int N = 1000000;

	for (int iRep = 0; iRep < 1000; ++iRep) { 

		//Generate the first sample
		for (int i=0; i < N; ++i) {

			//Random number between 0 and 1 
			double num = dis(gen);

			//Random number between x_min and x_max
			double var = x_min + num * (x_max - x_min);

			//Landau distribution for this random number
			double dist = TMath::Landau(var, 0, 1);

			//Random number between 0 and the function max
			double num2 = dis(gen)*func_max;

			//Hit-miss method
			if (num2 < dist) {

				firstSample.push_back(var);
				firstHisto -> Fill(var);
			} 
		}

		//Generate the second sample
		for (int j=0; j < N; ++j) {

			double num = dis(gen2);
			double var = x_min + num * (x_max - x_min);

			double dist = TMath::Landau(var, 0, 1);
			double num2 = dis(gen2)*func_max;

			if (num2 < dist) {

				secondSample.push_back(var);
				secondHisto -> Fill(var);
			} 
		}

		/*
		 | **************************
		 | Test di Kolmogorov-Smirnov
		 | **************************
		*/

                //The elements in the two vectors must be in ascendent order
		sort(firstSample.begin(), firstSample.end());
		sort(secondSample.begin(), secondSample.end());

		int m = firstSample.size();
		int n = secondSample.size();

		double firstVector[m];
		double secondVector[n];

		for (int iVec=0; iVec<m; iVec++) {
		    
			firstVector[iVec] = firstSample[iVec];
		}

		for (int jVec=0; jVec<n; jVec++) {

			secondVector[jVec] = secondSample[jVec];
		}

		//Make the test 
		double test = TMath::KolmogorovTest(m, firstVector, n, secondVector, "");

		//h -> Fill(test); 

		firstSample.clear();
		secondSample.clear();

		/*
		 | ***************
		 | Chi-square Test
		 | ***************
		*/

		double chiResult = firstHisto -> Chi2Test(secondHisto);
		h -> Fill(chiResult);

		firstHisto -> Reset();
		secondHisto -> Reset(); 
	}
	
    	h->Draw();
	
	gr -> Run();

	return 0;
}

Axel · September 23, 2019, 8:59am

That would test whether your generator reproduces the Landau distribution, but no whether the generator is a good one - what people call the “randomness” or “quality”. For the latter there is a famous, dedicated test suite, see https://webhome.phy.duke.edu/~rgb/General/dieharder.php

But maybe that’s not what you actually want to do and I’m misunderstanding which problem you try to solve?

federica · September 28, 2019, 10:18am

Hi Axel, thanks a lot for your reply.

I think the diehard tests are used to test randomness in uniform distributions. The problem here is that I need to test randomness for some numbers following the Landau distribution.

So I thought I might use two different generators, with two different seeds, in order to generate two separated distributions. Then I can compare them with the chi-square test or the KS test, and I expect to be able to conclude that the two datasets follow the same distribution.

I thought that might be a way to test my generator because, if everything is working properly, the distribution it generates should be independent from the seed.

I’m not comparing the generated numbers with the theoretical Landau distribution, but the two generated datasets among themselves.

Axel · September 28, 2019, 12:13pm

Then you’re testing how well they reproduce a distribution. That’s of course a totally reasonable goal, but not a measure of the quality of the PRNG. Think of a “random number generator” that produces numbers by scanning the Landau distribution from the lower to the upper end: you can predict the numbers, yet it reproduces the Landau distribution perfectly.