This is not really a question, but rather some observations I came across and think are worth sharing for the record, to spare others the same issues.
I need to fill a histogram with a 1/x distribution, say within the [1e3;1e10] range. Unfortunately, things do not seem to work as expected.
Here is my function:
TF1* f=new TF1("f","1./x",1e3,1e10);
Now I define three histograms in the same range, with 1000 bins of variable width so that they appear to have equal width on a log scale:
const int nbins = 1000;
float scale[nbins+1];   // bin edges, equally spaced in log10(x)
float xmin = 1e3;
float xmax = 1e10;
float step = (log10(xmax)-log10(xmin))/nbins;
scale[0] = xmin;
for (int i=1; i<nbins+1; i++) scale[i] = pow(10,log10(scale[i-1])+step);
TH1D* h1 = new TH1D("h1","h1",nbins,scale);
TH1D* h2 = new TH1D("h2","h2",nbins,scale);
TH1D* h3 = new TH1D("h3","h3",nbins,scale);
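As a side note on the binning itself, here is a minimal sketch of the same log-spaced edges in double precision, with each edge computed directly from its index instead of from the previous edge, so that rounding errors do not build up over 1000 steps. The helper name logBinEdges is hypothetical (it is not a ROOT function), and the block uses only the standard library.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper (not part of ROOT): returns nbins+1 bin edges that are
// equally spaced in log10(x) between xmin and xmax.
std::vector<double> logBinEdges(int nbins, double xmin, double xmax) {
    assert(nbins > 0 && xmin > 0.0 && xmax > xmin);
    const double logMin = std::log10(xmin);
    const double step = (std::log10(xmax) - logMin) / nbins;
    std::vector<double> edges(nbins + 1);
    for (int i = 0; i <= nbins; ++i) {
        // Computed from i directly, so errors do not accumulate edge to edge.
        edges[i] = std::pow(10.0, logMin + i * step);
    }
    // Pin the end points to the requested range exactly.
    edges[0] = xmin;
    edges[nbins] = xmax;
    return edges;
}
```

The resulting array could then be handed to the TH1D constructor that takes a pointer to double bin edges, e.g. via edges.data().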
Now let’s fill these histograms: first with the FillRandom method, then with GetRandom, and then with GetRandom again after setting the number of points of f to its maximum, that is 100000:
h1->FillRandom("f",1e7);
for (int i=0; i<1e7; i++) h2->Fill(f->GetRandom());
f->SetNpx(100000);
for (int i=0; i<1e7; i++) h3->Fill(f->GetRandom());
The distributions are shown in the first attached plot (randomissue.png). I know this issue is known (see for instance here), but can’t anything be done about it? It is not necessarily obvious when you fill histograms with a smaller number of bins, but it can significantly change the results…
Moreover, I read “TH1::Fillrandom evaluates the function in the center of the histogram bin, that is less precise than the method in TF1::GetRandom” (here), which suggests that neither of these methods is very precise… Is there any workaround?
Now, let’s say I want to check by comparing with a fixed bin width histogram. Again, two methods:
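On the workaround question: for a pure 1/x shape, one possible approach that avoids any tabulated function is inverse-transform sampling. The normalized cumulative distribution on [xmin, xmax] is F(x) = ln(x/xmin) / ln(xmax/xmin), so a uniform u in [0,1) maps to x = xmin * (xmax/xmin)^u. The sketch below uses only the standard library; the function names are hypothetical, and std::mt19937_64 is just a stand-in for whatever generator you actually use.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Hypothetical helper: maps a uniform u in [0,1) to a value x in [xmin,xmax)
// distributed with density proportional to 1/x (inverse of the CDF).
double inverseXFromUniform(double u, double xmin, double xmax) {
    assert(xmin > 0.0 && xmax > xmin);
    return xmin * std::pow(xmax / xmin, u);
}

// Draws one 1/x-distributed value from any standard-library random engine.
template <class Rng>
double sampleInverseX(Rng& rng, double xmin, double xmax) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    return inverseXFromUniform(uniform(rng), xmin, xmax);
}
```

In a ROOT macro, the uniform number could presumably come from gRandom->Uniform() and the result be passed straight to Fill(). This only works because 1/x has an analytic inverse CDF; it is not a general replacement for GetRandom.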
TH1D* hh1 = new TH1D("hh1","hh1",10000,1e3,1e10);
TH1D* hh2 = new TH1D("hh2","hh2",10000,1e3,1e10);
hh1->FillRandom("f",1e7);
for (int i=0; i<1e7; i++) hh2->Fill(f->GetRandom());
And again, two different results; see the second attachment (randomissue2.png).
Then, I fill a new histogram (fixed bin size) with the content of the first one (variable bin size). I consider only the one obtained with FillRandom, as “by eye” it seems to be the only usable one:
TH1D* hh3 = new TH1D("hh3","hh3",10000,1e3,1e10);
// Copy each variable-width bin of h1 into hh3 at its bin centre, weighted by its content
for (int i=0; i<1000; i++) hh3->Fill(h1->GetBinCenter(i+1),h1->GetBinContent(i+1));
According to this new histogram, it seems that the correct fixed-bin histogram is the one filled by GetRandom (see third plot, randomissue3.png)…
So it seems that GetRandom should be avoided when filling a histogram with variable bin widths, and FillRandom avoided when filling one with fixed bin widths. Note that this seems consistent with the fixed binning of TF1, when you think about it.
Any thoughts?
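To put a number on the fixed-binning remark, here is a rough sketch that computes how much of the 1/x probability falls into the first of npx equal-width cells on [xmin, xmax]. It assumes, as the remark above suggests, that the sampler tabulates the function on equal-width cells; it does not reproduce what TF1::GetRandom does inside a cell, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Probability that a 1/x-distributed variable on [xmin,xmax] falls in [a,b].
double inverseXProbability(double a, double b, double xmin, double xmax) {
    assert(xmin > 0.0 && xmax > xmin && a >= xmin && b <= xmax && b >= a);
    return std::log(b / a) / std::log(xmax / xmin);
}

// Fraction of the total probability contained in the first of npx
// equal-width cells (assumed tabulation scheme, see the text above).
double firstCellFraction(int npx, double xmin, double xmax) {
    assert(npx > 0);
    const double width = (xmax - xmin) / npx;
    return inverseXProbability(xmin, xmin + width, xmin, xmax);
}
```

For the range [1e3, 1e10], firstCellFraction(100, 1e3, 1e10) comes out at roughly 0.71, and even firstCellFraction(100000, 1e3, 1e10) is still about 0.29. In other words, with 100000 points the first cell alone spans about two decades, i.e. several hundred of the 1000 log-spaced bins, so whatever is done inside that single cell determines the shape seen there.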