FillRandom and volatility

nocloud · April 16, 2010, 8:22am

Hi Rooters,
I have a question regarding FillRandom and volatility. When I use FillRandom to resample a histogram, i.e.

histrandom->FillRandom(input_hist,20000)

The resulting histogram (histrandom) is a lot less smooth than the input histogram (input_hist), i.e. it has a higher “volatility”.

If one is truly doing a random sampling, the “volatility” of the two distributions should be comparable.

Does anybody know why I am seeing this difference?

Eddy_Offermann · April 16, 2010, 1:47pm

The smoothness of your new distribution will depend on the number
of fills.
Please define “volatility”.

nocloud · April 16, 2010, 5:23pm

Volatility is probably not the correct term for what I am trying to describe.

Basically, if I have a histogram of 10,000 events and I use FillRandom to resample it 10,000 times, the new histogram is always a lot more bumpy than the original histogram and I can’t understand why this is the case.
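
A rough estimate of the size of the effect, under assumptions not stated in the thread: bin $i$ of the original histogram holds $n_i$ entries fluctuating around a true mean $\mu_i$ with variance of about $\mu_i$, the resampling draws $N$ entries from a histogram that also holds $N$ entries, and each bin holds a small fraction of the total. The law of total variance then gives, for the resampled content $n'_i$:

```latex
\begin{align}
  n'_i \mid n_i &\sim \mathrm{Binomial}\!\left(N, \tfrac{n_i}{N}\right), \\
  \mathrm{Var}(n'_i) &= \mathrm{E}\!\left[\mathrm{Var}(n'_i \mid n_i)\right]
                      + \mathrm{Var}\!\left(\mathrm{E}[n'_i \mid n_i]\right) \\
                     &= \mathrm{E}\!\left[n_i\left(1 - \tfrac{n_i}{N}\right)\right] + \mathrm{Var}(n_i)
                      \;\approx\; \mu_i + \mu_i = 2\mu_i .
\end{align}
```

So a bin with $\mu_i = 100$ would scatter by about $\pm 10$ in the original and about $\pm 14$ after resampling, since the resampled histogram carries both the original fluctuations and those of the second draw.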

Eddy_Offermann · April 16, 2010, 6:24pm

Was the original (first) histogram also created with
a random fill from the same distribution ?

nocloud · April 16, 2010, 9:42pm

no, the original histogram was created by filling a TH1F after application of selection cuts.

i should also clarify that when i said resample 10000 times in my last post, it means to do something like this:

histrandom->FillRandom(original_hist,10000)

Eddy_Offermann · April 17, 2010, 2:11pm

So, the second histogram has been created by pulling 10000
random events according to some distribution while the first
one is created by some other (hopefully random) process.
Why do you expect the statistical fluctuations to be the same?

Imagine creating a histogram with a Gaussian shape using
SetBinContent calls, so very smooth. Now create a second histogram
using FillRandom with that first one …
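
The thought experiment can be tried without ROOT. The following is a minimal standard-library sketch, not ROOT's implementation of FillRandom: the function names (makeSmoothGaussian, resample, roughness) and the roughness measure (sum of squared second differences) are assumptions made for illustration. It sets bin contents directly from a Gaussian shape, then draws entries with probability proportional to bin content.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Smooth template: bin contents set directly from a Gaussian shape
// (the analogue of SetBinContent), scaled so they sum to `total`.
std::vector<double> makeSmoothGaussian(std::size_t nbins, double total) {
    assert(nbins > 0);
    std::vector<double> bins(nbins);
    double sum = 0.0;
    for (std::size_t i = 0; i < nbins; ++i) {
        // Bin centres span [-3, 3] in units of sigma.
        double x = -3.0 + 6.0 * (static_cast<double>(i) + 0.5) / static_cast<double>(nbins);
        bins[i] = std::exp(-0.5 * x * x);
        sum += bins[i];
    }
    for (double& b : bins) b *= total / sum;
    return bins;
}

// Draw n entries, choosing each bin with probability proportional
// to its content in `source`.
std::vector<double> resample(const std::vector<double>& source, int n, unsigned seed) {
    std::mt19937 gen(seed);
    std::discrete_distribution<int> pick(source.begin(), source.end());
    std::vector<double> out(source.size(), 0.0);
    for (int i = 0; i < n; ++i) {
        out[static_cast<std::size_t>(pick(gen))] += 1.0;
    }
    return out;
}

// Hypothetical bumpiness measure: sum of squared second differences
// between neighbouring bins.
double roughness(const std::vector<double>& h) {
    double r = 0.0;
    for (std::size_t i = 1; i + 1 < h.size(); ++i) {
        double d = h[i - 1] - 2.0 * h[i] + h[i + 1];
        r += d * d;
    }
    return r;
}
```

Comparing roughness(makeSmoothGaussian(50, 10000.0)) with the roughness of a 10,000-entry resample of it should show the resampled histogram to be far bumpier, because every bin now carries counting fluctuations that the hand-set template never had.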

nocloud · April 17, 2010, 6:53pm

So, I performed the following test.

Instead of putting the 10,000 events of the original histogram into a TH1F, I put them into an array of 10,000 floats. Then, I picked an integer between 0 and 9,999 from a uniform distribution (all indices equally likely). The code in ROOT looks something like the following:

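TRandom3 randgen(time(0)); // seed from the current time
for (int ii = 0; ii < 10000; ii++)
{
  // Rndm() returns a value in (0,1), so the index runs from 0 to 9999
  int randomindex = (int)(randgen.Rndm()*10000);
  float variable = input_array.at(randomindex);
  random_hist->Fill(variable);
}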

Essentially, I am picking an integer between 0 and 9,999 and using it as an index to pull values from the 10,000-element array holding the input distribution. I do this 10,000 times to randomly resample the input distribution.

When I do my resampling this way, the random_hist that is generated better preserves the smoothness of the original distribution. Shouldn’t FillRandom effectively be trying to replicate the steps I have just performed manually?
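
For anyone who wants to repeat the manual test outside ROOT, here is a minimal standard-library sketch of index-based resampling with replacement. The names bootstrap and histogram, the bin range, and the choice to ignore under- and overflow are assumptions for illustration; none of this is taken from the attached macro.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Draw n values from `input` with replacement by picking uniformly
// distributed indices, as in the manual test described above.
std::vector<float> bootstrap(const std::vector<float>& input, std::size_t n, unsigned seed) {
    assert(!input.empty());
    std::mt19937 gen(seed);
    std::uniform_int_distribution<std::size_t> pick(0, input.size() - 1);
    std::vector<float> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        out.push_back(input[pick(gen)]);
    }
    return out;
}

// Fixed-width binning on [lo, hi); values outside the range are dropped.
std::vector<int> histogram(const std::vector<float>& values,
                           std::size_t nbins, float lo, float hi) {
    assert(nbins > 0 && hi > lo);
    std::vector<int> bins(nbins, 0);
    for (float v : values) {
        if (v < lo || v >= hi) continue;  // ignore under/overflow
        std::size_t b = static_cast<std::size_t>((v - lo) / (hi - lo) * static_cast<float>(nbins));
        if (b >= nbins) b = nbins - 1;    // guard against float rounding at the edge
        ++bins[b];
    }
    return bins;
}
```

Binning both the original values and bootstrap(values, values.size(), seed) with the same histogram settings gives two histograms that can be compared bin by bin, and repeating with several seeds shows how much of any difference is just statistical scatter.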

brun · April 18, 2010, 3:38pm

Could you post a xxx.root file containing your source histogram and a short script showing your problem?
Could you specify which version of ROOT?
In case you use something older than 5.26, please try with this version.

Rene

nocloud · April 28, 2010, 11:26am

i was busy with several projects last week so i never got the opportunity to look at this again until now.

when i wrote a macro to test out the various methods of resampling described in my above posts, it turned out that the behavior was pretty consistent from method to method. thus, there doesn’t seem to be a problem here. I’ve attached my code anyways in case it becomes useful for others.
rand4.txt (117 KB)
random_sample.C (2.31 KB)