Propagating Poisson errors for empty bins when doing a ratio

pdudero · October 12, 2015, 8:42pm

Greetings,

I would imagine that this use case comes up quite often:
You are plotting data against MC, and you want to take a ratio. The data is usually integer, and for low event counts (e.g. in the high tails), the data takes on asymmetric Poisson errors. Even empty bins can have a nonzero error bar in this case. The MC, however, is weighted (non-integer) and typically has sufficient statistics for its errors to be approximated as normal, i.e., sqrt(sum(wt^2)), and there should be no empty MC bins where there is data. Fractions of a count, maybe, but not empty.
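To make the two kinds of error concrete, here is a minimal sketch in plain Python (no ROOT needed). It assumes the usual central 68.27% "Garwood" interval for an integer count, which as far as I know is what the kPoisson option is based on, and solves for the limits by bisection. The function names are made up for illustration.

```python
import math

CL = 0.6827  # assumed central "1 sigma" coverage


def poisson_cdf(n, mu):
    """P(X <= n) for X ~ Poisson(mu)."""
    term = math.exp(-mu)
    total = term
    for k in range(1, n + 1):
        term *= mu / k
        total += term
    return total


def _mu_for_cdf(k, target):
    """Find mu with poisson_cdf(k, mu) == target (the CDF decreases with mu)."""
    lo, hi = 0.0, k + 10.0 * math.sqrt(k + 1.0) + 10.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(k, mid) > target:
            lo = mid  # mu too small
        else:
            hi = mid  # mu too large
    return 0.5 * (lo + hi)


def poisson_interval(n, cl=CL):
    """Central interval (lower, upper) on the Poisson mean for an observed count n."""
    alpha = 1.0 - cl
    lower = 0.0 if n == 0 else _mu_for_cdf(n - 1, 1.0 - alpha / 2.0)
    upper = _mu_for_cdf(n, alpha / 2.0)
    return lower, upper


def weighted_error(weights):
    """Normal-approximation error of a weighted bin: sqrt(sum(wt^2))."""
    return math.sqrt(sum(w * w for w in weights))


if __name__ == "__main__":
    for n in (0, 1, 5):
        lo, up = poisson_interval(n)
        print(n, "err_low =", n - lo, "err_up =", up - n)
```

For an empty bin this gives a lower limit of 0 and an upper limit of about 1.84, which is the nonzero error bar on a zero count mentioned above.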

My review committee wants me to display error bars in the ratio plot where there were empty bins in the data (e.g., a marker at 0 with a nonzero-length error bar). However, I don’t believe ROOT supports this automatically; I would be happy to be proven wrong. The first thing the ROOT documentation tells you to do is to invoke Sumw2() before the Divide() if you plan on using the errors. At least with drawing, Sumw2() inhibits the display of Poisson error bars even with the kPoisson error option set.

Assuming uncorrelated errors, the formula for the error on a ratio is
d(x/y) = (x/y) * sqrt( sqr( dx/x ) + sqr( dy/y ) )   (for x != 0)
= sqrt( sqr( dx/y ) + sqr( x*dy/y^2 ) )
= dx/y for x=0
…where dx is the Poisson error of a count of zero in data, and y is the value of the MC in that same bin.
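As a quick numerical check of the formula, here is a small sketch in plain Python. It uses the second form, which stays finite at x=0. The MC values y and dy are invented for illustration, and the upper Poisson error on a zero count is taken as -ln(alpha/2) with alpha = 1 - 0.6827 (an assumption about the coverage level).

```python
import math


def ratio_error(x, dx, y, dy):
    """Uncorrelated error propagation for x/y, written so that x = 0 is allowed."""
    if y == 0:
        raise ValueError("denominator bin is empty; the ratio is undefined")
    return math.sqrt((dx / y) ** 2 + (x * dy / y ** 2) ** 2)


# Upper 68.27% Poisson error on an observed count of zero (about 1.84).
dx_up = -math.log((1.0 - 0.6827) / 2.0)

# Hypothetical MC bin: content 2.5 with error 0.4 (made-up numbers).
y, dy = 2.5, 0.4

err = ratio_error(0.0, dx_up, y, dy)
print("upper error on the ratio for an empty data bin:", err)
print("dx/y:", dx_up / y)
```

The two printed numbers agree, since the x*dy/y^2 term vanishes for an empty data bin and the MC error drops out entirely.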
Is this a correct analysis? Ought this to be implemented?
-Phil D.

moneta · October 13, 2015, 9:05am

Hi,

The formula you write is correct if you have normal errors for both data and Monte Carlo. Since the data have Poisson errors (the MC can probably be assumed to have normal errors), I think it is more complicated.
I would need to think about what the best formula for that error is.
You can get the normal approximation you have written in ROOT with TH1::Divide, but you would need to set the errors on the data histogram yourself using SetBinError to be the Poisson error.

Best Regards

Lorenzo

pdudero · October 13, 2015, 8:02pm

Thanks Lorenzo.

We are also referring the question to our own statistics committee.

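Regarding what’s already in ROOT, you said: “you would need to set the errors on the data histogram yourself using SetBinError to be the Poisson error.”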

However, the data histogram ALREADY has the Poisson error option set. The problem is that when you invoke TH1::Divide(const TH1 *h1), one of the first checks performed is whether h1 has Sumw2 set. In my case it does. Then Sumw2() is invoked on “this”, which fills the error array with the existing bin contents, essentially causing the error option to be ignored.

This means, as far as I can tell, that not only are the empty bins not treated properly, but NONE of the bins of the ratio are treated properly, as far as propagating Poisson errors is concerned.

moneta · October 14, 2015, 1:19pm

Hi,

Yes, you are right: since you have a histogram with Sumw2() (with weights), the Poisson error option is ignored. In this case you would have to set the bin errors you consider appropriate for the histograms yourself (using SetBinError).
However, using error propagation is just a crude approximation, which I don’t think is correct in this case of low statistics.
I would estimate the error from the Poisson ratio, which is available in ROOT in TGraphAsymErrors::Divide with the option “pois”.
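For illustration only, here is one standard way to get an interval on the ratio of two Poisson means: condition on the total n1+n2, so that n1 is binomial with p = r/(1+r), take a Clopper-Pearson interval on p, and map back with r = p/(1-p). I believe the “pois” option follows this idea, but the sketch below is plain Python, not ROOT's implementation, and it only covers two unweighted integer counts (so it does not handle weighted MC). The function names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _p_for_cdf(k, n, target):
    """Find p with binom_cdf(k, n, p) == target (the CDF decreases with p)."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if binom_cdf(k, n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def clopper_pearson(k, n, cl):
    """Central Clopper-Pearson interval on p for k successes out of n."""
    alpha = 1.0 - cl
    lower = 0.0 if k == 0 else _p_for_cdf(k - 1, n, 1.0 - alpha / 2.0)
    upper = 1.0 if k == n else _p_for_cdf(k, n, alpha / 2.0)
    return lower, upper


def poisson_ratio_interval(n1, n2, cl=0.6827):
    """Interval on mu1/mu2 for independent Poisson counts n1 and n2."""
    n = n1 + n2
    if n == 0:
        raise ValueError("both counts are zero; no information on the ratio")
    p_lo, p_hi = clopper_pearson(n1, n, cl)

    def to_ratio(p):
        return math.inf if p >= 1.0 else p / (1.0 - p)

    return to_ratio(p_lo), to_ratio(p_hi)


if __name__ == "__main__":
    # Empty numerator bin against a denominator count of 5.
    print(poisson_ratio_interval(0, 5))
```

Note that neither count has to be a subset of the other here, and the resulting ratio can exceed 1. An empty numerator bin still gets a nonzero upper limit.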

Lorenzo

rebassoo · November 26, 2015, 2:36am

Has there been any progress made, or are there plans, to implement the Poisson error for ratio plots?
We are having exactly the same issue in our CMS analysis, where we have been asked to place error bars in the data/MC ratio plot for bins with 0 data entries.

I am also a bit confused about the suggestion to use the TGraphAsymErrors::Divide option. From what I understand this function assumes that the entries in “pass” are a subset of those in “total”. However, in a ratio plot with data/MC, one is not a subset of the other and this ratio can be larger than 1. So it doesn’t seem to me that this function could be used.

Thanks,
Finn

pdudero · October 26, 2016, 8:40pm

This talk (last December)
indico.cern.ch/event/458475/con … amming.pdf
reported that “pois” and “pois midp” options were to appear for testing in ROOT 6.06. Has this capability been sufficiently vetted now?

Thanks,
-Phil D.