TFractionFitter: data distribution and errors

fabrizio · March 17, 2011, 12:20pm

Dear all,
I’m trying to use TFractionFitter. I have doubts concerning the data distribution.
My data distribution is normalized to the luminosity (easy to solve).
Moreover, it is not a simple event count (dN/dx): it is already corrected for experimental factors (efficiency, several corrections…).
If I use this distribution (in which the bin contents vary from 4 down to 1e-6) I get reasonable fit fractions, but the errors are too large (always of the same order as the fractions). The errors on the data and MC distributions are rather small and cannot explain this.
If I scale the data and MC distributions by a large factor (1e7), to mimic bin contents that are plain event counts, the errors are reasonable and agree with what I expect.
Is there an explanation for that? Can I trust the results and the errors?

Reading the HMCMLL documentation, I understood that both data and MC should be plain counting distributions.
Is this also true for TFractionFitter? It seems so, because a way to supply weights for the MC shapes is foreseen, but it is not explicitly stated as a requirement.

Thanks for any help, Fabrizio

moneta · March 18, 2011, 1:46pm

Hi,

I think TFractionFitter, since it performs a likelihood fit to the data, assumes a Poisson probability for each bin.
If your bin contents are weighted, the likelihood will be slightly different. You can still use a Poisson pdf for each bin, but you need to assign a weight to it. The MLE result will be the same, but the errors will be different: they need to be corrected to take the weights into account.
I guess this feature could in principle be added to the code.
Yes, as a workaround for now, you can scale the histogram values so that they are approximately bin counts. The scaled count should be equal to the number of effective entries = (sum of weights)^2 / (sum of squared weights).
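
As a rough illustration of that scaling, here is a minimal Python sketch that does not use ROOT. The function names and the example numbers are made up for illustration, and applying one global factor to the whole histogram (so that the shape is preserved) is an assumption about how the rescaling is meant to be done. It takes the sum of squared weights in a bin to be the square of the bin error, which is what a weighted histogram normally stores. In ROOT itself, TH1::GetEffectiveEntries() should return the same effective-entries quantity for a histogram filled with weights.

```python
import math


def effective_entries(weights):
    """Number of effective entries: (sum of weights)^2 / (sum of squared weights)."""
    sum_w = sum(weights)
    sum_w2 = sum(w * w for w in weights)
    if sum_w2 == 0:
        return 0.0
    return sum_w * sum_w / sum_w2


def scale_to_effective_counts(contents, errors):
    """Scale weighted bin contents so that their total equals the effective entries.

    contents: bin contents of the weighted histogram (sum of weights per bin)
    errors:   bin errors (square root of the sum of squared weights per bin)
    Returns the global scale factor and the scaled bin contents.
    """
    sum_w = sum(contents)
    sum_w2 = sum(e * e for e in errors)
    n_eff = sum_w * sum_w / sum_w2
    scale = n_eff / sum_w  # equivalent to sum_w / sum_w2
    return scale, [c * scale for c in contents]


if __name__ == "__main__":
    # Four events, each with weight 0.5, behave like four unweighted events.
    print(effective_entries([0.5] * 4))

    # Hypothetical two-bin histogram with weighted contents and their errors.
    scale, scaled = scale_to_effective_counts([2.0, 1.0], [1.0, 0.5])
    print(scale, scaled)
```

The same factor would be applied to the data histogram only; whether the MC templates need the same treatment depends on how they were filled, which this sketch does not address.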

Best Regards

Lorenzo