# TFractionFitter Uncertainty Treatment?

Dear All,

I am using TFractionFitter to estimate the fractions of different processes contributing to a specific decay, and I have a few questions about how uncertainties on the input histograms are treated.

1. When I run 1000 pseudo-experiments, adding Poisson noise to the fit templates and the data, the resulting pulls have a consistently narrow distribution (a width significantly less than the expected unity). This would indicate to me that the uncertainties are overestimated. In the extreme case where the main contribution (>95%) is from the signal I want to fit and the remaining small background fractions are fixed, the overestimation factor is as large as 5 (meaning a pull sigma of 0.2). Is this expected or known behavior? (I know that my second example is not realistic, since there is only one free fraction, but I would expect correct uncertainties nevertheless.)

2. There seems to be a bias depending on the upper and lower limits of the histograms. According to the documentation, the only assumptions are Poisson statistics in the overall template counts and much lower counts in each individual bin. Is it correct to assume that empty bins therefore do not matter? Similarly, does it matter if there are templates with very low counts but whose fractions have been fixed?

3. From previous questions I see that the histogram entries are treated as raw counts. However, we have event-by-event weight factors to correct the MC. Is there a way to integrate those into the fit and its uncertainties?

Hi,

I think Lorenzo @moneta will be able to help…

Cheers, Bertrand.

Hi,

My understanding was that the errors are underestimated; see for example

https://arxiv.org/abs/0803.2711

We can try to better understand your findings (bias and under-estimation). Do you have some runnable code showing this?

Concerning your third question, bin-by-bin weights can be passed by using the TFractionFitter::SetWeight function.
It would be nice to also study the uncertainties in this case.

Best Regards

Lorenzo

Dear Lorenzo and Bertrand,

Thanks for your quick replies. Currently my fitting code is embedded in my more complex analysis code, but I will create a script and an input ROOT file that (hopefully) reproduce this effect and send them to you.

Best regards,

Anselm

That paper says they are over-estimated on page 8:

> Also in cases where the Monte Carlo contribution to the error cannot be neglected, TfractionFitter [sic] overestimates the error by an amount that depends on the value of $p_1$ […]

It is also over-estimated in case the MC fluctuations are negligible.

Anyways, was this ever fixed in TFractionFitter? The paper does give a way to correct it: use the covariance matrix to propagate the error of the normalized fraction $p_s = \hat{p}_s / \sum_{s'} \hat{p}_{s'}$.
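To make the suggested correction concrete, here is a minimal sketch of standard linear error propagation for the normalized fractions. It is not taken from the paper or from TFractionFitter; the function name `normalized_fractions` and all input numbers are made up for illustration, and in practice the raw parameters and their covariance matrix would come from the fit result.

```python
import math

def normalized_fractions(p_hat, cov):
    """Return p_s = p_hat_s / sum(p_hat) and the propagated covariance J V J^T.

    Hypothetical helper: p_hat are the raw fitted parameters and cov is
    their covariance matrix V, both supplied by the caller.
    """
    n = len(p_hat)
    total = sum(p_hat)
    p = [x / total for x in p_hat]
    # Jacobian of the normalization: d p_s / d p_hat_t = (delta_st - p_s) / total
    jac = [[((1.0 if s == t else 0.0) - p[s]) / total for t in range(n)]
           for s in range(n)]
    # Linear error propagation: cov_p = J V J^T
    cov_p = [[sum(jac[s][a] * cov[a][b] * jac[t][b]
                  for a in range(n) for b in range(n))
              for t in range(n)]
             for s in range(n)]
    return p, cov_p

# Illustrative inputs only (not from any real fit).
p_hat = [0.62, 0.41]
cov = [[0.0025, -0.0010],
       [-0.0010, 0.0016]]

p, cov_p = normalized_fractions(p_hat, cov)
errors = [math.sqrt(cov_p[s][s]) for s in range(len(p))]
print("normalized fractions:", p)
print("propagated errors:   ", errors)
```

Because the normalized fractions sum to one by construction, each row of the propagated covariance sums to zero, and with only two components both fractions end up with the same error.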

I played a bit with the histograms and got the width of the pull distribution to go up by excluding ranges with empty bins. However, now I get a bias in my result. I attached a script and a ROOT file with the histograms I used. The script runs the fit 2000 times, each time cloning the original histograms and adding Poisson noise. In the end, a histogram of the pulls is created and fitted with a Gaussian.

Anselm

fractionFitterTest.tar (40 KB)
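For readers without the attachment, the pull-study procedure can be sketched in plain Python. This is not the attached script and does not use ROOT or TFractionFitter: it swaps in a simple one-parameter binned likelihood fit with exact templates (no template fluctuations), fluctuates only the data, and summarizes the pulls by their sample mean and standard deviation rather than a Gaussian fit. The template shapes, event count, and all names are assumptions chosen for illustration.

```python
import math
import random
import statistics

# Hypothetical normalized template shapes over 10 bins (assumed, not from the thread).
_sig_raw = [math.exp(-0.5 * ((i - 4.5) / 1.5) ** 2) for i in range(10)]
_bkg_raw = [math.exp(-0.2 * i) for i in range(10)]
SIG = [x / sum(_sig_raw) for x in _sig_raw]
BKG = [x / sum(_bkg_raw) for x in _bkg_raw]

def poisson(rng, mu):
    """Draw one Poisson variate with Knuth's method (adequate for moderate mu)."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def fit_fraction(data, sig, bkg):
    """Maximize sum_i d_i * log(f*s_i + (1-f)*b_i) over f with Newton steps.

    Returns the fitted fraction and its error from the second derivative.
    """
    f = 0.5
    hess = -1.0
    for _ in range(50):
        grad = hess = 0.0
        for d, s, b in zip(data, sig, bkg):
            q = f * s + (1.0 - f) * b
            grad += d * (s - b) / q
            hess -= d * ((s - b) / q) ** 2
        step = grad / hess
        f = min(max(f - step, 1e-6), 1.0 - 1e-6)  # keep f inside (0, 1)
        if abs(step) < 1e-10:
            break
    return f, 1.0 / math.sqrt(-hess)

def pull_study(n_toys=500, n_events=1000.0, f_true=0.3, seed=1):
    """Run toy fits on Poisson-fluctuated data and return pull mean and width."""
    rng = random.Random(seed)
    expected = [n_events * (f_true * s + (1.0 - f_true) * b)
                for s, b in zip(SIG, BKG)]
    pulls = []
    for _ in range(n_toys):
        data = [poisson(rng, mu) for mu in expected]  # add Poisson noise
        f_hat, sigma = fit_fraction(data, SIG, BKG)
        pulls.append((f_hat - f_true) / sigma)
    return statistics.mean(pulls), statistics.stdev(pulls)

pull_mean, pull_width = pull_study()
print("pull mean:", pull_mean, "pull width:", pull_width)
```

In this idealized setup the estimated errors are consistent with the scatter of the fitted values, so the pull width should come out close to one. A width far below one, as reported in this thread, then points at the error estimate rather than at the toy procedure itself.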

Another couple of questions about the use of the TFractionFitter class:

1. When I use MC templates that are a factor of 5 larger than the data I fit, is it expected that the templates I get from GetMCPrediction still contain this factor of 5, i.e. add up to the sum of the MC data, not of the data input?

2. I seem to have the problem that the fraction I get from comparing ratios of integrals of the templates returned by GetMCPrediction is not consistent with the fraction I get from GetResult. Is there any reason you can think of why this can happen? In particular, does the use of Sumw2 have any impact on the fit results? I assume that only the bin contents count, but I want to check with you.