I’m trying to fit a weighted dataset with RooFit, using RooFit::AsymptoticError for the error correction.
The weights in the dataset are needed to model an efficiency (w = 1/eff, eff > 0), exactly one of the examples shown in the AsymptoticError paper.
Using the efficiency as part of the pdf would be ideal, but it’s far too slow.
However, this means that I can test the ideal case on a limited sample and compare.
I compared the results on the same limited subset, with the efficiency as a function in the pdf (baseline) versus the efficiency as a weight plus AsymptoticError. While the fitted values are very close, the asymptotic errors are much larger than the baseline ones, by a factor of >300 for some parameters. No errors or warnings are printed while computing the error correction. SumW2Error kind of works, but as the paper says, it can underestimate the error.
Is this expected from the AsymptoticError method? Or is it a bug?
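For reference, the three error estimates being compared can be written out by hand for a one-parameter toy. The sketch below is not the RooFit implementation. It assumes an exponential pdf, a hypothetical efficiency shape `efficiency(t)`, and the sandwich form V*D*V (D built from squared weights times squared per-event score) as my reading of the asymptotic correction. All function names are made up for illustration.

```python
import math
import random


def efficiency(t):
    # Hypothetical acceptance shape, always in (0, 1]; not taken from the thread.
    return 0.3 + 0.7 * math.exp(-t)


def generate_sample(n_generated, lam_true, rng):
    """Draw exponential decay times, apply the efficiency, weight by 1/eff."""
    times, weights = [], []
    for _ in range(n_generated):
        t = rng.expovariate(lam_true)
        if rng.random() < efficiency(t):
            times.append(t)
            weights.append(1.0 / efficiency(t))
    return times, weights


def weighted_exponential_errors(times, weights):
    """Return (lam_hat, naive, sumw2, asymptotic) errors for p(t) = lam*exp(-lam*t)."""
    sw = sum(weights)
    swt = sum(w * t for w, t in zip(weights, times))
    # Closed-form maximum of sum_i w_i * (log(lam) - lam * t_i)
    lam = sw / swt

    # Hessian of the weighted negative log-likelihood: sum_i w_i / lam^2
    hess = sw / lam**2
    v = 1.0 / hess  # uncorrected inverse-Hessian variance

    # SumW2-style correction: V * C^-1 * V, where C^-1 is the Hessian built from w_i^2
    hess_w2 = sum(w * w for w in weights) / lam**2
    var_sumw2 = v * hess_w2 * v

    # Sandwich (assumed form of the asymptotic correction): V * D * V,
    # with D = sum_i w_i^2 * (d log p_i / d lam)^2 and d log p / d lam = 1/lam - t
    d = sum((w * (1.0 / lam - t)) ** 2 for w, t in zip(weights, times))
    var_asym = v * d * v

    return lam, math.sqrt(v), math.sqrt(var_sumw2), math.sqrt(var_asym)


rng = random.Random(12345)
times, weights = generate_sample(20000, 1.0, rng)
lam_hat, err_naive, err_sumw2, err_asym = weighted_exponential_errors(times, weights)
print(lam_hat, err_naive, err_sumw2, err_asym)
```

A toy like this only helps to check that the formulas behave sensibly in a simple case; it says nothing about numerical derivatives in a multi-parameter RooRealSumPdf fit.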
Hi,
It could be a problem with AsymptoticError. In some cases there can be numerical issues in the current calculation. Could you post a workspace/macro that reproduces the problem?
Thank you for the file, I can reproduce your problem. I see a very large difference between the errors reported by SumW2Error and AsymptoticError for some parameters. Do the parameters with the large difference affect only the pdf normalisation, or are they shape parameters?
One thing I know about the asymptotic error is that it does not work correctly in extended fits for the normalisation parameters, i.e. when the derivative of the pdf with respect to these parameters is much smaller than the derivative of the extended term with respect to them. In that case it is more reliable to use SumW2Error.
Another possible problem is a parameter value close to its boundary, but that does not seem to be the case in your fit.
Otherwise you should use pseudo-experiments to estimate the uncertainties, but that will take a lot of CPU time.
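As a rough illustration of the pseudo-experiment approach, here is a minimal stdlib-only sketch: generate many toy datasets, refit each one, and take the spread of the fitted values as the uncertainty. It assumes a one-parameter exponential pdf with a closed-form weighted fit and a hypothetical efficiency shape, so it stands in for the real fit rather than reproducing it. `efficiency`, `fit_lambda` and `toy_study` are illustrative names.

```python
import math
import random
import statistics


def efficiency(t):
    # Hypothetical acceptance shape, always in (0, 1]; an assumption for this toy.
    return 0.3 + 0.7 * math.exp(-t)


def fit_lambda(n_generated, lam_true, rng):
    """Generate one toy dataset and return the weighted ML estimate of lambda."""
    sum_w = 0.0
    sum_wt = 0.0
    for _ in range(n_generated):
        t = rng.expovariate(lam_true)
        if rng.random() < efficiency(t):
            w = 1.0 / efficiency(t)  # weight = 1/eff, as in the original fit
            sum_w += w
            sum_wt += w * t
    # Closed-form maximum of sum_i w_i * (log(lam) - lam * t_i)
    return sum_w / sum_wt


def toy_study(n_toys, n_generated, lam_true, seed):
    """Return mean and standard deviation of the estimates over n_toys toys."""
    rng = random.Random(seed)
    estimates = [fit_lambda(n_generated, lam_true, rng) for _ in range(n_toys)]
    return statistics.mean(estimates), statistics.stdev(estimates)


mean_lam, spread_lam = toy_study(n_toys=200, n_generated=2000, lam_true=1.0, seed=1)
print(mean_lam, spread_lam)
```

The spread from the toys is the number to compare against the error reported by a single fit. In the real analysis each toy requires a full fit, which is where the CPU cost comes from.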
What do you mean exactly by normalisation parameters?
Of the worst parameters, A_0 and A_pe are amplitudes, coefficients in a RooRealSumPdf.
Others (gamma and dGam) are shape parameters (gamma = decay width).
Even for parameters where the factor is below 100, we cannot afford factors much larger than 1…
The fit is not extended (yet, it will be in the future). Is the problematic behaviour expected anyway?
Do you have a link to a more in-depth explanation of when the asymptotic error is known to fail?
Is it a theoretical problem of the method or an issue with the RooFit implementation?
Could it actually be numerical issues, like you mentioned in the first message?
We are also going to try pseudo-experiments, even if, as you say, it will take a lot of CPU.
The issue I am referring to arises when the fit is extended and a parameter affects only the extended term. In that case the asymptotic approximation breaks down, because it is like having a single measurement. But in your case, if the fit is not extended, it should be fine.
I have not seen any particular issue with the fit itself. I will run some more tests to check whether the issue is in the derivative calculations used by the asymptotic correction.
It is an interesting case to study further, but it will take some time.