Very large errors when using RooFit::AsymptoticError

elusian · October 17, 2022, 1:10pm

Hi all,

I’m trying to fit a weighted dataset with RooFit, using RooFit::AsymptoticError for the error correction.

The weights in the dataset are needed to model an efficiency (w = 1/eff, eff > 0), exactly one of the examples shown in the AsymptoticError paper.
Using the efficiency as part of the pdf would be ideal, but it’s far too slow.
However, this means that I can test the ideal case on a limited sample and compare.
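
For readers following along, here is a minimal sketch of the weighting described above, assuming only that each event carries an efficiency eff > 0. The function names are hypothetical and not part of RooFit. The second helper computes the usual effective sample size, (sum w)^2 / sum w^2, which gives a rough feeling for how much statistical power the weights cost.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Per-event weight for an efficiency correction: w = 1/eff.
// Only defined for a strictly positive efficiency.
double efficiencyWeight(double eff) {
    assert(eff > 0.0);
    return 1.0 / eff;
}

// Effective sample size (sum w)^2 / (sum w^2).
// Equals the number of events when all weights are identical.
double effectiveSampleSize(const std::vector<double>& weights) {
    double sumW = 0.0;
    double sumW2 = 0.0;
    for (double w : weights) {
        sumW += w;
        sumW2 += w * w;
    }
    return sumW2 > 0.0 ? (sumW * sumW) / sumW2 : 0.0;
}
```

For example, two events with weights 1 and 3 correspond to an effective sample size of 1.6 rather than 2.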

Comparing the results on the same limited subset, with the efficiency as a function in the pdf (baseline) and with the efficiency as a weight plus AsymptoticError, I see that while the fitted values are very close, the asymptotic errors are much larger than the baseline ones, by a factor of more than 300 for some parameters. No errors or warnings are printed while the error correction is computed.
SumW2Error kind of works, but as the paper says it can underestimate the error.
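
To make the difference between the error treatments concrete, here is a toy sketch for a one-parameter exponential pdf f(x) = lambda * exp(-lambda * x), where everything can be written in closed form. It assumes the sandwich forms as I understand them: the SumW2 variance is H^-1 C H^-1 with C the Hessian built from squared weights, and the asymptotic variance is H^-1 D H^-1 with D = sum w^2 (d ln f / d lambda)^2. The struct and function names are hypothetical, and this is not RooFit's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct WeightedExpFit {
    double lambda;         // weighted maximum-likelihood estimate
    double varNaive;       // H^-1, inverse Hessian of the weighted -log L
    double varSumW2;       // H^-1 C H^-1, C = Hessian built with w^2
    double varAsymptotic;  // H^-1 D H^-1, D = sum w^2 (d ln f / d lambda)^2
};

// Closed-form weighted fit of f(x) = lambda * exp(-lambda * x), x >= 0.
WeightedExpFit fitWeightedExponential(const std::vector<double>& x,
                                      const std::vector<double>& w) {
    assert(x.size() == w.size() && !x.empty());
    double sumW = 0.0, sumW2 = 0.0, sumWX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sumW += w[i];
        sumW2 += w[i] * w[i];
        sumWX += w[i] * x[i];
    }
    // Maximising sum w_i (ln lambda - lambda x_i) gives lambda = sum w / sum w x.
    const double lambda = sumW / sumWX;
    const double H = sumW / (lambda * lambda);   // second derivative of -log L
    const double C = sumW2 / (lambda * lambda);  // same, with squared weights
    double D = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double score = 1.0 / lambda - x[i];  // d ln f / d lambda
        D += w[i] * w[i] * score * score;
    }
    return {lambda, 1.0 / H, C / (H * H), D / (H * H)};
}
```

With unit weights the naive and SumW2 variances coincide by construction, while the asymptotic one depends on the observed spread of the scores, so on a small sample it can differ noticeably from the other two.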

Is this expected from the AsymptoticError method, or is it a bug?

Thank you in advance.

moneta · October 17, 2022, 9:30pm

Hi,
It could be a problem with AsymptoticError. In some cases there can be numerical issues in the current calculation. Could you post a workspace or macro that reproduces the problem?

Thank you,

Lorenzo

elusian · October 18, 2022, 11:15am

Here is a file containing the workspace, and a macro to use it with the right fit parameters:

// call as
// root -q -b workspace.root refit_asympterr.C -- fitWs
void refit_asympterr(RooWorkspace* ws) {
    using namespace RooFit;
    auto model = ws->pdf("fit_model_MCDG0_2017_jpsimu");
    auto data = ws->data("data_MCDG0_2017_jpsimu_weighted");

    // Weighted fit using the asymptotically correct error treatment
    auto fitRes = model->fitTo(*data,
        Range("range_jpsimu"),
        Save(true),             // return the fit result
        Offset(true),           // enable likelihood offsetting
        AsymptoticError(true),  // error correction for the weighted fit
        NumCPU(10)
    );
    fitRes->Print("V");
}

I’m running this from the dev4 LCG view.

Thank you,
Enrico

moneta · October 20, 2022, 10:43am

Hi,

Thank you for the file, I can reproduce your problem. I see a very large difference between the errors reported by SumW2Error and AsymptoticError for some parameters. Do the parameters with the large difference affect only the pdf normalisations, or are they shape parameters?
One thing I know about the asymptotic error is that it does not work correctly for the normalisation parameters of an extended fit, i.e. when the derivative of the pdf with respect to these parameters is much smaller than the derivative of the extended term with respect to them. In that case it is more reliable to use SumW2Error.
Another possible problem is parameter values close to a boundary, but that does not seem to be the case in your fit.
Otherwise you should use pseudo-experiments to estimate the uncertainties, but that will take a lot of CPU.
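
As an illustration of the pseudo-experiment approach, here is a toy sketch that does not use RooFit at all. It assumes an exponential decay with a made-up efficiency eff(x) = 0.5 + 0.5 * exp(-x), generates accepted events, fits each toy with weights 1/eff in closed form, and takes the spread of the fitted values as the uncertainty estimate. All names and the efficiency shape are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Hypothetical efficiency for this toy only; always in (0.5, 1] for x >= 0.
double toyEfficiency(double x) { return 0.5 + 0.5 * std::exp(-x); }

// One pseudo-experiment: generate nEvents accepted events and return the
// weighted maximum-likelihood estimate lambda = sum w / sum (w x), w = 1/eff.
double fitOneToy(double trueLambda, int nEvents, std::mt19937& rng) {
    std::exponential_distribution<double> decay(trueLambda);
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    double sumW = 0.0, sumWX = 0.0;
    int accepted = 0;
    while (accepted < nEvents) {
        const double x = decay(rng);
        const double eff = toyEfficiency(x);
        if (uniform(rng) >= eff) continue;  // event lost to the inefficiency
        sumW += 1.0 / eff;
        sumWX += x / eff;
        ++accepted;
    }
    return sumW / sumWX;
}

struct ToySummary {
    double mean;    // average fitted value over all toys
    double stdDev;  // sample standard deviation = toy-based uncertainty
};

ToySummary runToys(double trueLambda, int nToys, int nEvents, unsigned seed) {
    assert(trueLambda > 0.0 && nToys > 1 && nEvents > 0);
    std::mt19937 rng(seed);
    std::vector<double> estimates;
    estimates.reserve(nToys);
    double sum = 0.0;
    for (int t = 0; t < nToys; ++t) {
        estimates.push_back(fitOneToy(trueLambda, nEvents, rng));
        sum += estimates.back();
    }
    const double mean = sum / nToys;
    double sumSq = 0.0;
    for (double e : estimates) sumSq += (e - mean) * (e - mean);
    return {mean, std::sqrt(sumSq / (nToys - 1))};
}
```

Calling `runToys(1.0, 200, 2000, 12345u)` gives a toy-based uncertainty that can be compared with the error reported by a single fit. In a real analysis each toy would be a full fit of the actual model, which is where the CPU cost comes from.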

Best,

Lorenzo

elusian · October 21, 2022, 2:50pm

Hello, thank you for looking into this.

What do you mean exactly by normalisation parameters?
Of the worst parameters, A_0 and A_pe are amplitudes, i.e. coefficients in a RooRealSumPdf.
The others (gamma and dGam) are shape parameters (gamma = decay width).
Even for parameters where the factor is not > 100, we cannot afford factors much larger than 1…

The fit is not extended (yet, it will be in the future). Is the problematic behaviour expected anyway?

Do you have a link to a more in-depth explanation of when the asymptotic error is known to fail?
Is it a theoretical problem of the method or an issue with the RooFit implementation?
Could it actually be numerical issues, like you mentioned in the first message?

We are also going to try pseudo-experiments, even if, as you say, it will take a lot of CPU.

Best,
Enrico

moneta · October 21, 2022, 3:39pm

Hi,

The issue I am referring to arises when the fit is extended and a parameter affects only the extended term. In that case the asymptotic approximation breaks down, because it is like having a single measurement. But in your case, if the fit is not extended, it should be fine.
I have not seen any particular issue with the fit. I will run some more tests to check whether the issue is in the derivative calculations used by the asymptotic error.
It is an interesting case to study further, but it will take some time.

Best,

Lorenzo

elusian · November 2, 2022, 11:58am

Hello
Does it make sense to open an actual issue in the repository? Here in the forum the discussion will be locked.
Cheers,
Enrico

moneta · November 4, 2022, 10:46am

Hi,

There are already these two issues open for the asymptotic errors:

[ROOT-10827] Missing contribution of extended term in the error Correction for extended weighted likelihood fits - SFTJIRA
[ROOT-10866] Numerical instabilities when calculating the derivatives for the asymptotically correct erros - SFTJIRA

but I can open a new one that links the previous two and adds your example code.

Thanks

Lorenzo

elusian · November 4, 2022, 11:32am

Hello and thank you.

The old link to the workspace expired, so here’s a new one in case it is needed for debugging.
workspace.root

Cheers,
Enrico

system · November 18, 2022, 11:33am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.