SumW2Error gives unexpected results with unweighted datasets, BatchMode

Continuing the discussion from Printed error differs from stored error when using SumW2Error:

I understand from @moneta that the stored error includes the correction from SumW2Error while the printed one does not. Because of the delayed reply, I am unable to follow up there, so I present four follow-up questions here:

  1. I think the updated error calculations should be printed. At the very least, the fact that the stored and printed errors are expected to differ should be documented here so users do not have to track down this forum post.

  2. Why does SumW2Error have an effect on unweighted datasets? Shouldn’t it agree with the default errors in this case?

  3. Why are the differences worse when using binned data?

  4. Why are the differences worse when using BatchMode?

Thanks for your continued help.

Hi,

  1. I agree this should probably be documented better. However, it is important to understand what is happening. When SumW2Error is enabled, a first minimization of the weighted likelihood is performed, followed by a second one where the likelihood is minimized using the square of the weights. This is done to obtain the matrix C, which is used to correct the covariance matrix, as documented here.
  2. It should not make a difference, because in that case C = V, where C is the covariance matrix obtained from the weight-squared likelihood and V is the one obtained from the weighted likelihood. If all the weights are 1, C must clearly be equal to V.
    However, if the fit is unstable, calling Hesse again can give a different matrix. This is something to check for this particular problem.
  3. It should not. I suspect it is caused by the instability of that fit.
  4. In BatchMode too, if the fit is not stable you might get a different result: the computation is performed differently, which changes the result within the numerical error.
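
The claim in point 2 can be checked on a toy problem. The sketch below is not RooFit code: it assumes a one-parameter Gaussian-mean fit with unit width, uses a central finite difference as a stand-in for Hesse, and applies the correction V' = V C^-1 V to the resulting variances. All function names are hypothetical.

```python
def weighted_nll(mu, xs, ws, power=1):
    """Gaussian NLL (unit width, constants dropped) with weights raised to `power`."""
    return sum((w ** power) * 0.5 * (x - mu) ** 2 for x, w in zip(xs, ws))


def second_derivative(f, x0, h=1e-3):
    """Central finite difference, standing in for Hesse in this toy."""
    return (f(x0 + h) - 2.0 * f(x0) + f(x0 - h)) / h ** 2


def variances(xs, ws):
    """Return (V, V') for the fitted mean: naive and SumW2-corrected variance."""
    mu_hat = sum(w * x for x, w in zip(xs, ws)) / sum(ws)
    # V comes from the weighted likelihood, C from the weight-squared likelihood.
    v = 1.0 / second_derivative(lambda m: weighted_nll(m, xs, ws, 1), mu_hat)
    c = 1.0 / second_derivative(lambda m: weighted_nll(m, xs, ws, 2), mu_hat)
    return v, v * (1.0 / c) * v


xs = [0.3, -1.2, 0.8, 2.1, -0.4]
v_unit, v_unit_corr = variances(xs, [1.0] * len(xs))
v_two, v_two_corr = variances(xs, [2.0] * len(xs))
print("unit weights:   V =", v_unit, " V' =", v_unit_corr)
print("all weights 2:  V =", v_two, " V' =", v_two_corr)
```

With unit weights C equals V, so the correction leaves the variance unchanged. With every weight set to 2, the naive variance is halved while the corrected one returns to the unit-weight value, which is the behaviour the correction is meant to provide.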

Cheers

Lorenzo

Hi, @moneta,

Thank you for the prompt responses. The diagnosis that these inconsistencies are caused by instability in the fit makes sense, but the reproducer I shared in the original post is just a Gaussian + linear Chebychev, fit to itself. If this fit is unstable, it is difficult for me to imagine what fit is stable, but perhaps this is getting off topic.

The biggest takeaway seems to be that BatchMode does not play nice with binned fits and SumW2 errors.

Thanks again.

Hi,

From the posted example I observe very small differences in the error, at the level of a few % of the error value. This is within the requested tolerance of the minimization (~10^-4), so I really don’t see the problem.

When using BatchMode I don’t observe larger differences either, again using the example from your post.

“The biggest takeaway seems to be that BatchMode does not play nice with binned fits and SumW2 errors.”

This is a strong, general statement that I don’t think is correct. BatchMode is still an experimental mode, and we cannot exclude that there are issues with some types of models, but this is not a general problem for binned fits or SumW2 errors.

Lorenzo

Hi, @moneta,

I am surprised to hear you don’t observe a large difference in a binned fit using SumW2Errors and BatchMode. Here is a more concise reproducer:

import ROOT as r

ws = r.RooWorkspace("workspace")
x = ws.factory("x[-10, 10]")
sig = ws.factory("Gaussian::sig(x, mu[-1, 1], s[0.1, 5])")
bkg = ws.factory("Chebychev::bkg(x, {c1[0.1, -1, 1]})")
shp = ws.factory("SUM::shp(Nsig[0, 200] * sig, Nbkg[0, 200] * bkg)")
data = shp.generate(r.RooArgSet(x))
datahist = r.RooDataHist("datahist", "datahist", data.get(), data)
print("with BatchMode:")
resWith = shp.fitTo(
    datahist,
    r.RooFit.Extended(),
    r.RooFit.Save(),
    r.RooFit.SumW2Error(True),
    r.RooFit.Strategy(1),
    r.RooFit.BatchMode(True),
)
print("without BatchMode:")
resWithout = shp.fitTo(
    datahist,
    r.RooFit.Extended(),
    r.RooFit.Save(),
    r.RooFit.SumW2Error(True),
    r.RooFit.Strategy(1),
    r.RooFit.BatchMode(False),
)
resWith.Print()
resWithout.Print()

With BatchMode:

  RooFitResult: minimized FCN value: 1171.58, estimated distance to minimum: 267.825
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4048e+01 +/-  6.78e+01
                  Nsig    1.4599e+02 +/-  7.73e+01
                    c1    2.0586e-01 +/-  5.90e-01
                    mu    4.6270e-01 +/-  1.06e+00
                     s    3.1733e+00 +/-  1.40e+00

Without BatchMode:

  RooFitResult: minimized FCN value: -303.384, estimated distance to minimum: 2.03636e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4053e+01 +/-  2.23e+01
                  Nsig    1.4595e+02 +/-  2.43e+01
                    c1    2.0472e-01 +/-  3.44e-01
                    mu    4.6150e-01 +/-  4.05e-01
                     s    3.1738e+00 +/-  4.88e-01

The central values are quite similar, but the errors are very different (between roughly 1.7 and 3.2 times larger with BatchMode than without). Is this not the output you get with this reproducer?
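
To quantify the disagreement, the errors can be compared parameter by parameter. This is a minimal sketch that uses the values copied from the two printouts above; the helper name `error_ratios` is hypothetical.

```python
# Errors copied from the two RooFitResult printouts above.
errors_with_batch = {
    "Nbkg": 6.78e+01, "Nsig": 7.73e+01, "c1": 5.90e-01, "mu": 1.06e+00, "s": 1.40e+00,
}
errors_without_batch = {
    "Nbkg": 2.23e+01, "Nsig": 2.43e+01, "c1": 3.44e-01, "mu": 4.05e-01, "s": 4.88e-01,
}


def error_ratios(errs_a, errs_b):
    """Ratio errs_a / errs_b for every parameter present in errs_a."""
    return {name: errs_a[name] / errs_b[name] for name in errs_a}


ratios = error_ratios(errors_with_batch, errors_without_batch)
for name in sorted(ratios):
    print(f"{name:>5s}: {ratios[name]:.2f}")
```

With live fit results, the same dictionaries could presumably be filled by looping over `floatParsFinal()` and calling `GetName()` and `getError()` on each parameter.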

Hi,
I am getting this using current ROOT master:

With BatchMode (no SumW2Error), unbinned:

 RooFitResult: minimized FCN value: -303.213, estimated distance to minimum: 1.31123e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4111e+01 +/-  2.23e+01
                  Nsig    1.4590e+02 +/-  2.42e+01
                    c1    2.0842e-01 +/-  3.37e-01
                    mu    4.6319e-01 +/-  3.93e-01
                     s    3.1799e+00 +/-  4.89e-01

Without BatchMode (no SumW2Error), binned:

 RooFitResult: minimized FCN value: -288.37, estimated distance to minimum: 6.15754e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    1.0218e+02 +/-  1.97e+01
                  Nsig    9.7808e+01 +/-  1.96e+01
                    c1    1.4306e-01 +/-  2.00e-01
                    mu    3.5706e-01 +/-  4.62e-01
                     s    2.5477e+00 +/-  4.58e-01

Corrected:
The observed difference (Delta(NLL)=15) arises because one fit is binned and the other is unbinned.

The unbinned result without BatchMode is:

 RooFitResult: minimized FCN value: -303.213, estimated distance to minimum: 3.75962e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4105e+01 +/-  2.23e+01
                  Nsig    1.4593e+02 +/-  2.42e+01
                    c1    2.0955e-01 +/-  3.37e-01
                    mu    4.6425e-01 +/-  3.93e-01
                     s    3.1795e+00 +/-  4.89e-01

Which ROOT version are you using?

Lorenzo

Hi, @moneta,

I am using 6.24/00 from the conda distribution on macOS.

Hi,
I did not observe this difference with 6.24. Are you sure you are fitting the same data and starting from the same initial conditions (i.e. the same parameter values)?

Hi, @moneta,

Yes, I am running it exactly as shown in the reproducer in a fresh python session. Since this technically involves the “without BatchMode” fit starting with different initial values than “with BatchMode” (since the shape was already fit once), I also tried running each way in separate shells. The output is identical to that above.

It is very confusing that we should be getting different results for exactly the same code…
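
One way to rule out the starting point as a cause, without separate shells, is to snapshot the parameters before the first fit and restore them before each later fit. The sketch below assumes a RooWorkspace `ws`, a pdf and a dataset like those in the reproducer; the helper name `fit_each_from_same_start` and the snapshot name are hypothetical.

```python
def fit_each_from_same_start(ws, pdf, data, option_sets, snapshot="initial"):
    """Run one fit per option set, restoring the initial parameter values each time.

    `option_sets` is a list of sequences of fit options (for example the
    RooFit command arguments used in the reproducer). Each sequence should
    include Save() so that fitTo returns a fit result.
    """
    params = pdf.getParameters(data)
    # Store the current (pre-fit) parameter values under the snapshot name.
    ws.saveSnapshot(snapshot, params, True)
    results = []
    for options in option_sets:
        # Reset every parameter to its pre-fit value before fitting again.
        ws.loadSnapshot(snapshot)
        results.append(pdf.fitTo(data, *options))
    return results
```

It could then be called as `fit_each_from_same_start(ws, shp, datahist, [opts_with_batch, opts_without_batch])`, where the two option lists are the argument lists passed to `fitTo` in the reproducer.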

Hi,
As I said, the fit is simple but not very stable, given the small statistics. You can see above that the binned and unbinned results are extremely different!

I could reproduce some difference only when using SumW2Error(1) and Minuit as the minimizer. If I use Minuit2 I always see the same results. Can you try using Minuit2 as the minimizer? I would recommend using it as the default.

Lorenzo

Hi, @moneta,

Adding r.RooFit.Minimizer("Minuit2", "migrad") to the fitTo command gives very nearly the same results as above:

With BatchMode:

  RooFitResult: minimized FCN value: -303.383, estimated distance to minimum: 133.316
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4048e+01 +/-  6.76e+01
                  Nsig    1.4599e+02 +/-  7.71e+01
                    c1    2.0586e-01 +/-  5.89e-01
                    mu    4.6270e-01 +/-  1.05e+00
                     s    3.1733e+00 +/-  1.40e+00

Without BatchMode:

  RooFitResult: minimized FCN value: -303.384, estimated distance to minimum: 1.01821e-05
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    5.4053e+01 +/-  2.23e+01
                  Nsig    1.4595e+02 +/-  2.43e+01
                    c1    2.0472e-01 +/-  3.44e-01
                    mu    4.6150e-01 +/-  4.05e-01
                     s    3.1738e+00 +/-  4.88e-01

I do not think it is just statistics. Here it is with 100x more statistics:

import ROOT as r

ws = r.RooWorkspace("workspace")
x = ws.factory("x[-10, 10]")
sig = ws.factory("Gaussian::sig(x, mu[-1, 1], s[0.1, 5])")
bkg = ws.factory("Chebychev::bkg(x, {c1[0.1, -1, 1]})")
shp = ws.factory("SUM::shp(Nsig[0, 20000] * sig, Nbkg[0, 20000] * bkg)")
data = shp.generate(r.RooArgSet(x))
datahist = r.RooDataHist("datahist", "datahist", data.get(), data)
print("with BatchMode:")
resWith = shp.fitTo(
    datahist,
    r.RooFit.Extended(),
    r.RooFit.Save(),
    r.RooFit.SumW2Error(True),
    r.RooFit.Strategy(1),
    r.RooFit.BatchMode(True),
    r.RooFit.Minimizer("Minuit2", "migrad"),
)
print("without BatchMode:")
resWithout = shp.fitTo(
    datahist,
    r.RooFit.Extended(),
    r.RooFit.Save(),
    r.RooFit.SumW2Error(True),
    r.RooFit.Strategy(1),
    r.RooFit.BatchMode(False),
    r.RooFit.Minimizer("Minuit2", "migrad"),
)
resWith.Print()
resWithout.Print()

Which gives:
With BatchMode:

  RooFitResult: minimized FCN value: -121094, estimated distance to minimum: 4.95332e+11
                covariance matrix quality: Full matrix, but forced positive-definite
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    9.8942e+03 +/-  6.96e+04
                  Nsig    1.0106e+04 +/-  8.79e+04
                    c1    9.0743e-02 +/-  8.03e+00
                    mu   -6.5631e-02 +/-  2.70e+01
                     s    2.5349e+00 +/-  2.01e+01

Without BatchMode:

  RooFitResult: minimized FCN value: -121094, estimated distance to minimum: 2.56742e-06
                covariance matrix quality: Full, accurate covariance matrix
                Status : MINIMIZE=0 HESSE=0 HESSE=0 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  Nbkg    9.8941e+03 +/-  1.84e+02
                  Nsig    1.0106e+04 +/-  1.85e+02
                    c1    9.0777e-02 +/-  1.99e-02
                    mu   -6.5422e-02 +/-  4.05e-02
                     s    2.5349e+00 +/-  4.29e-02

Hi,
Yes, I can finally see the problem! When using Minuit, the second Hessian computation, which happens when doing the SumW2 correction, returns a wrong Hessian, giving totally wrong errors.
Since there are no weights (i.e. all weights are equal to 1), one would expect the Hessian to be the same.
This needs to be investigated further. It is strange that this happens only with Minuit, BatchMode and binned fits.
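
In the printouts above, the affected fits also report a very large estimated distance to minimum (267.825, 133.316 and 4.95332e+11, against about 1e-05 or smaller for the others), and one of them has a covariance matrix that was forced positive-definite. A minimal sketch of an automatic check on a RooFitResult follows; the function name and both thresholds are assumptions chosen for illustration, not ROOT defaults.

```python
def hesse_looks_inconsistent(fit_result, edm_max=1e-3, min_cov_qual=3):
    """Return True if a fit result shows symptoms of a bad Hessian.

    `fit_result` is expected to provide edm() and covQual(), as RooFitResult
    does. covQual() == 3 corresponds to "Full, accurate covariance matrix".
    The default thresholds are illustrative assumptions.
    """
    edm = fit_result.edm()
    cov_qual = fit_result.covQual()
    return edm > edm_max or cov_qual < min_cov_qual
```

Applied to the reproducer, `hesse_looks_inconsistent(resWith)` would be expected to flag the BatchMode result, since its estimated distance to minimum is far above any reasonable threshold.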

Lorenzo


Hi,
I have opened a GitHub issue on this, see https://github.com/root-project/root/issues/9118

Lorenzo

