Minimized NLL not corresponding to best chi2

Dear experts,

We have run into an issue of inaccuracies with the NLL minimum not corresponding to the best fit.

We are building a RooWorkspace using xmlAnaWSBuilder where we have a 5-parameter function fitting a histogram (10^9 entries, 2500 bins) that was generated by exactly this function, unfluctuated. We then perform the fit using quickFit, setting the initial parameters to the values that yield the optimal fit and if we fix them, a chi2 of (almost) 0 comes out. If we allow the fit parameters to float, the minimization of the NLL yields new fit parameters which indeed correspond to a lower NLL but offer a worse fit.

Our assumption is that we run into the limit of numerical accuracy of the NLL since the relative NLL difference between the best fit and true minimum is in the order of 10^-10. We have found that disabling the NLL offset in RooAbsPdf::createNLL() and switching the integrator via RooAbsReal::defaultIntegratorConfig()->method1D().setLabel("RooAdaptiveGaussKronrodIntegrator1D") improve the result, but still a difference remains.

Are there any other improvements that we can apply to get a better fit? I am attaching a plot of the residuals with the different settings in the fit and a tarball containing the workspace and a README with the commands used to build the workspace and fit it. We tested this both in ROOT 6.20 and 6.24.

Cheers,
Falk

residuals_J100_integrators
RooStatsNLL.tar (540 KB)

Hi @fbartels!

Thanks for the detailed post, I was able to reproduce the fit.

I also ran it with the ROOT master branch so see if the problem has maybe solved itself in the meantime, and indeed the agreement is much better (fixed parameters in blue and floating parameters in red).

For reference, the script I used to produce this plot after following the instructions in the README in the archive from the initial post:

import ROOT

ROOT.gROOT.SetBatch(True)

f1 = ROOT.TFile.Open("fixPars/PostFit.root")
f2 = ROOT.TFile.Open("floatingPars/PostFit.root")

c1 = ROOT.TCanvas()

h2 = f2.Get("residuals")
h2.SetLineColor(ROOT.kRed)
h2.Draw()
f1.Get("residuals").Draw("SAME")

c1.SaveAs("residuals.png")

I think with ROOT 6.26, you would also get the same result because the differences between 6.26 and master are small.

The reason for the fit being better is that the RooFit batchmode was completely re-implemented and the new implementation is seemingly less sensitive to floating point errors.

Is there a reason why you can’t use ROOT 6.26 for this fit?

Anyway the agreement is still not perfect, probably because of some remaining floating point related errors. It’s better to investigate even further to see how we can get perfect agreement, right?

Cheers,
Jonas

Hi @jonas ,

Thanks for your investigation! Did you use quickFit to perform the fit or somehting else?

I recompiled quickFit after lsetup "views LCG_102rc1_ATLAS_14 x86_64-centos7-gcc11-opt" which comes with Root 6.26/00 and RooFit 3.60. It throws a couple of warnings that RooMinuit has been replaced by RooMinimizer now, but it compiles. If I now set --useBatch 1 for quickFit, which adds BatchMode(1) to RooAbsPdf.createNLL(), the results change.

One additional parameter shows up as info from Minuit2 (the first one):

Info in <Minuit2>: MnSeedGenerator Negative G2 found - new state: 
  Minimum value : -2.173421348e+10
  Edm           : 121266.6657
  Internal parameters:
      1.542734729
     0.5235987756
     0.0316276393
     0.3459600492
     0.1197389193
    0.08797523891
  Internal gradient  :
                0
     -11152990.55
      1108.107355
      59610.54466
      -150652.755
      76293.44441

and the fit seems to abort with

MnHesse 2nd derivative zero for parameter _J100yStar06_obs_x_channel ; MnHesse fails and will return diagonal matrix

Fit results with BatchMode on:

                  nbkg    1.5000e+09 +/-  0.00e+00
                    p2    9.4867e-01 +/-  0.00e+00
                    p3    1.0173e+01 +/-  0.00e+00
                    p4    1.1945e+00 +/-  0.00e+00
                    p5    8.7862e-02 +/-  0.00e+00
                  chi2    0.03277961737166526

(It looks like the normalisation is not taken into account for RooAbsPdf.createChi2(datahist, RooFit.DataError(0))?

Fit results with BatchMode off:

                  nbkg    1.5193e+09 +/-  3.89e+04
                    p2    9.5105e-01 +/-  1.04e-02
                    p3    1.0173e+01 +/-  1.62e-03
                    p4    1.1945e+00 +/-  2.68e-04
                    p5    8.7849e-02 +/-  5.19e-05
                  chi2    0.5243252774369742

If you could let me know how you produced your good fit, that would be very helpful!

Cheers,
Falk

Anyone? @jonas ?

Hi! I will try the comparison later today myself to confirm that this is the problem, but I think the fit might be affected by a bug that got fixed the upcoming 6.26.04 patch release. Sorry for the delay in this investigation.

Cheers,
Jonas

OK, thank you!
Just wanted to make sure that this topic does not get closed from inactivity.

Hi @fbartels ,

in the meanwhile note that if @jonas is right you can test whether 6.26.04 fixes the problem by updating to the latest version or by trying it out in a conda environment, a docker container or similar. See also Installing ROOT - ROOT .

Cheers,
Enrico

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

Hi @fbartels,

sorry for the late reply!

I Have produced the results with quickFit, using my custom branch that compiles with ROOT master. It is based on the ROOT_v6.24_06 in the official quickFit repo. Then I just followed the instructions from the README in your tarball, executing the quickfit command once of the floating parameter input file and once for the fixed input file.

The problem with the spurious parameter in the BatchMode that you observed was fixed recently in this commit. It was backported to 6.26.06, so if you use the most recent 6.26 patch release it should be all good.

I didn’t see a --useBatch 1 branch for quickFit though, do you use a different quickfit branch? In my branch based on the ROOT_v6.24_06 branch, the BatchMode(true) option is hardcoded in the createNLL call. So my results are indeed using the BatchMode and look fine, as far as I can tell.

I hope this helps you to continue your investigation, let me know if you need further help!

Jonas

Hi @jonas ,

Thanks for the details! In your custom branch, do you use the setup_lxplus.sh that sources /afs/cern.ch/user/a/asciandr/public/ROOT_v6.24_06/build/bin/thisroot.sh (which does not exist anymore)? Or do you instead build the latest ROOT master and source that?

I’m using the development branch of the original quickFit repo, I think you are using a fork from that. There, the --useBatch 1 can be used to set BatchMode(true) which otherwise defaults to false.

Many thanks,
Falk

Very interesting!

I used now the development branch of quickFit; my ROOT_master branch is now based on that. Indeed I can reproduce your results from your initial post now:

But when I use this other branch I had (ROOT_v6.24_06_master, based on this quickfit fork, I get almost exactly the results from my initial post:

Both times I use the same build of ROOT master. So there was something going on in the recent quickFit develompent that made the fit converge slightly worse. The fir results are very similar however. Here with the branch based on development:

  RooFitResult: minimized FCN value: -2.17343e+10, estimated distance to minimum: 0.00284159
                covariance matrix quality: Full matrix, but forced positive-definite
                Status : MINIMIZE=-1 HESSE=3 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  nbkg    1.5193e+09 +/-  3.89e+04
                    p2    9.4966e-01 +/-  1.04e-02
                    p3    1.0173e+01 +/-  1.62e-03
                    p4    1.1945e+00 +/-  2.68e-04
                    p5    8.7861e-02 +/-  5.18e-05

Here with the older branch:

  RooFitResult: minimized FCN value: -2.17343e+10, estimated distance to minimum: 0.033769
                covariance matrix quality: Full matrix, but forced positive-definite
                Status : MINIMIZE=-1 HESSE=3 

    Floating Parameter    FinalValue +/-  Error   
  --------------------  --------------------------
                  nbkg    1.5193e+09 +/-  3.89e+04
                    p2    9.5105e-01 +/-  1.04e-02
                    p3    1.0173e+01 +/-  1.62e-03
                    p4    1.1945e+00 +/-  2.68e-04
                    p5    8.7849e-02 +/-  5.18e-05

Both times I used the BatchMode. Since the fit parameter results are so similar, could quickFit have changed the way the residuals are computed?

Maybe it’s better to keep quickfit out of the equation when investigating this model? You said you have some code to do this fit standalone, maybe you can share that if there are some further checks I can help with?

About the ROOT version I use: yes, I use ROOT master that I built myself, but you can also source it from the nightly builds.

Cheers,
Jonas

Hi @jonas,

Right, with the latest ROOT master the BatchMode issue is gone. The relevant difference between the two quickFit repositories seems to be that in the atlas-phys-exotics-dijetisr one there is a default value set for _samplingRelTol so by default bins are integrated while in the standard quickFit they are not.

I can reproduce both repositories behaviour by setting --samplingRelTol 1e-3 (blue) or --samplingRelTol -1 (disabled, red):

Regarding keeping quickFit out of the equation, this is the small script I’ve been using to fit the workspace:

void minimizeNLL(){

  RooMsgService::instance().getStream(1).removeTopic(RooFit::NumIntegration) ;
  RooMsgService::instance().getStream(1).removeTopic(RooFit::Fitting) ;
  RooMsgService::instance().getStream(1).removeTopic(RooFit::Minimization) ;
  RooMsgService::instance().getStream(1).removeTopic(RooFit::InputArguments) ;
  RooMsgService::instance().getStream(1).removeTopic(RooFit::Eval) ;
  RooMsgService::instance().setGlobalKillBelow(RooFit::ERROR);

  TFile *tf = TFile::Open("floatingPars/dijetTLA_combWS_swift.root");

  RooWorkspace *w = (RooWorkspace *)tf->Get("combWS");
  RooStats::ModelConfig *mc = (RooStats::ModelConfig *)w->obj("ModelConfig");
  RooDataSet *data = (RooDataSet *)w->data("combData");
  RooArgSet *obs = (RooArgSet *)mc->GetObservables();
  RooDataHist* dh = new RooDataHist("dh", "dh", *obs, *data);

  RooAbsReal::defaultIntegratorConfig()->method1D().setLabel("RooAdaptiveGaussKronrodIntegrator1D");
  RooAbsPdf* pdf = mc->GetPdf();
  RooAbsReal* nll = pdf->createNLL(*data,
                                   RooFit::BatchMode(false),
                                   RooFit::Constrain(*mc->GetNuisanceParameters()),
                                   RooFit::IntegrateBins(-1),
                                   RooFit::GlobalObservables(*mc->GetGlobalObservables()));
  RooAbsReal* chi2 = pdf->createChi2(*dh, RooFit::DataError(0));

  double nll_init = nll->getVal();
  double chi2_init = chi2->getVal();

  w->var("nsig_mean1000_width7")->setConstant();
  
  RooMinimizer _minim(*nll);
  _minim.setStrategy(2);
  _minim.setPrintLevel(2);
  _minim.setEps(1e-10);
  _minim.setOffsetting(0);
  _minim.optimizeConst(2);
  _minim.setMinimizerType("Minuit2");
  _minim.minimize("Minuit2");

  double nll_final = nll->getVal();
  double chi2_final = chi2->getVal();

  std::cout << "initial NLL: " << std::setprecision(14) << nll_init << std::endl;
  std::cout << "final NLL: " << std::setprecision(14) << nll_final << std::endl;

  std::cout << "initial chi2: " << chi2_init << std::endl;
  std::cout << "final chi2: " << chi2_final << std::endl;

  TFile *tout = TFile::Open("floatingPars/dijetTLA_combWS_swift_afterFit.root","RECREATE");
  w->Write("combWS");
}

The result can be plotted with the ExtractPostfitFromWS.py as well. It seems to perfectly reproduce the red fit from above. Unfortunately I do not manage to get it succeed with IntegrateBins enabled.

Cheers,
Falk

Just a bump to keep this from closing.

I didn’t manage to perform further tests yet, but I suspect using the integration does not in general produce more accurate results and this might just be random chance. I did check however that this behaviour seems to be driven by the highest-statistics bins at the low end of the spectrum. Restricting the fit to e.g. 10 bins there yields even stronger deviations than in the full range.

Cheers,
Falk