Non-positive-definite covariance matrix and anti-correlation in 2D fit

liebestod · August 7, 2021, 8:42am

Hi all,

I’m doing a 2D extended maximum likelihood fit to the D0 mass, using the joint distribution
M(K^+pi^-) vs M(K^-pi^+). In each dimension I use a double Gaussian for the
signal and a 3rd-order Chebychev polynomial for the background.
Most events contain a D and a Dbar, but an event can contain more than 2 D’s.
I fit every D-Dbar pair.
In addition, my model includes what I call a swap component (modelled as a Gaussian), which comes from a
recombined D whose daughter K track is misidentified as a pi and whose pi track is misidentified as a K.
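
A side note on the Chebychev background: a polynomial shape can go negative for some coefficient values, which makes the likelihood hard to evaluate. Below is a minimal standalone sketch (plain C++, no ROOT) that evaluates a 3rd-order Chebychev shape with the constant term fixed to 1, the convention RooChebychev documents, and scans it for negative values. The function names and the grid scan are hypothetical helpers for illustration, not part of my script, and x is assumed to be already mapped to [-1, 1].

```cpp
#include <array>
#include <cassert>

// Evaluate 1 + a1*T1(x) + a2*T2(x) + a3*T3(x), with x already mapped to [-1, 1].
// The coefficient of T0 is fixed to 1 (assumed convention, as in RooChebychev).
double chebychev3(double x, const std::array<double, 3>& a) {
    assert(x >= -1.0 && x <= 1.0);
    const double t1 = x;
    const double t2 = 2.0 * x * x - 1.0;
    const double t3 = 4.0 * x * x * x - 3.0 * x;
    return 1.0 + a[0] * t1 + a[1] * t2 + a[2] * t3;
}

// Hypothetical helper: scan a uniform grid and report whether the shape
// stays strictly positive everywhere on [-1, 1].
bool staysPositive(const std::array<double, 3>& a, int nSteps = 1000) {
    for (int i = 0; i <= nSteps; ++i) {
        const double x = -1.0 + 2.0 * i / nSteps;
        if (chebychev3(x, a) <= 0.0) return false;
    }
    return true;
}
```

A scan like this could be used to choose coefficient ranges for which the background shape is always positive inside the mass window.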

I’m mainly interested in the number of DDbar pairs (the signal D-signal Dbar component),
but the yields of the other components, the mean of the signal, and all the background shape parameters
are also determined by my fit.
The widths of the signal and the swap, on the other hand, are fixed from a separate fit to MC.
The swap yield is set equal to the signal yield, since there should be one misidentified
K pi pair for every correctly identified D0.

There are 4 (2x2) components in the 2D joint PDF without swaps, and 9 (3x3) components with swaps. The
number of pairs that are signal in both dimensions is labeled Nss (number of signal-signal), and so on for the other components.
In summary, I have at least 11 parameters in my fitter
(4 numbers of events in the extended PDF, 6 background shape parameters, 1 signal mean).

A working example is here:

github.com

boundino/ddbarCorr/blob/2dfit/test/mccorr.cc

#include "RooAddPdf.h"
#include "RooFitResult.h"
#include "RooCategory.h"
#include "RooChebychev.h"
#include "RooConstVar.h"
#include "RooDataSet.h"
#include "RooGaussian.h"
#include "RooMCStudy.h"
#include "RooPlot.h"
#include "RooProdPdf.h"
#include "RooRandomizeParamMCSModule.h"
#include "RooRealVar.h"
#include "TCanvas.h"
#include "TDirectory.h"
#include "TH1.h"
#include "TString.h"
#include "TTree.h"
#include "TFile.h"
#include "TROOT.h"


In the script, an ensemble of 200 toy MC samples is generated by default. Most fits
finish without issues, but some produce RooFit errors.
The error messages fall roughly into 2 categories.

The covariance matrix is forced positive-definite:

 RooFitResult: minimized FCN value: -66389.3, estimated distance to minimum: 0.000133435
 covariance matrix quality: Full matrix, but forced positive-definite
 Status : MINIMIZE=-1 HESSE=4

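This is usually accompanied by strong correlations between some parameters. In the example below,
a2 (a background shape parameter), nbb (number of background-background) and nsb (number of signal-background)
are strongly correlated with one another: a2 vs nbb is -0.937, a2 vs nsb is 0.964, and nbb vs nsb is -0.969.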

  name          NO.  GLOBAL      1      2      3      4      5      6      7      8      9     10     11
  a1           1  0.51253   1.000 -0.255  0.320 -0.001  0.004 -0.001  0.011  0.246  0.072 -0.253 -0.087
  a2           2  0.97544  -0.255  1.000  0.337  0.003 -0.016 -0.004  0.020 -0.937 -0.273  0.964  0.332
  a3           3  0.56394   0.320  0.337  1.000  0.001 -0.007  0.005 -0.060 -0.344 -0.106  0.353  0.129
  b1           4  0.48608  -0.001  0.003  0.001  1.000 -0.118  0.471  0.009  0.008 -0.052  0.004 -0.003
  b2           5  0.40104   0.004 -0.016 -0.007 -0.118  1.000 -0.069  0.011 -0.040  0.260 -0.021  0.023
  b3           6  0.48887  -0.001 -0.004  0.005  0.471 -0.069  1.000 -0.090 -0.002  0.025 -0.005  0.008
  mean         7  0.16943   0.011  0.020 -0.060  0.009  0.011 -0.090  1.000 -0.031  0.061  0.032 -0.083
  nbb          8  0.97878   0.246 -0.937 -0.344  0.008 -0.040 -0.002 -0.031  1.000  0.079 -0.969 -0.195
  nbs          9  0.82844   0.072 -0.273 -0.106 -0.052  0.260  0.025  0.061  0.079  1.000 -0.176 -0.700
  nsb         10  0.98754  -0.253  0.964  0.353  0.004 -0.021 -0.005  0.032 -0.969 -0.176  1.000  0.191
  nss         11  0.82314  -0.087  0.332  0.129 -0.003  0.023  0.008 -0.083 -0.195 -0.700  0.191  1.000
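
For reference, "positive-definite" can be checked numerically with a Cholesky factorization: the factorization succeeds only if every pivot is strictly positive. The sketch below is a standalone illustration in plain C++ (no ROOT); the `Matrix` alias and the function name are hypothetical, and this is not necessarily the test Minuit applies internally.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Attempt a Cholesky factorization m = L * L^T of a symmetric matrix.
// Returns true only if all pivots are strictly positive, i.e. m is positive-definite.
bool isPositiveDefinite(const Matrix& m) {
    const std::size_t n = m.size();
    Matrix l(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        assert(m[i].size() == n);  // the matrix must be square
        for (std::size_t j = 0; j <= i; ++j) {
            double sum = m[i][j];
            for (std::size_t k = 0; k < j; ++k) sum -= l[i][k] * l[j][k];
            if (i == j) {
                if (sum <= 0.0) return false;  // non-positive pivot
                l[i][i] = std::sqrt(sum);
            } else {
                l[i][j] = sum / l[j][j];
            }
        }
    }
    return true;
}
```

Applied to the (a2, nbb, nsb) block of the table above, the check passes, as expected for a matrix that has already been forced positive-definite; a matrix with a negative diagonal element fails at the first pivot.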

The covariance matrix is accurate:

covariance matrix quality: Full, accurate covariance
Status : MINIMIZE=-1 HESSE=4

And the global correlations seem normal.

My script includes a switch =fix_sig= that decides whether to also float the signal shape parameters, which increases the
number of parameters to 17 (without swap) or 19 (with swap).
The switch =use_swap= determines whether to include the swap component.

Is there a better way to formulate the model? And are there other issues in my code that are causing problems?

I’m using ROOT v6-24-02 with c++ (GCC) 11.1.0 on Linux

liebestod · August 7, 2021, 8:44am

I originally posted this on CMS hypernews.
Andrew kindly provided some quick feedback, which I will reply to here:

It looks like some parameters are hitting your predefined boundaries in the fits - you should try to avoid this if possible, it can give minuit problems converging and calculating an accurate covariance matrix

I widened the boundaries of my parameters, which sometimes forces me to accept negative numbers. Is this fine?

There might be numerical instability issues - the FCN values are quite large (~60k), and perhaps the initial parameter ranges are not always well-matched to the order of magnitude of the fit uncertainties. You can try running a second fit after the first one, offsetting the NLL value to improve stability, e.g.:

result = model.fitTo(*sum, Extended(), Save(), Minimizer("Minuit2", "Migrad"));
result = model.fitTo(*sum, Extended(), Save(), Minimizer("Minuit2", "Migrad"), Offset(true));
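
One way to see why an offset could help is floating-point resolution: doubles are spaced more coarsely around |FCN| ~ 66389 than around values near zero, so small changes in the likelihood are resolved less precisely. The sketch below is plain C++ and only illustrates that numerical motivation; it is not a description of what RooFit does internally, and `spacingAbove` is a hypothetical helper name.

```cpp
#include <cmath>
#include <limits>

// Distance from v to the next representable double above it,
// i.e. the local resolution of double precision at v.
double spacingAbove(double v) {
    return std::nextafter(v, std::numeric_limits<double>::infinity()) - v;
}
```

Comparing `spacingAbove(66389.3)` with `spacingAbove(0.001)` shows that the resolution near the un-offset FCN magnitude is many orders of magnitude coarser than near zero.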

I got a new error code from Minuit. This also seems to be related to negative diagonal elements.

Error in <Minuit2>: VariableMetricBuilder Initial matrix not pos.def.
Warning in <Minuit2>: Minuit2Minimizer::Minimize Minimization did NOT converge, Covar is not pos def
Minuit2Minimizer : Invalid Minimum - status = 5
FVAL  = -10.6278
Edm   = -10.6278
Nfcn  = 683
Info in <Minuit2>: Minuit2Minimizer::Hesse Using max-calls 5500
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 0 ] = -1.17993e+15
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 1 ] = -1.85594e+15
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 2 ] = -2.11808e+28
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 3 ] = -2.44567e+14
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 4 ] = -3.85127e+14
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 5 ] = -4.39522e+27
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 6 ] = -4.39522e+27
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 7 ] = -2.50534e+13
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 8 ] = -1.08536e+13
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 9 ] = -1.05392e+13
Warning in <Minuit2>: MnPosDef non-positive diagonal element in covariance matrix[ 10 ] = -6.23868e+15
Warning in <Minuit2>: MnPosDef Added to diagonal of Error matrix a value 2.11808e+28
Warning in <Minuit2>: MnPosDef Matrix forced pos-def by adding to diagonal nan

Right now you are suppressing the minimization warnings, but I see a lot of errors like (*) when I put them back. Minuit usually seems to recover from these problematic regions, but this again suggests the initial parameter values and ranges are not ideal

Thank you for pointing this out. Is there a way to decide the ideal parameter range?
I have a switch =cheat= in my code that uses the values for generating the toy samples as the initial values for fitting.
This should rule out a problem with the initial values.

On top of that it does seem in some toys that you have flat directions in the fit. I see correlation coefficients between (nsb,sigfracx) of > 99% in some cases. You might want to analyse the full correlation matrix in each toy (when it is calculated accurately) and see if there is a pattern in which parameter pairs end up highly (anti-)correlated. You might then want to try and reformulate the model to remove the redundancy in the degrees of freedom
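
The per-toy scan suggested above could be sketched as follows. This is a standalone illustration in plain C++ (no ROOT): the `Matrix` alias, the function name, and the threshold are hypothetical choices, and the matrix is assumed to have been copied out of the fit result beforehand.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Return the parameter-name pairs whose |correlation| exceeds the threshold.
// Only the upper triangle is scanned, so each pair is reported once.
std::vector<std::pair<std::string, std::string>> stronglyCorrelatedPairs(
    const std::vector<std::string>& names, const Matrix& corr, double threshold) {
    assert(names.size() == corr.size());
    std::vector<std::pair<std::string, std::string>> pairs;
    for (std::size_t i = 0; i < corr.size(); ++i) {
        assert(corr[i].size() == corr.size());  // the matrix must be square
        for (std::size_t j = i + 1; j < corr.size(); ++j) {
            if (std::fabs(corr[i][j]) > threshold) {
                pairs.emplace_back(names[i], names[j]);
            }
        }
    }
    return pairs;
}
```

With the a2, nbb, nsb entries from the correlation table in the first post and a threshold of 0.95, this reports (a2, nsb) and (nbb, nsb). Counting how often each pair appears over all toys would show whether the same parameters are always involved. In RooFit, the input could presumably be read from RooFitResult::correlationMatrix().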

(*)
[#0] WARNING:Minization -- RooMinimizerFcn: Minimized function has error status.
Returning maximum FCN so far (-54142.9) to force MIGRAD to back out of this region. Error log follows
Parameter values: a1=-0.200129, a2=0.0970513, a3=-0.0420635, b1=-0.436936, b2=-0.0162005, b3=-0.0925964, mean=1.86529, nbb=5585.52, nbs=202.088, nsb=245.344, nss=-47.611, sigfracx=0.0594494, sigfracy=0.220428, sigma1x=0.15, sigma1y=0.0324473, sigma2x=0.00645703, sigma2y=0.000109591
RooNLLVar::nll_model_sum[ paramSet=(a1,a2,a3,b1,b2,b3,mean,nbb,nbs,nsb,nss,sigfracx,sigfracy,sigma1x,sigma1y,sigma2x,sigma2y) ]
function value is NAN @ paramSet=(a1 = -0.200129 +/- 0.0261637,a2 = 0.0970513 +/- 0.134341,a3 = -0.0420635 +/- 0.0212804,b1 = -0.436936 +/- 0.0255798,b2 = -0.0162005 +/- 0.0240544,b3 = -0.0925964 +/- 0.0209629,mean = 1.86529 +/- 9.62291e-06,nbb = 5585.52 +/- 128.878,nbs = 202.088 +/- 37.9756,nsb = 245.344 +/- 99.9022,nss = -47.611 +/- 9.258,sigfracx = 0.0594494 +/- 0.138445,sigfracy = 0.220428 +/- 0.148026,sigma1x = 0.15 +/- 0.0170582,sigma1y = 0.0324473 +/- 0.0165096,sigma2x = 0.00645703 +/- 0.00116096,sigma2y = 0.000109591 +/- 8.02409e-06)
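
One plausible mechanism for the NaN in the log above is the negative yield (nss = -47.611) combined with a narrow signal width: near the peak the summed density can become negative, and its logarithm is then undefined. The sketch below is a 1D reduction in plain C++ (no ROOT), not my actual 2D model. The flat background, the mass window [1.7, 2.0], and the function names are assumptions for illustration; the numerical values are taken from the log.

```cpp
#include <cmath>

// Gaussian density normalized on the full real line
// (truncation to the fit window is ignored in this sketch).
double gaussDensity(double x, double mean, double sigma) {
    const double kSqrt2Pi = 2.5066282746310002;
    const double z = (x - mean) / sigma;
    return std::exp(-0.5 * z * z) / (sigma * kSqrt2Pi);
}

// Log of the per-event term nSig*S(x) + nBkg*B(x) of an extended likelihood,
// with a flat background B on [lo, hi] (assumed shape).
double logExtendedDensity(double x, double nSig, double nBkg,
                          double mean, double sigma, double lo, double hi) {
    const double density = nSig * gaussDensity(x, mean, sigma) + nBkg / (hi - lo);
    return std::log(density);  // NaN whenever density is negative
}
```

With nSig = -47.611, nBkg = 202.088, mean = 1.86529 and sigma = 0.00645703, the log is NaN for an event at the peak, while it stays finite far from the peak or with a positive yield. If this is what happens in the fit, allowing negative yields is only safe while the total density remains positive for every event in the sample.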

moneta · August 10, 2021, 3:38pm

Hi,
From the log file it seems the fit is problematic and is certainly not reaching a good minimum, since you have huge negative values in the Hessian.
This is probably due to some problem in the model definition. It could be a numerical problem, such as very unbalanced parameters, but most likely the fit is falling into regions where your model is problematic to evaluate.
I would need a working example to try to reproduce this, but I will not have time to look into it for another week to 10 days.

Cheers

Lorenzo
