Errors on template parameters

sarahwferg · April 27, 2019, 2:36am

Hello,

I am using two histogram PDFs as templates (R and F) to fit data. To check that the fit works, I first fit a handmade distribution (D = 0.8R + 0.2F) and look for the expected answers. The fit returns the correct parameter values of ~0.8 and ~0.2, but the errors are ~0.9 and ~0.6 respectively. I am trying to understand where these large errors come from.

TH1F *h = new TH1F("h","h",nbins_pix,pix_min,9.5);
h->Add(npix_real_norm,npix_fake_norm,0.8,0.2);
TCanvas *c = new TCanvas("c","c",1);
h->Draw();
c->Print("PLOTS/h.pdf");

// Observable and the pseudo-data histogram built above
RooRealVar npix("npix","npix",pix_min,9.5);
RooDataHist npix_data("npix_data","npix_data",npix,h);

// Coefficients of the two templates
RooRealVar f_real_pix("f_real_pix","f_real_pix",0.8,0,1);
RooRealVar f_fake_pix("f_fake_pix","f_fake_pix",0.2,0,1);

// Template PDFs (interpolation order 0)
RooDataHist pix_real_temp("pix_real_data","pix_real_data",npix,npix_real_norm);
RooHistPdf pix_Pdf_real("pix_Pdf_real","pix_Pdf_real",npix,pix_real_temp,0);
// Was also named "pix_real_data"; renamed to avoid two objects sharing a name
RooDataHist pix_fake_temp("pix_fake_data","pix_fake_data",npix,npix_fake_norm);
RooHistPdf pix_Pdf_fake("pix_Pdf_fake","pix_Pdf_fake",npix,pix_fake_temp,0);

// Sum of the two templates, fitted in a restricted range
RooAddPdf pix_Pdf("pix_Pdf","pix_Pdf",RooArgList(pix_Pdf_real,pix_Pdf_fake),RooArgList(f_real_pix,f_fake_pix));
// Save() is needed for fitTo to return a non-null RooFitResult
RooFitResult *pix_result = pix_Pdf.fitTo(npix_data,Range(0.5,6.5),Save());

// Plot the data, the full fit, and the fake component
RooPlot *pixframe = npix.frame();
npix_data.plotOn(pixframe);
pix_Pdf.plotOn(pixframe);
pix_Pdf.plotOn(pixframe,Components(pix_Pdf_fake),LineColor(kRed));

// Store fitted values and errors
rpix[a-1]=f_real_pix.getVal();
fpix[a-1]=f_fake_pix.getVal();
rpix_err[a-1]=f_real_pix.getError();
fpix_err[a-1]=f_fake_pix.getError();

cout << "real fraction: " << f_real_pix.getVal() << " +/- " << f_real_pix.getError() << endl;
cout << "fake fraction: " << f_fake_pix.getVal() << " +/- " << f_fake_pix.getError() << endl;

and I get the output:

 **********
 **    1 **SET PRINT           1
 **********
 **********
 **    2 **SET NOGRAD
 **********
 PARAMETER DEFINITIONS:
    NO.   NAME         VALUE      STEP SIZE      LIMITS
     1 f_fake_pix   2.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
     2 f_real_pix   8.00000e-01  1.00000e-01    0.00000e+00  1.00000e+00
 **********
 **    3 **SET ERR         0.5
 **********
 **********
 **    4 **SET PRINT           1
 **********
 **********
 **    5 **SET STR           1
 **********
 NOW USING STRATEGY  1: TRY TO BALANCE SPEED AGAINST RELIABILITY
 **********
 **    6 **MIGRAD        1000           1
 **********
 FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
 START MIGRAD MINIMIZATION.  STRATEGY  1.  CONVERGENCE WHEN EDM .LT. 1.00e-03
 FCN=2.32656 FROM MIGRAD    STATUS=INITIATE        6 CALLS           7 TOTAL
                     EDM= unknown      STRATEGY= 1      NO ERROR MATRIX       
  EXT PARAMETER               CURRENT GUESS       STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  f_fake_pix   2.00000e-01   1.00000e-01   2.57889e-01  -2.05635e-04
   2  f_real_pix   8.00000e-01   1.00000e-01   2.57889e-01  -2.04976e-05
                               ERR DEF= 0.5
 MIGRAD MINIMIZATION HAS CONVERGED.
 MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=2.32656 FROM MIGRAD    STATUS=CONVERGED      22 CALLS          23 TOTAL
                     EDM=2.0068e-09    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                   STEP         FIRST   
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE 
   1  f_fake_pix   2.00134e-01   5.79522e-01   1.52813e-03  -5.99820e-06
   2  f_real_pix   8.00042e-01   8.97529e-01   2.69453e-03   1.81412e-05
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  2    ERR DEF=0.5
  2.857e-01 -8.551e-02 
 -8.551e-02  8.878e-01 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2
        1  0.16981   1.000 -0.170
        2  0.16981  -0.170  1.000
 **********
 **    7 **SET ERR         0.5
 **********
 **********
 **    8 **SET PRINT           1
 **********
 **********
 **    9 **HESSE        1000
 **********
 COVARIANCE MATRIX CALCULATED SUCCESSFULLY
 FCN=2.32656 FROM HESSE     STATUS=OK             10 CALLS          33 TOTAL
                     EDM=2.0128e-09    STRATEGY= 1      ERROR MATRIX ACCURATE 
  EXT PARAMETER                                INTERNAL      INTERNAL  
  NO.   NAME      VALUE            ERROR       STEP SIZE       VALUE   
   1  f_fake_pix   2.00134e-01   5.79513e-01   3.05627e-04  -6.43166e-01
   2  f_real_pix   8.00042e-01   8.97524e-01   5.38906e-04   6.43605e-01
                               ERR DEF= 0.5
 EXTERNAL ERROR MATRIX.    NDIM=  25    NPAR=  2    ERR DEF=0.5
  2.857e-01 -8.572e-02 
 -8.572e-02  8.879e-01 
 PARAMETER  CORRELATION COEFFICIENTS  
       NO.  GLOBAL      1      2
        1  0.17018   1.000 -0.170
        2  0.17018  -0.170  1.000
[#1] INFO:Minization -- RooMinimizer::optimizeConst: deactivating const optimization
[#1] INFO:InputArguments -- RooAbsData::plotOn(npix_data) INFO: dataset has non-integer weights, auto-selecting SumW2 errors instead of Poisson errors
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) p.d.f was fitted in range and no explicit plot,norm range was specified, using fit range as default
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) only plotting range 'fit_nll_pix_Pdf_npix_data'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) p.d.f. curve is normalized using explicit choice of ranges 'fit_nll_pix_Pdf_npix_data'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) p.d.f was fitted in range and no explicit plot,norm range was specified, using fit range as default
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) only plotting range 'fit_nll_pix_Pdf_npix_data'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) p.d.f. curve is normalized using explicit choice of ranges 'fit_nll_pix_Pdf_npix_data'
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) directly selected PDF components: (pix_Pdf_fake)
[#1] INFO:Plotting -- RooAbsPdf::plotOn(pix_Pdf) indirectly selected PDF components: ()
real fraction: 0.800042 +/- 0.897524
fake fraction: 0.200134 +/- 0.579513

Honestly, I do not understand much about how MIGRAD calculates errors. I have used RooFit before to fit a Gaussian distribution and a polynomial distribution, and the errors were reasonable in both cases.

I have attached the full code and a plot of the fit. I thought the zero bins in one of the templates might be the cause, so I set the fit range to exclude those bins, but the errors stayed about the same.

I’d appreciate any help in understanding the source of these errors. The templates are normalized before I use them in RooFit; I am not sure whether that has anything to do with it.

Thanks!
Sarah

use_RooFit.C (17.0 KB)

pixPdf_ratio_full.pdf (15.7 KB)

StephanH · April 29, 2019, 6:57am

Hi Sarah,

At first sight, it looks reasonable. Only the uncertainties of the data look a bit strange to me. Have you tried not normalising the data before the fit? Maybe the scaling of the errors went wrong while normalising the data.
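
To illustrate why the normalisation could matter: as far as I know, when RooAddPdf is given as many coefficients as PDFs, it interprets them as expected yields rather than fractions. With data normalised to unit area, the fit would then effectively see less than one event in total. The diagonal of the error matrix in your log (0.286 and 0.888) sits just above the fitted values (0.200 and 0.800), which is roughly what Poisson statistics on such yields would give. Below is a minimal standalone C++ sketch (no ROOT needed) of that scaling. It computes the expected yield errors from the Fisher information of a binned extended likelihood. The names `YieldErrors` and `expectedYieldErrors` are hypothetical and not taken from your macro, and this is only my reading of the output, not a confirmed diagnosis.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Expected 1-sigma errors on the two template yields (hypothetical helper type).
struct YieldErrors {
    double real;
    double fake;
};

// Sketch: expected errors for an extended binned likelihood with
//   mu_i = nReal * tReal[i] + nFake * tFake[i]
// where tReal and tFake are unit-normalised template shapes.
// The Fisher information is I_jk = sum_i (dmu_i/dn_j)(dmu_i/dn_k) / mu_i,
// and the covariance matrix is its inverse.
YieldErrors expectedYieldErrors(const std::vector<double>& tReal,
                                const std::vector<double>& tFake,
                                double nReal, double nFake)
{
    assert(tReal.size() == tFake.size());

    double iRR = 0.0, iRF = 0.0, iFF = 0.0;
    for (std::size_t i = 0; i < tReal.size(); ++i) {
        const double mu = nReal * tReal[i] + nFake * tFake[i];
        if (mu <= 0.0) continue;  // empty bins carry no information
        iRR += tReal[i] * tReal[i] / mu;
        iRF += tReal[i] * tFake[i] / mu;
        iFF += tFake[i] * tFake[i] / mu;
    }

    // Invert the 2x2 information matrix; the diagonal gives the variances.
    const double det = iRR * iFF - iRF * iRF;
    assert(det > 0.0);  // fails if the two templates have identical shapes
    return { std::sqrt(iFF / det), std::sqrt(iRR / det) };
}
```

Calling it with the same template shapes and yields of (0.8, 0.2) versus (8000, 2000) multiplies the absolute errors by 100, so the relative errors shrink by a factor of 100. If this is what is happening in your fit, un-normalised histograms with realistic event counts should give much smaller relative uncertainties.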
