How to improve the fit error (using RooFit/ROOT)

hurricane1127 · November 24, 2020, 6:36am

The pdf is a user-defined function;
the data is generated with model.generate().

My code is:

#ifndef __CINT__
#include "RooGlobalFunc.h"
#endif
#include "RooRealVar.h"
#include "RooDataSet.h"
#include "RooGaussian.h"
#include "RooChebychev.h"
#include "RooAddPdf.h"
#include "TCanvas.h"
#include "TAxis.h"
#include "RooPlot.h"
#include <vector>
#include "RooGenericPdf.h"

using namespace std;
using namespace RooFit ;


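void simple(){

  // Observable and the single fit parameter (initial value, lower limit, upper limit)
  RooRealVar x("x","x",0,2.2794) ;
  RooRealVar kappa("kappa","kappa",2.2794,2.27,2.29);
  // User-defined pdf; RooFit normalizes it numerically over the range of x
  RooGenericPdf model("model","model","0.665/(8*TMath::Pi()/3)/(kappa*kappa*pow((1+x),3))*(kappa*(1+pow((1+x),2))-4*x/kappa*(1+x)*(kappa-x))",RooArgSet(x,kappa));

  // Generate 1E6 toy events from the model, then fit the model back to them
  RooDataSet* data = model.generate(x,1E6);
  model.fitTo(*data);

  // Overlay the fitted pdf (dashed line) on the generated data
  RooPlot* xframe = x.frame();
  data->plotOn(xframe);
  model.plotOn(xframe,LineStyle(kDashed));

  TCanvas *c = new TCanvas("c","c",800,1000);
  xframe->GetXaxis()->SetTitleOffset(1.5);
  xframe->Draw();

}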

And the fit result is here:

Processing simple.cxx…

RooFit v3.60 – Developed by Wouter Verkerke and David Kirkby
Copyright © 2000-2013 NIKHEF, University of California & Stanford University
All rights reserved, please read http://roofit.sourceforge.net/license.txt

[#1] INFO:NumericIntegration – RooRealIntegral::init(model_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)
[#1] INFO:NumericIntegration – RooRealIntegral::init(model_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)
[#1] INFO:NumericIntegration – RooRealIntegral::init(model_Int[x]) using numeric integrator RooIntegrator1D to calculate Int(x)
[#1] INFO:Minization – RooMinimizer::optimizeConst: activating const optimization

** 1 **SET PRINT 1

** 2 **SET NOGRAD

PARAMETER DEFINITIONS:
NO. NAME VALUE STEP SIZE LIMITS
1 kappa 2.27940e+00 2.00000e-03 2.27000e+00 2.29000e+00

** 3 **SET ERR 0.5

** 4 **SET PRINT 1

** 5 **SET STR 1

NOW USING STRATEGY 1: TRY TO BALANCE SPEED AGAINST RELIABILITY

** 6 **MIGRAD 500 1

FIRST CALL TO USER FUNCTION AT NEW START POINT, WITH IFLAG=4.
START MIGRAD MINIMIZATION. STRATEGY 1. CONVERGENCE WHEN EDM .LT. 1.00e-03
FCN=641583 FROM MIGRAD STATUS=INITIATE 4 CALLS 5 TOTAL
EDM= unknown STRATEGY= 1 NO ERROR MATRIX
EXT PARAMETER CURRENT GUESS STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 kappa 2.27940e+00 2.00000e-03 2.01742e-01 -4.95878e-01
ERR DEF= 0.5
MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=641583 FROM MIGRAD STATUS=CONVERGED 14 CALLS 15 TOTAL
EDM=4.99876e-06 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 kappa 2.28091e+00 5.23373e-03 3.05704e-01 4.04022e-03
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 1 ERR DEF=0.5
3.037e-05

** 7 **SET ERR 0.5

** 8 **SET PRINT 1

** 9 **HESSE 500

COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=641583 FROM HESSE STATUS=OK 5 CALLS 20 TOTAL
EDM=4.90868e-06 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 kappa 2.28091e+00 5.16186e-03 1.22282e-02 9.14439e-02
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 1 ERR DEF=0.5
2.945e-05

The fit error is at the level of 1e-3 (with 1E6 entries).
Is it possible to reduce the error (to, say, 1e-4) while keeping the number of entries at 1E6?
That is, how can I improve my fit error without changing the number of entries?

oshadura · November 24, 2020, 8:55am

I will try to ping @moneta, or maybe @StephanH will have time to reply… Thank you in advance!

moneta · November 24, 2020, 9:21am

Hi,
This is correct: with around 1E6 entries you expect the statistical fit error to be of order 1/sqrt(N), i.e. 10^-3.
And no, if the estimator is efficient, as is the case when using maximum likelihood, you cannot improve it further without increasing your sample statistics (number of entries).
See the Cramér-Rao bound for the minimum variance, which is attained by the maximum likelihood estimator at large statistics.
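
For reference, the bound mentioned above can be written as follows. The notation is standard but does not appear in the thread: $\hat\kappa$ is the estimator, $f(x;\kappa)$ the normalized pdf, and $I_1$ the Fisher information of a single event. The bound applies to unbiased estimators.

```latex
\operatorname{Var}(\hat\kappa) \;\ge\; \frac{1}{N\, I_1(\kappa)},
\qquad
I_1(\kappa) = \int f(x;\kappa)
  \left( \frac{\partial \ln f(x;\kappa)}{\partial \kappa} \right)^{2} dx
\quad\Longrightarrow\quad
\sigma_{\hat\kappa} \;\propto\; \frac{1}{\sqrt{N}}
```

Since the error scales as $1/\sqrt{N}$, making it ten times smaller requires about 100 times more entries (about 1E8 instead of 1E6).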

Best regards

Lorenzo
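
As a rough cross-check of the statement above, the following sketch evaluates the Cramér-Rao bound numerically for the pdf used in the macro, with only the C++ standard library (no ROOT). The function names (`shape`, `simpson`, `fisherInformation`, `expectedError`), the Simpson integration and the finite-difference step are choices made for this illustration and are not taken from RooFit.

```cpp
#include <cassert>
#include <cmath>

// Unnormalized pdf shape copied from the macro's formula string. The constant
// prefactor 0.665/(8*pi/3) is dropped because it cancels in the normalization.
double shape(double x, double kappa) {
  const double u = 1.0 + x;
  return (kappa * (1.0 + u * u) - 4.0 * x / kappa * u * (kappa - x)) /
         (kappa * kappa * u * u * u);
}

// Composite Simpson integration of f over [a, b]; steps must be even.
template <typename F>
double simpson(F f, double a, double b, int steps = 2000) {
  assert(steps > 0 && steps % 2 == 0);
  const double h = (b - a) / steps;
  double sum = f(a) + f(b);
  for (int i = 1; i < steps; ++i) sum += f(a + i * h) * (i % 2 ? 4.0 : 2.0);
  return sum * h / 3.0;
}

// Single-event Fisher information for kappa, with the pdf normalized on
// [0, xmax]. The derivative of log f is taken by central differences
// (step h is an assumption of this sketch).
double fisherInformation(double kappa, double xmax, double h = 1e-4) {
  auto norm = [&](double k) {
    return simpson([&](double x) { return shape(x, k); }, 0.0, xmax);
  };
  const double n0 = norm(kappa), nUp = norm(kappa + h), nDn = norm(kappa - h);
  auto integrand = [&](double x) {
    const double f = shape(x, kappa) / n0;
    const double score = (std::log(shape(x, kappa + h) / nUp) -
                          std::log(shape(x, kappa - h) / nDn)) / (2.0 * h);
    return f * score * score;
  };
  return simpson(integrand, 0.0, xmax);
}

// Smallest error an unbiased estimator can reach with nEvents entries.
double expectedError(double nEvents, double kappa, double xmax) {
  return 1.0 / std::sqrt(nEvents * fisherInformation(kappa, xmax));
}
```

Calling `expectedError(1e6, 2.2794, 2.2794)` gives the bound for the setup in the macro; if the assumptions of the sketch hold, it should be comparable to the HESSE error printed in the log. `expectedError(1e8, 2.2794, 2.2794)` is ten times smaller, which is the only way to gain that factor.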

hurricane1127 · November 24, 2020, 12:12pm

Thanks a lot for your reply.
I see.
I have another question: is the error in my code the statistical error, which depends only on the number of entries (here 1E6)?

But can I improve the error by changing the algorithm (Hesse/MIGRAD/…) or by configuring RooMinimizer?

Best regards
SH chen

hurricane1127 · November 24, 2020, 12:14pm

Thanks a lot!

moneta · November 24, 2020, 12:34pm

Hi,
Using a different minimizer algorithm can help in finding the right minimum, especially when the fit does not converge. Once you have found a minimum you can estimate the error with Hesse (using the second derivatives of the log-likelihood function) or with Minos. Minos determines an interval, which is more accurate for asymmetric errors (non-parabolic log-likelihood functions), and it should be used in that case.
Now, with 1E6 entries you have enough statistics to use the asymptotic approximation: the log-likelihood is a parabola, and Hesse and Minos will return the same result.

Lorenzo
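
The difference between the two error estimates can be illustrated outside ROOT with a toy likelihood. The sketch below is not Minuit's implementation: it assumes an exponential-lifetime model (mean `tau`, `n` events with sample mean `m`), and the function names `nll`, `hesseError` and `minosError` are invented for this example. The Hesse-style error comes from the curvature at the minimum, the Minos-style error from the point where the negative log-likelihood has risen by 0.5 (the "ERR DEF" value in the log above).

```cpp
#include <cassert>
#include <cmath>

// Toy negative log-likelihood of an exponential decay with mean lifetime tau,
// for n events with sample mean m (terms independent of tau are dropped).
double nll(double tau, double n, double m) {
  return n * (std::log(tau) + m / tau);
}

// Hesse-style error: 1/sqrt(second derivative of the NLL at the minimum).
// For this toy the minimum is at tau = m and the second derivative is n/m^2.
double hesseError(double n, double m) {
  return m / std::sqrt(n);
}

// Minos-style error: signed distance from the minimum to the point where the
// NLL has risen by 0.5, found by bisection between the minimum and tauFar.
// Pass tauFar > m for the upper error and tauFar < m for the lower error.
double minosError(double n, double m, double tauFar) {
  assert(n > 0.0 && m > 0.0 && tauFar > 0.0 && tauFar != m);
  const double target = nll(m, n, m) + 0.5;
  assert(nll(tauFar, n, m) > target);  // tauFar must lie beyond the crossing
  double inside = m;        // NLL below target here
  double outside = tauFar;  // NLL above target here
  for (int i = 0; i < 200; ++i) {
    const double mid = 0.5 * (inside + outside);
    if (nll(mid, n, m) > target) {
      outside = mid;
    } else {
      inside = mid;
    }
  }
  return 0.5 * (inside + outside) - m;  // negative for the lower error
}
```

With `m = 2`, the calls `minosError(10, 2, 20)` and `minosError(10, 2, 0.2)` give upper and lower errors that differ visibly from each other and from `hesseError(10, 2)`. For `n = 1e6` all three agree to better than 1%, which is the large-statistics regime described above.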

hurricane1127 · November 24, 2020, 12:39pm

OK, I see.
Thank you again for your great help!

Best regards,
SH Chen

hurricane1127 · November 25, 2020, 7:02am

Let me add that I am puzzled by ROOT::Math::Minimizer:
https://ms2.physik.hu-berlin.de/~kind/atlas/aplusplus/htmldoc/ROOT__Math__Minimizer.html
Is it possible to improve the fit result (parameter error) by controlling the minimization, for example with:

ROOT::Math::MinimizerOptions::SetDefaultTolerance(1.E-6);

or SetPrecision()?
I don't understand how ROOT::Math::MinimizerOptions works.

Best regards,
SH chen

moneta · November 25, 2020, 12:19pm

Hi,

You can control the convergence of the minimisation iterations using ROOT::Math::MinimizerOptions::SetDefaultTolerance, but there is no need to provide such a small value. Something like 10^-2 or 10^-3 is good enough, since the actual tolerance value used internally is a factor 1000 smaller.

SetPrecision should not be used in general. It controls the precision of the evaluation of the function being minimised. The default value (double precision) should be used. In case of precision problems one should focus on trying to improve the function computation.

Best regards

Lorenzo
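
Why a tighter tolerance does not shrink the parameter error can be seen with a toy minimizer. The sketch below is an illustration only, not Minuit's algorithm: it runs Newton steps on an assumed exponential-lifetime likelihood and stops when an estimated distance to the minimum (EDM) falls below `edmMax`, which plays the role of the threshold printed by MIGRAD ("CONVERGENCE WHEN EDM .LT. 1.00e-03"). The names `ToyFit`, `fitToy` and `edmMax` are invented for this example.

```cpp
#include <cassert>
#include <cmath>

struct ToyFit {
  double value;    // fitted tau at the stopping point
  double error;    // 1/sqrt(second derivative of the NLL) at that point
  int iterations;  // number of Newton steps taken
};

// Newton minimization of the toy NLL n*(log(tau) + m/tau), for n events with
// sample mean m. Iteration stops once EDM = g^2 / (2 h) drops below edmMax,
// where g and h are the first and second derivatives of the NLL.
// start must be below 2*m so that the second derivative is positive.
ToyFit fitToy(double n, double m, double start, double edmMax) {
  assert(n > 0.0 && m > 0.0 && edmMax > 0.0);
  assert(start > 0.0 && start < 2.0 * m);
  auto grad = [&](double t) { return n * (1.0 / t - m / (t * t)); };
  auto hess = [&](double t) { return n * (-1.0 / (t * t) + 2.0 * m / (t * t * t)); };
  double tau = start;
  int iterations = 0;
  for (; iterations < 100; ++iterations) {
    const double g = grad(tau);
    const double h = hess(tau);
    if (g * g / (2.0 * h) < edmMax) break;  // converged according to the EDM
    tau -= g / h;                           // Newton step
  }
  return {tau, 1.0 / std::sqrt(hess(tau)), iterations};
}
```

For example, `fitToy(1e6, 2.0, 2.1, 1e-3)` and `fitToy(1e6, 2.0, 2.1, 1e-9)` return fitted values that agree to within a small fraction of the error, and errors that are practically identical. The tolerance decides how precisely the minimum is located; the size of the error is set by the curvature of the likelihood, that is, by the data.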

hurricane1127 · November 25, 2020, 1:38pm

Hi,
Thanks for your reply again!
I will focus on my own function and parameters.

Best wishes,
SH chen
