RooFit error in parameter table different from result

sciencecw · October 17, 2018, 3:11pm

I am trying to fit a crystal ball function to Z peak. The fit finishes successfully and fit result printout (RooFitResult::Print()) has reasonable error values. However, if you look at the table for the last fitting step (HESSE), error is much larger than expected. At the same time, the parameters themselves have the same values as the results.

** 11 **HESSE 2000

COVARIANCE MATRIX CALCULATED SUCCESSFULLY
FCN=8.72292e-05 FROM HESSE STATUS=OK 31 CALLS 232 TOTAL
EDM=1.34832e-08 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 cb_alpha -1.23559e+00 1.13083e+01 7.08829e-04 -1.23876e-01
2 cb_mass 9.07937e+01 1.52140e+01 1.94097e-03 7.94558e-02
3 cb_power 1.30459e+00 8.42296e+00 3.54643e-03 -8.31706e-01
4 cb_sigma 2.69402e+00 9.69808e+01 1.18205e-03 -1.24710e+00
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 4 ERR DEF=0.5
4.741e+05 -2.869e+05 5.965e+05 -3.063e+05
-2.869e+05 6.945e+05 -3.088e+05 3.332e+05
5.965e+05 -3.088e+05 9.262e+05 -2.936e+05
-3.063e+05 3.332e+05 -2.936e+05 7.386e+05
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.92946 1.000 -0.500 0.900 -0.518
2 0.56102 -0.500 1.000 -0.385 0.465
3 0.91046 0.900 -0.385 1.000 -0.355
4 0.60990 -0.518 0.465 -0.355 1.000
[#1] INFO:Minization – RooMinimizer::optimizeConst: deactivating const optimization`

RooFitResult: minimized FCN value: 8.72292e-05, estimated distance to minimum: 1.34832e-08
covariance matrix quality: Unknown, matrix was externally provided
Status : MINIMIZE=0 HESSE=0 HESSE=0

Floating Parameter    FinalValue +/-  Error

          cb_alpha   -1.2356e+00 +/-  2.21e-02
           cb_mass    9.0794e+01 +/-  2.53e-02
          cb_power    1.3046e+00 +/-  3.11e-02
          cb_sigma    2.6940e+00 +/-  2.60e-02

couet · October 18, 2018, 7:20am

I think @moneta can help you with this question.

StephanH · October 18, 2018, 8:41am

Hello @sciencecw,

HESSE will yield the same parameters, because it only estimates errors.
I find this suspicious, though:

PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2 3 4
1 0.92946 1.000 -0.500 0.900 -0.518
2 0.56102 -0.500 1.000 -0.385 0.465
3 0.91046 0.900 -0.385 1.000 -0.355
4 0.60990 -0.518 0.465 -0.355 1.000

You have 90% correlation between parameter 1 and 3, 50% correlation between the others. Strong correlations between the parameters mean that the minimum is a line or a hyperplane in your parameter space, which makes it hard for the fitter to find a proper minimum.
If you can reparametrise the problem such that the correlation between 1 and 3 goes away (they seem to be almost the same parameter, anyway), you might get reasonable results.

sciencecw · October 21, 2018, 1:31pm

I am aware of the correlation there. I am using the RooCBShape class. How does one go about fitting a crystal ball function then? Is it a problem only because of the shape of my histogram? It doesn’t fit very well when I fix alpha for example.

StephanH · October 22, 2018, 4:24pm

Would you post the the part of the code where you define the RealVars and the crystal ball function?

sciencecw · October 22, 2018, 5:24pm

This is done in class methods in pyroot code. I have rearranged them here so you get the gist of it…

self.datahist = ROOT.RooDataHist( '%s%s' %(self.label,name), 'data', 
                            ROOT.RooArgList(self.xvardata), hist )
ROOT.SetOwnership( self.datahist, False )
valsar =  [("cb_mass"  ,"Mass"  ,90,80,100),
                          ("cb_sigma" ,"Sigma" ,1,0.1,100),
                           ("cb_alpha" ,"Alpha" ,-1,-10,10),
                          ("cb_power" ,"Power" ,2,0,10),]
cbvals = []
for v in valsar:
            name = v[0] 
            vtmp = ROOT.RooRealVar(*v)
            self.defs[name] = vtmp 
            cbvals.append(vtmp)    
self.defs["cb_mass"].setError( 1. )
self.defs["cb_sigma"].setError( 0.5 )
self.defs["cb_alpha"].setError( 0.05 )
self.defs["cb_power"].setError( 1. )
cb =  RooCBShape('cb_%s' %self.label, 'A  Crystal Ball Lineshape', self.xvardata,*cbvals)
self.fitresult = cb.fitTo( self.datahist,
                     ROOT.RooFit.Range(*self.fitrange),
                     ROOT.RooFit.SumW2Error(True), ROOT.RooCmdArg( 'Strategy', 3 ) ,
                     ROOT.RooFit.Save(ROOT.kTRUE))

for what the plots look like:
http://hepcms-hn.umd.edu/~kakw/2018_09_28_FitManager_Rework.html

StephanH · October 22, 2018, 7:36pm

Alright, it looks like alpha and power are highly correlated. What about doing the following: Reparametrise power, because it is essentially
power = c * alpha,
where c is a nearly constant thing (hence the correlation).

Therefore, define a variable c for this ‘constant’, and use as argument for power
c * alpha,
the fitter might have an easier job now to measure c and its error instead of having to measure alpha and power simultaneously.

It might as well go in the other direction, namely the problem becomes even more degenerate. In this case, you could try the inverse, replacing alpha by c*power or also power/c.

NB 1

The CB formula in your notebook is incomplete. It is missing ^-n for the exponential part.

NB 2

Setting the error of the fit parameters has no effect. You have to constrain them if you want them to be constrained in the fit. The two methods to achieve this are shown in this tutorial, which you can also find in your $ROOTSYS:
https://root.cern.ch/root/html/tutorials/roofit/rf604_constraints.C.html

This could also solve the degeneracy, because you set an allowed range for the parameters.

sciencecw · October 29, 2018, 4:05pm

Is there a way to do this without rewriting the function? I am thinking about something like this:
wk.factory("DoubleCB::pdf(x[70,195],dcb_mass[90,80,100],dcb_sigma[1,0.1,100]," "dcb_alpha1[1,0,10],dcb_alpha1[1,0,10]/dcb_c1[1,0,10],dcb_alpha2[1,0,10],dcb_alpha2[1,0,10]/dcb_c2[1,0,10])")

For the error, I think it changes the initial step size, though I agree it doesn’t seem to change the result.

StephanH · October 29, 2018, 9:44pm

Yes, that looks like what I had in mind. Did you try it?

sciencecw · October 30, 2018, 5:27pm

I think it ignored the part after division sign. Both evaluate to the same value.

RooWorkspace(doublecb) doublec loaded = ROOT.gROOT.ProcessLineSync(".x DoubleCB.cxx+")

(DoubleCB) Name: Title:

Info in <TUnixSystem::ACLiC>: creating shared library /home/kakw/efake/WG_Analysis/Plotting/./DoubleCB_cxx.so

In [16]:

wk = ROOT.RooWorkspace(“doublecb”) wk.factory(“DoubleCB::pdf(x[70,195],dcb_mass[90,80,100],dcb_sigma[1,0.1,100],” “dcb_alpha1[1,0,10],dcb_power1[1,0,10],dcb_alpha2[1,0,10],dcb_power2[1,0,10])”)

Out[16]:

<ROOT.DoubleCB object (“pdf”) at 0x93ac1e0>

In [17]:

func_pdf = wk.pdf(“pdf”)

In [18]:

wk.factory(“DoubleCB::pdfnew(x[70,195],dcb_mass[90,80,100],dcb_sigma[1,0.1,100],” “dcb_alpha1[1,0,10],dcb_power1[1,0,10]/dcb_alpha1[2,0,10],dcb_alpha2[1,0,10],dcb_power2[1,0,10]/dcb_alpha1[2,0,10])”)

Out[18]:

<ROOT.DoubleCB object (“pdfnew”) at 0x96ef5b0>

[#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object x created with identical factory specification [#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object dcb_mass created with identical factory specification [#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object dcb_sigma created with identical factory specification [#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object dcb_alpha1 created with identical factory specification [#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object dcb_alpha2 created with identical factory specification [#1] INFO:ObjectHandling – RooWorkSpace::import(doublecb) Recycling existing object dcb_power2 created with identical factory specification

In [20]:

wk.Print()

RooWorkspace(doublecb) doublecb contents variables --------- (dcb_alpha1,dcb_alpha2,dcb_mass,dcb_power1,dcb_power2,dcb_sigma,x) p.d.f.s -------
DoubleCB::pdf[ m=x m0=dcb_mass sigma=dcb_sigma alpha1=dcb_alpha1 n1=dcb_power1 alpha2=dcb_alpha2 n2=dcb_power2 ] = 0.0142713
DoubleCB::pdfnew[ m=x m0=dcb_mass sigma=dcb_sigma alpha1=dcb_alpha1 n1=dcb_power1 alpha2=dcb_alpha2 n2=dcb_power2 ] = 0.0142713

In [21]:

func_pdfnew = wk.pdf(“pdfnew”)

In [24]:

func_pdfnew.getVal()

Out[24]:

0.014271309640297257

In [25]:

func_pdf.getVal()

Out[25]:

0.014271309640297257

StephanH · November 5, 2018, 10:32am

Indeed it looks like the factory does not properly parse this expression. So what about:

wk.factory("expr::scaled_power1('dcb_power1/dcb_alpha1', dcb_power1[1,0,10], dcb_alpha1[2,0,10])")
wk.factory("expr::scaled_power2('dcb_power2/dcb_alpha2', dcb_power2[1,0,10], dcb_alpha2[2,0,10])")
wk.factory("DoubleCB::pdfnew(x[70,195],dcb_mass[90,80,100],dcb_sigma[1,0.1,100],dcb_alpha1[1,0,10],scaled_power1, dcb_alpha2[1,0,10], scaled_power2)")

It works at least with the first line:

root [1] w.factory("expr::scaled_power1('dcb_power1/dcb_alpha1', dcb_power1[1,0,10], dcb_alpha1[2,0,10])")
(RooAbsArg *) 0x2d8afd0
root [2] w.Print()

RooWorkspace()  contents

variables
---------
(dcb_alpha1,dcb_power1)

functions
--------
RooFormulaVar::scaled_power1[ actualVars=(dcb_power1,dcb_alpha1) formula="dcb_power1/dcb_alpha1" ] = 0.5

sciencecw · November 7, 2018, 6:20pm

Thanks. I haven’t tried it. But it looks like this form will work.

On the other hand, in de-correlating the parameters I am getting a mixed result. It is now less likely that the fit fails but the correlation remains close to -1. What is weird is that sometimes it flips from close to -1 to close to 1. I wonder what this suggests.

StephanH · November 8, 2018, 11:07am

If you decorrelate the parameters, shoudn’t the correlation vanish/reduce?

system · November 22, 2018, 11:07am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.