Dependence of fit results on histogram scaling

Hello,

I am seeing strange behaviour with RooFit.
Let’s start from a given TH1 HistoInitial, converted to a RooDataset via the import function, and then fit it with a given model.
This gives result1 for the fit parameters.

Let’s now create an empty TH2 HistoFinal and do something like HistoFinal.Add(HistoInitial, 1).
With a weight of 1, the fit of HistoFinal gives the same results.

But if one uses a weight of 2 (for example), the results are slightly different.

Note that I reinitialized the RooRealVar parameters before each fit, so that both fits start from the same conditions.

I put a minimal example here:
/afs/cern.ch/user/e/escalier/public/MinimumProblem/

==>
root -b
.L MinimumProblem.C
MinimumProblem()
Then just change the Add weight to 2 instead of 1, and you will see that the two fits (the initial one and the final one) don’t give the same results.

Any suggestion is welcome

Thank you so much for your help

Hello Marc,

When using RooFit, a fit to a binned data set uses by default a Poisson p.d.f. for each bin. When you scale a histogram, this is no longer correct. For this reason you get the same parameter values, but different errors.
In principle this should be corrected by using the option RooFit::SumW2Error(1). In this case this option does not work (I don’t know if this is a bug or a limitation).
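
The effect of scaling on a Poisson likelihood can be sketched in plain C++ (no ROOT). This is only an illustration under stated assumptions: a single Gaussian peak with unit bin width, and hypothetical function names (gaussBin, poissonErrorOnMean, makeBinCentres) that do not come from the macro. It uses the Fisher information of a Poisson binned likelihood, I = sum_i (d mu_i / d mean)^2 / mu_i, so multiplying all bin contents by k multiplies I by k and shrinks the error on a shape parameter by 1/sqrt(k), while the fitted value stays the same.

```cpp
#include <cmath>
#include <vector>

// Expected content of a bin centred at x for a Gaussian peak
// (assumption: unit bin width, amplitude = peak height).
double gaussBin(double amplitude, double mean, double sigma, double x) {
    const double z = (x - mean) / sigma;
    return amplitude * std::exp(-0.5 * z * z);
}

// Error on the peak mean as seen by a Poisson binned likelihood:
// Fisher information I = sum_i (d mu_i / d mean)^2 / mu_i, error = 1/sqrt(I).
double poissonErrorOnMean(double amplitude, double mean, double sigma,
                          const std::vector<double>& binCentres) {
    double info = 0.0;
    for (double x : binCentres) {
        const double mu = gaussBin(amplitude, mean, sigma, x);
        if (mu <= 0.0) continue;  // skip bins with no expected content
        const double dmu = mu * (x - mean) / (sigma * sigma);
        info += dmu * dmu / mu;
    }
    return 1.0 / std::sqrt(info);
}

// Hypothetical helper: nBins equally spaced bin centres in [lo, hi].
std::vector<double> makeBinCentres(double lo, double hi, int nBins) {
    std::vector<double> centres;
    const double width = (hi - lo) / nBins;
    for (int i = 0; i < nBins; ++i) centres.push_back(lo + (i + 0.5) * width);
    return centres;
}
```

Comparing poissonErrorOnMean for amplitudes A and 4*A on the same bins gives an error that is smaller by a factor 2, although the statistical information in the original histogram has not changed.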

If you have scaled or weighted histograms, I would perform a chi2 fit, which should work.

Lorenzo

Thanks Lorenzo,

unfortunately, RooFit::SumW2Error(1) doesn’t fix the dependence of the fit on the scaling of the histogram.

I also tried the chi2 fit, but it just gives a crazy fit.

I put a minimal code example here:

/afs/cern.ch/user/e/escalier/public/MinimumProblem/SecondVersion

thank you

Hi Marc,

In RooFit the chi2 fit does not work when you have empty bins. Have you tried using ROOT directly (TH1::Fit)?

Lorenzo

Thanks.

Yes, I tried TH1F->Fit,

but a crash happens:
/afs/cern.ch/user/e/escalier/public/MinimumProblem/ThirdVersion

(To be precise, I mean that the fit with TH1 gives NAN.)

Yes, I can reproduce the NAN. I think it is a normalization problem: the TF1 you are getting from RooAbsPdf::asTF is normalized. In a chi2 fit the function should not be normalized to 1, so you need to add an extra parameter to adjust the overall normalization.

Lorenzo

Hi Marc,

attached is a version of your macro adapted for a chi2 fit, which now works for me.
Cheers

Lorenzo
MinimumProblem.C (8.66 KB)

Thank you.

Indeed, with your example using ROOT, the nominal parameter values are insensitive to the scaling
(actually, the fraction of CB is not exactly the same when the scaling is applied).
But all the errors have changed, whereas they should not have, since Sumw2() was called when the histogram was constructed.

Without scaling:
1  A            3.21201e+02  9.82136e+00   2.17653e-04  -3.70204e-07
2  CB_mean      1.20785e+02  6.93835e-02  -1.65756e-07   1.16295e-02
3  CB_sigma     1.44911e+00  1.14428e-01   1.80052e-06   2.07100e-03
4  CB_alpha     1.08454e+00  2.85335e-01   2.21132e-05  -4.86224e-04
5  CB_n         6.75860e+00  7.15920e+00  -1.16405e-05  -8.63892e-04
6  Gauss_mean   1.21298e+02  4.88875e-01   4.14505e-07   1.75998e-02
7  Gauss_sigma  2.91310e+00  1.71139e-01   1.71103e-06   9.62195e-03
8  frac_CB      7.66279e-01  8.17307e-02   1.14887e-05  -2.02720e-03

With scaling:
1  A            3.21201e+06  9.83019e+04  -1.54906e+00   1.42861e-11
2  CB_mean      1.20785e+02  6.97652e-02   1.17860e-08   1.18575e-03
3  CB_sigma     1.44911e+00  1.13643e-01  -4.53855e-09   1.27339e-03
4  CB_alpha     1.08454e+00  2.97345e-01   2.08411e-05  -5.85303e-04
5  CB_n         6.75864e+00  7.44412e+00  -1.22035e-05  -7.65444e-04
6  Gauss_mean   1.21298e+02  5.05603e-01  -5.18738e-07   7.39696e-03
7  Gauss_sigma  2.91310e+00  1.67944e-01   1.07727e-06   6.04214e-03
8  frac_CB      7.66280e-01  8.19542e-02  -1.17118e-06  -1.02649e-03

==>
frac_CB is not exactly the same.
All the errors are different when the scaling is applied, even though Sumw2() was correctly called on the histograms at construction.
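
For reference, the bin-error bookkeeping behind this expectation can be sketched in plain C++ (no ROOT). This is an illustration only: WeightedBin, scaled, sumw2Error and poissonError are hypothetical names, and the sketch assumes the usual convention that scaling a bin by k multiplies its sum of squared weights by k^2.

```cpp
#include <cmath>

// One histogram bin that tracks its sum of squared weights.
struct WeightedBin {
    double content;  // sum of weights
    double sumw2;    // sum of squared weights
};

// Scale a bin by k: content scales by k, sum of squared weights by k^2.
WeightedBin scaled(const WeightedBin& bin, double k) {
    return {bin.content * k, bin.sumw2 * k * k};
}

// Bin error when the sum of squared weights is tracked.
double sumw2Error(const WeightedBin& bin) { return std::sqrt(bin.sumw2); }

// Bin error a Poisson assumption would assign from the content alone.
double poissonError(const WeightedBin& bin) { return std::sqrt(bin.content); }
```

For a bin with 100 unweighted entries scaled by 4, sumw2Error gives 40 (the relative error stays at 10%), while poissonError gives 20 (the relative error drops to 5%). With errors that scale like the contents, a chi2 fit should in principle return the same relative uncertainties before and after scaling.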

Hi Marc,
I think the differences can be due to numerical errors in the error calculation, given that you have also applied a large scaling factor. Among the parameter values, only the last one changes, because it is more sensitive to this: it is the ratio of the two amplitudes.
Also, remember that the estimate is biased (for example, because you have bins without entries). I guess this bias is not independent of the scaling.
So I am not surprised to see small differences like those in your case.

Lorenzo

All right.

Thank you, Lorenzo, for all your help.