Hi @jonas,
Thanks for the detailed reply. I gave the PiecewiseInterpolation method a try and extended your example to the correlated-variable case following your suggestion. Here are some questions that I hope you can clarify:
- Since you multiply the alpha vector by the Cholesky factor of the covariance matrix (`g @ Lt` in the notation of RooFitResult::randomizePars()), I suppose the result is no longer the number of standard deviations away from the Gaussian mean. Therefore the default value of `boundary = 1.0` in the PiecewiseInterpolation evaluation is no longer applicable; I suppose it should be replaced by `sqrt(covariance_matrix[i][i])` (I sketch this after the fit output below).
- If I look more carefully into the implementation of interpolation code 4 or 6, I believe this line is wrong: `x` should be `t` instead, i.e. `mod = t * (S + t * A * (15 + t * t * (-10 + t * t * 3)))` (also folded into the sketch below).
- After these fixes I tried sampling the parameters; the results make sense but are not very good, especially when the errors are very asymmetric. Below is a demo fit that I have:
```python
import ROOT
import numpy as np

observable = ROOT.RooRealVar('myy', 'myy', 125, 105, 160)
observable.setBins(55)
c = ROOT.RooRealVar('c', 'c', 1, -10, 10)
base_pdf = ROOT.RooExponential('exp', 'exp', observable, c)
norm = ROOT.RooRealVar('N', 'N', 1, -float('inf'), float('inf'))
pdf = ROOT.RooAddPdf('exp_extended', '', ROOT.RooArgList(base_pdf), ROOT.RooArgList(norm))

# fit the sidebands only; the zeroed bins below (120-130) are the blinded signal region
observable.setRange('SB_Lo', 105, 120)
observable.setRange('SB_Hi', 130, 160)

bin_edges = np.linspace(105, 160, 55 + 1)
x = (bin_edges[:-1] + bin_edges[1:]) / 2
y = np.array([1201.58, 1179.01, 1140.39, 1128.87, 1111.56, 1124.37, 1098.3 ,
              1052.13, 1015.04, 1006.18,  971.46,  945.95,  931.43,  913.02,
               871.57,    0.  ,    0.  ,    0.  ,    0.  ,    0.  ,    0.  ,
                 0.  ,    0.  ,    0.  ,    0.  ,  724.1 ,  699.51,  687.49,
               667.44,  657.39,  649.17,  634.78,  600.47,  586.51,  580.54,
               570.01,  588.22,  555.9 ,  540.72,  531.02,  546.06,  528.99,
               521.36,  495.24,  483.4 ,  448.63,  480.71,  454.2 ,  450.19,
               436.4 ,  413.69,  412.14,  420.22,  417.97,  404.38])
arrays = {
    'myy': x,
    'weight': y,
}
dataset = ROOT.RooDataSet.from_numpy(arrays, ROOT.RooArgSet(observable), weight_name='weight')
fit_result = pdf.fitTo(dataset, ROOT.RooFit.Range('SB_Lo,SB_Hi'), ROOT.RooFit.PrintLevel(-1),
                       ROOT.RooFit.Hesse(True), ROOT.RooFit.Save(), ROOT.RooFit.Strategy(2),
                       ROOT.RooFit.AsymptoticError(True))
```
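For reference, I read the central values and asymmetric errors off the `RooFitResult` with a small helper along these lines (just a sketch using the standard `floatParsFinal()` / `getErrorLo()` / `getErrorHi()` accessors):

```python
# Sketch: collect {name: {'value': ..., 'errors': (err_lo, err_hi)}}
# from the RooFitResult obtained above.
params = {
    p.GetName(): {
        'value': p.getVal(),
        'errors': (p.getErrorLo(), p.getErrorHi()),
    }
    for p in fit_result.floatParsFinal()
}
print(params)
```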
This gives (I refitted once):

```python
{'c': {'value': -0.020686713133077784,
       'errors': (-0.009063431420349587, 0.0003276873353671196)},
 'N': {'value': 39929.87921962675,
       'errors': (-6416.289580898308, 223.7984340402964)}}
```
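To make the two proposed fixes concrete, here is a minimal standalone numpy sketch of the sampling I have in mind, with the corrected polynomial and the rescaled boundary (the covariance matrix and the asymmetric errors below are made-up numbers for illustration, not the ones from the fit above):

```python
import numpy as np

def interp_code4(t, eps_plus, eps_minus):
    """Code-4/6-style response: sixth-order polynomial inside |t| < 1,
    linear extrapolation outside, written with the corrected variable t
    so that value, first and second derivative match at t = +/-1."""
    S = 0.5 * (eps_plus + eps_minus)
    A = 0.0625 * (eps_plus - eps_minus)
    inside = t * (S + t * A * (15 + t * t * (-10 + t * t * 3)))
    outside = np.where(t >= 0, t * eps_plus, t * eps_minus)
    return np.where(np.abs(t) < 1, inside, outside)

rng = np.random.default_rng(0)

cov = np.array([[4.0, 1.2],      # stand-in covariance matrix (made up)
                [1.2, 9.0]])
Lt = np.linalg.cholesky(cov).T   # upper triangular, Lt.T @ Lt == cov

g = rng.standard_normal(size=(100_000, 2))
alpha = g @ Lt                   # correlated shifts, covariance == cov

# alpha is now in parameter units, so the interpolation boundary becomes
# sqrt(cov[i][i]) per parameter instead of the default 1.0:
boundary = np.sqrt(np.diag(cov))
t = alpha / boundary             # back to "number of sigmas"

# made-up asymmetric errors (|err_lo|, err_hi) for the two parameters
eps_minus = np.array([1.8, 2.5])
eps_plus = np.array([2.2, 3.6])

shifts = interp_code4(t, eps_plus, eps_minus)
# sampled parameters would then be central_values + shifts
```

Note that dividing `alpha` by `boundary` just standardizes each component again while keeping the correlations, so the polynomial always stitches the two error regimes together at the one-sigma points.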
The distributions that I got after sampling 100,000 parameter points are shown in the attached plots.
There are also some extremely small (even negative) or extremely large values of N, which is not quite satisfactory.
Do you have any ideas for improving this? Or should we use MetropolisHastings instead?
That said, I do think it would be nice to have this piecewise-interpolation-based implementation of randomizePars.
Cheers,
Alkaid