Dear root experts,
following this thread ( Fitter::FitFCN and no FitResult::GetConfidenceIntervals ), I tried to implement the calculation of confidence intervals for combined fits, taking into account a sub-sample of parameters.
The use case is the fit of multiple datasets with multiple functions sharing some parameters, as in the tutorial root.cern.ch/root/html/tutorials … Fit.C.html
For example:
- TH1 *hist1 must be fitted with TF1 *fit1 with parameters par2, par0, par3
- TH1 *hist2 must be fitted with TF1 *fit2 with parameters par0, par1, par2, par3, par4
- The fit is done with the struct GlobalChi2 of the tutorial above, which minimizes the sum of the chi squares of hist1-fit1 and hist2-fit2
- I want the confindence interval on fit1
Problems:
- ROOT::Fit::Fitter delete the model function if a custom objective function is passed to it, so ROOT::Fit::FitResult doesn’t know which model to use to compute the confidence interval
- Even if the model function wasn’t deleted, the confidence interval is computed using the vector of parameters of the global fit, which is different from the vector of parameters used by fit1 (the same for the covariance matrix)
Solution:
Basically, I derive the ROOT::Fit::FitResult class, so I can set the model function and replace the member variables that describes the parameters and the covariance matrix to into account only the parameters used by the passed model function.
Here is the code:
class MyFitResult : public ROOT::Fit::FitResult
{
public:
MyFitResult(const ROOT::Fit::FitResult &fitresult, TF1 *fit, UInt_t *idx = NULL) : ROOT::Fit::FitResult(fitresult)
{
SetFitFunction(new ROOT::Math::WrappedMultiTF1(*fit, 1));
UseParameters(idx);
}
void SetFitFunction(IModelFunction * func) { fFitFunc = func; }
void UseParameters(UInt_t *idx)
{
if (idx == NULL) return;
UInt_t npars = fFitFunc->NPar();
// replace the member vectors and maps with new ones that contain only the given parameters
// parameter indices as indices of the overall Fitter!
vector<double> params;
vector<double> errors;
vector<string> par_names;
vector< pair<double, double> > param_bounds;
vector<double> cov_matrix;
vector<double> global_cc;
map<unsigned int, bool> fixed_params;
map<unsigned int, unsigned int> bound_params;
map<unsigned int, pair<double, double> > minos_errors;
cov_matrix.reserve(npars*(npars + 1)/2);
for (UInt_t jpar = 0; jpar < npars; ++jpar)
{
UInt_t ipar = idx[jpar];
params.push_back(fParams[ipar]);
errors.push_back(fErrors[ipar]);
par_names.push_back(fParNames[ipar]);
global_cc.push_back(fGlobalCC[ipar]);
if (fFixedParams.find(ipar) != fFixedParams.end()) fixed_params.insert(pair<unsigned int, bool>(jpar, fFixedParams[ipar]));
if (fBoundParams.find(ipar) != fBoundParams.end())
{
bound_params.insert(pair<unsigned int, unsigned int>(jpar, param_bounds.size()));
UInt_t kpar = fBoundParams[ipar];
param_bounds.push_back(fParamBounds[kpar]);
}
if (fMinosErrors.find(ipar) != fMinosErrors.end()) minos_errors.insert(pair<unsigned int, pair<double, double>>(jpar, fMinosErrors[ipar]));
}
for (UInt_t ipar = 0; ipar < npars; ++ipar)
{
UInt_t kpar = idx[ipar];
for (UInt_t jpar = 0; jpar <= ipar; ++jpar)
{
UInt_t lpar = idx[jpar];
cov_matrix.push_back(CovMatrix(kpar, lpar));
}
}
fParams = params;
fErrors = errors;
fParNames = par_names;
fParamBounds = param_bounds;
fCovMatrix = cov_matrix;
fGlobalCC = global_cc;
fFixedParams = fixed_params;
fBoundParams = bound_params;
fMinosErrors = minos_errors;
}
};
To use it (given the variables used in the example before):
UInt_t idx[] = { 2, 0, 3 }; // each element is the index of the given parameter in the global fit; the indices must be in the same order as needed by the function fit1
MyFitResult result(fitter->Result(), fit, idx);
TH1 *h_confint = (TH1 *)hist1->Clone();
Double_t confint = 0.68;
Double_t *ci = new Double_t[h_confint->GetNbinsX()];
result.GetConfidenceIntervals(h_confint->GetNbinsX(), 1, 1, h_confint->GetXaxis()->GetXbins()->GetArray(), ci, confint);
I tested it with a few examples and it seems to be working: now there is no more the error “cannot compute Confidence Intervals without fitter function” and the confidence interval is using the correct paramaters vectors and covariance matrix.
Actually, the obtained confidence interval looks a bit smaller to me, so maybe there’s something that I overlooked: in particular, I’m not sure it is mathematically correct to use a submatrix of the covariance matrix and the chisquare used in the computation to rescale the interval is the one of the global fit, not the one relative to the hist1 and fit1 (also because the number of degree of freedom is not well defined for the single dataset).