Extended fit with custom parameter in Roofit

rafal · December 19, 2022, 9:49am

Dear Experts,

I am trying to understand if it is possible it Roofit to have an unbinned extended likelihood where the number of events is not directly an independent variable, but some well-defined function of an independent variable and other parameters of the fit. For example, I would like to fit Gaussian-distributed data with an Aexp(-Bx^2) function and obtain from the fit parameters A, B, and the correlation between A and B. All the examples I was able to find had the number of events as the parameter.

Many thanks in advance.

Kind regards,
Rafał Staszewski

jonas · December 19, 2022, 3:31pm

Hi @rafal, thanks for your question!

I guess you are already familiar with the rf202_extendedmlfit example. It suggests to do extended fits by using a RooAddPdf with the number of events for each coefficient.

What you suggest with fitting A*exp(-B*x^2) is not directly possible in RooFit. RooFit automatically normalizes every PDF to unity. Therefore, your scaling parameter A would have no effect because it’s just a constant factor that get scaled away. For extended fits, RooFit still normalized the PDF, and adds an extra Poisson term with the observed and predicted total number of events.

But what keeps you from just doing the fit with the number of events as parameter, and then transform your parameters after the fit? You can do this even inside RooFit and propagate the errors with RooAbsReal::getPropagatedError() and doing some covariance calculation yourself.

For your A*exp(-B*x^2) example, I wold do it like:

/// Numerically compute linear correlation of f with x
double calcCorr(RooAbsReal const& f,
                RooRealVar& x,
                double ferr)
{
    double val = x.getVal(); // central value of x

    // compute variations of f
    x.setVal(val - x.getError());
    double varDown = f.getVal();
    x.setVal(val + x.getError());
    double varUp = f.getVal();

    x.setVal(val); // reset central value

    return std::abs(varUp - varDown) / ferr / 2.;
}


using namespace RooFit;

// observables
RooRealVar x("x", "x", -10, 10);

// model parameters
RooRealVar b("b", "b", 0.1, 0.01, 100.);
RooRealVar n("n", "n", 10000, 1000, 100000);

// actual model and extended PDF
RooGenericPdf model("model", "exp(-b*x*x)", {x, b});
RooAddPdf pdf("pdf", "pdf", model, n);

// fit a toy dataset
std::unique_ptr<RooDataSet> data{pdf.generate(x)};
std::unique_ptr<RooFitResult> res{pdf.fitTo(
        *data, Save(), PrintLevel(-1)
)};
res->Print();

// define coefficient "A" as the normalization integral for
// `exp(-B*x^2)` times the number of events
std::unique_ptr<RooAbsReal> modelNorm{model.createIntegral(x)};
modelNorm->SetName("model_norm");
RooFormulaVar a{"a", "n * model_norm", {n, *modelNorm}};

// get value and propagated error for "A"
double aVal = a.getVal();
double aErr = a.getPropagatedError(*res);

// compute correlations with fit parameter by variation study
double corrN = calcCorr(a, n, aErr);
double corrB = calcCorr(a, b, aErr);

// summary
std::cout << "Value and error    : " << aVal << " +/- " << aErr
        << "\nRelative error     : " << (aErr/aVal)
        << "\nCorrelation with n : " << corrN
        << "\nCorrelation with a : " << corrB << std::endl;

Here is the output of the code:

    Floating Parameter    FinalValue +/-  Error
  --------------------  --------------------------
                     b    9.9083e-02 +/-  1.40e-03
                     n    1.0000e+04 +/-  1.00e+02

[#1] INFO:NumericIntegration -- RooRealIntegral::init(model_norm) using numeric integrator RooIntegrator1D to calculate Int(x)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(model_norm) using numeric integrator RooIntegrator1D to calculate Int(x)
Value and error    : 56308.2 +/- 689.832
Relative error     : 0.012251
Correlation with n : 0.816258
Correlation with a : 0.577686

Note that the relative uncertainty on A is larger than the relative uncertainty on n. This is expected, because the parameter B enters the normalization, which is part of A and therefore the uncertainty in A is a bit larger than for the relative uncertainty for the number of events (which is just sqrt(n)/n, 0.01).

With this idea of defining new functions as RooFormulaVars and doing the uncertainty propagation, you can probably achieve what you want.

Cheers,
Jonas

PS: I did it just for the example, but in practice you should avoid the RooGenericPdf because the integration will be not efficient. As you see in the text output, the integration is done numerically. It’s better to define your model in terms of objects where RooFit knows the analytic integral, like RooGaussian.

rafal · December 20, 2022, 9:12am

Hi @jonas,

Many thanks for your very comprehensive answer, including both the solution and explanations. This is very helpful and I am really grateful for your time spent on answering.

Cheers,
Rafal

system · January 3, 2023, 9:13am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.