Simultaneous fit and upper limit with constraints

Hello @sebmnt, sorry for the very late reply!

Here is one general thing to know for RooFit: the RooSimultaneous only works to create likelihoods when it’s the top-level RooFit argument. If you call fitTo() or createNLL() on a pdf that contains a RooSimultaneous but is not a RooSimultaneous itself, you will get garbage.

That means multiply in the full model (which is a RooSimultanous) with the constraints is out of the picture.

So now you’re asking which pdf to multiply with the constraint pdf? The answer is: it doesn’t matter, even if it’s duplicate. It only needs to be part of a RooProdPdf. When the likelihood object is constructed, RooFit goes though the full model and finds all the pdfs that are factors of RooProdPdfs that don’t depend on observables. These are extracted from the model, de-duplicated by name, and used to construct the RooConstraintSum that is added to the main likelihood. You can get a feeling for this with this example:

void demo()
{
    using namespace RooFit;

    RooWorkspace ws{"ws"};

    // only the n_bkg_total parameter is shared
    ws.factory("n_bkg_total[2000, 0, 100000]");

    ws.factory("Gaussian::sig_1(x_1[0, 10], mu_1[7, 0, 10], sigma_1[1, 0.1, 10])");
    ws.factory("Uniform::bkg_1(x_1)");
    ws.factory("expr::n_bkg_1('0.4 * n_bkg_total', n_bkg_total)");
    ws.factory("SUM::model_1(n_sig_1[100, 0, 100000] * sig_1, n_bkg_1 * bkg_1)");

    ws.factory("Gaussian::sig_2(x_2[0, 10], mu_2[4, 0, 10], sigma_2[2, 0.1, 10])");
    ws.factory("Uniform::bkg_2(x_2)");
    ws.factory("expr::n_bkg_2('0.6 * n_bkg_total', n_bkg_total)");
    ws.factory("SUM::model_2(n_sig_2[100, 0, 100000] * sig_2, n_bkg_2 * bkg_2)");

    // create the constraint
    ws.factory("Gaussian::n_bkg_total_constr(n_bkg_total_glob[2000], n_bkg_total, 1000)");

    // add the constraints anywhere (you can try adding them also in both channels)
    ws.factory("PROD::model_1_constr(model_1, n_bkg_total_constr)");
    ws.factory("PROD::model_2_constr(model_2)");

    // create the simultaneous pdf
    ws.factory("SIMUL::sim_pdf(channel_cat[A=0,B=1], A=model_1_constr, B=model_2_constr)");

    RooAbsPdf& pdf = *ws.pdf("sim_pdf");
    RooAbsArg& channelCat = *ws.arg("channel_cat");
    RooArgSet observables{*ws.var("x_1"), *ws.var("x_2"), channelCat};

    // create toy data
    std::unique_ptr<RooDataSet> data{pdf.generate(observables)};

    // create nll (don't forget the global observables such that the constraint pdfs get normalized correctly!)
    RooArgSet globs{*ws.var("n_bkg_total_glob")};
    std::unique_ptr<RooAbsReal> nllPtr{pdf.createNLL(*data, GlobalObservables(globs))};

    // inspect the nll (which is actualy a RooAddition). you don't need the cast in PyROOT
    auto& nll = static_cast<RooAddition&>(*nllPtr);

    // if there were constraints, the constraint sum is the last element in the addition
    auto& constraintSum = static_cast<RooConstraintSum&>(nll.list()[1]);

    // print the constraints
    constraintSum.Print();
}

Here is the output of the demo:

[#1] INFO:Minimization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#1] INFO:Minimization --  Including the following constraint terms in minimization: (n_bkg_total_constr)
[#1] INFO:Minimization -- The following global observables have been defined and their values are taken from the model: (n_bkg_total_glob)
RooConstraintSum::nll_sim_pdf_hmaster_constr[ set1=(n_bkg_total_constr) ] = 7.82669

Therefore, your implementation was correct.

Concerting your second question with the upper limit calculation: that depends on the kind of analysis that you are doing. For sure the n_bkg_total belongs to the nuisance parameters. But if the second signal parameter should be set to constant or is a nuisance parameter depends a bit on what you want to measure. That’s something I can’t tell you, maybe ask your analysis team?

I hope this helps, let me know if you have further questions!

Cheers,
Jonas

2 Likes