Extended pdf in RooStat

Dear experts,

Hello, I ask a small thing related to RooStat. I am new at RooStat. Therefore, I recently started to search materials to learn it. In RooStat tutorial twiki page. I found

myWS.factory("x[0,1]"); // arbitrary range [0,1] 
myWS.var("x")->setBins(1);

// Signal and background distribution of the observable
myWS.factory("Uniform::sigPdf(x)");
myWS.factory("Uniform::bkgPdf(x)");

// S+B and B-only models (both extended PDFs)
myWS.factory("SUM::model(S*sigPdf,B*bkgPdf");
myWS.factory("ExtendPdf::modelBkg(bkgPdf,B)");

It said SUM::model is an extended PDF. I think it is pretty strange. Because an extended PDF is (Poisson) * (original pdf), where (Poisson) is

(Poisson) = (e^(-\mu))*\mu^N/N!

I cannot find (Poisson) term in SUM::model. Does RooStat automatically convert that shape of PDF into extended pdf?

Thank you.

Hi @june0812,

welcome to the ROOT forum!

First of all, in your snippet you forgot to define also the S and B parameters, and in the line with SUM there is a closing bracket missing in the factory expression.

Your corrected code snippet would look like this:

    RooWorkspace myWS{"myWS"};

    myWS.factory("x[0,1]"); // arbitrary range [0,1] 
    myWS.var("x")->setBins(1);

    myWS.factory("S[1000, 0, 10000]");
    myWS.factory("B[2000, 0, 10000]");

    // Signal and background distribution of the observable
    myWS.factory("Uniform::sigPdf(x)");
    myWS.factory("Uniform::bkgPdf(x)");

    // S+B and B-only models (both extended PDFs)
    myWS.factory("SUM::model(S*sigPdf,B*bkgPdf)");
    myWS.factory("ExtendPdf::modelBkg(bkgPdf,B)");

    myWS.Print();

    myWS.pdf("model");

Under the hood, the SUM operation creates a RooAddPdf, which supports extended likelihoods via the RooAbsPdf::expectedEvents() and RooAbsPdf::extendedTerm() interfaces (the latter returns the Poisson term that you were looking for).

You don’t need to worry about this, it will be used under the hood by the RooAbsPdf::fitTo() routine if you pass the RooFit::Expected() command argument.

    // get the model and set of observables from the workspace
    auto * model = myWS.pdf("model");
    RooArgSet obsSet(*myWS.var("x"));

    // check if the pdf can be extended
    std::cout << model->canBeExtended() << std::endl;

    // get the number of expected events
    // (should be 3000 with the values I used for N and S)
    std::cout << model->expectedEvents(&obsSet) << std::endl;

    // Evaluate Poisson term for the NLL given observed number of events
    int nObs = 3123;
    std::cout << model->extendedTerm(nObs, &obsSet) << std::endl;

I hope this helps, and don’t hesitate to ask further questions!
Jonas

1 Like

By the way, it looks like you want to create a binned counting experiment model here. For a single bin this is indeed possible to formulate manually in the workspace, but for bigger models you could look at HistFactory, which is extensively used in ATLAS for example. It’s an abstraction build on top of RooFit/RooStats to build a RooWorkspace for a binned model from a simple model specification.

Maybe a good point to get started with HistFactory would be this tutorial:
https://root.cern.ch/doc/master/hf001__example_8C.html

But maybe your model is simple enough that you can do without it and use pure RooFit.

Jonas

1 Like

Thanks for the reply. I have an additional question about it. When hypotestcalculator use CLs method, it should calculate profile likelihood ratio. In that likelihood, is Poisson term included?

Actually, I wanted to make unbinned analysis. Here is core part of my code:

(... read root file and do something ...)

    RooWorkspace* w = new RooWorkspace("w", "workspace");

    // obs
    w->factory("Rarity[0,1]");

    // pdfs
    w->factory("EXPR::bkg('(1-Rarity)*exp(Rarity*a+Rarity*Rarity*b)',Rarity,a[-1.883],b[1.09])");
    w->factory("Polynomial::sig(Rarity)");

    // systematics
    w->factory("Gaussian::add_sub(0,alpha[-10.0,10.0],1.39)");
    w->factory("Gaussian::mul_sub(0,beta[-0.5,0.5],0.0555)");

    w->factory("sig_eff[0.0014]");
    w->factory("NBB[330000000]");
    w->factory("sigBR[0.00008, 0.0, 0.0005]");
    w->factory("prod::sig_yield(2,NBB,sig_eff,sigBR)");
    w->factory("expr::sig_yield_with_unc('sig_yield*(1+beta)+alpha', sig_yield, alpha, beta)");
    w->factory("bkg_yield[1900, 1700, 2100]");
    w->factory("SUM::model(sig_yield*sig, bkg_yield*bkg)");
    w->factory("PROD::model_with_unc(model,add_sub,mul_sub)");

    RooStats::ModelConfig mc("ModelConfig", w);
    mc.SetPdf(*w->pdf("model_with_unc"));
    mc.SetParametersOfInterest(*w->var("sigBR"));
    mc.SetNuisanceParameters(RooArgSet(*w->var("alpha"), *w->var("beta"), *w->var("bkg_yield")));

    w->import(*d_Rarity, RooFit::Rename("observed_data"));
    w->import(mc);
    w->Print();
    w->writeToFile("CLs_test.root");

, where alpha and beta are additive and multiplicative systematic uncertainties, respectively. Signal pdf is flat function (0th order polynomial). Background pdf is slightly complicated function ((1-Rarity)*exp(Rarity*a+Rarity*Rarity*b). When I use hypotestcalculator, I did

(... read workspace...)

	RooStats::ModelConfig* sbModel = (RooStats::ModelConfig*)w->obj("ModelConfig");
	RooStats::ModelConfig* bModel = (RooStats::ModelConfig*)sbModel->Clone("BonlyModel");
	RooRealVar* poi = (RooRealVar*)bModel->GetParametersOfInterest()->first();
	poi->setVal(0);
	bModel->SetSnapshot(*poi);

	RooStats::FrequentistCalculator hybCalc(*data, *bModel, *sbModel);
	RooStats::ProfileLikelihoodTestStat* plr = new RooStats::ProfileLikelihoodTestStat(*sbModel->GetPdf());
	plr->SetOneSided(true);

	RooStats::ToyMCSampler* toymcs = (RooStats::ToyMCSampler*)hybCalc.GetTestStatSampler();
	toymcs->SetTestStatistic(plr);
	hybCalc.SetToys(1000, 1000);

(... do something...)

Then, I can obtain upper limit from CLs method. Does profile likelihood, which is used for CLs method, include Poisson term? ((e^(-\mu))*\mu^N/N!)

Thank you!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.