Using RooStats to compute upper limit

nastein · April 22, 2021, 6:44pm

Hello, I’m trying to use RooFit and RooStats to compute the upper limit on a cross section of a new physics process with the hypotestinverter.

To start with I am using ROOT 6.25/01

I have read many of the tutorials and a lot of forum posts on this but am still slightly confused. The start of my analysis begins with simulated histograms of my observable (here it is an invariant mass, M_inv) for both my background and my signal processes. They are created as below

   //Get Histograms of signal and background distributions
   TH1D *s500_hist = (TH1D*) s500->Get("M_{#gamma#gamma}35.000000");
   TH1D *LBL500_hist = (TH1D*) LBL500->Get("M_{#gamma#gamma}bkgd35.000000");

   RooDataHist s500_dh("minv", "minv", M_inv, Import(*s500_hist));
   RooDataHist LBL500_dh("minv", "minv", M_inv, Import(*LBL500_hist));

   RooCategory sample("sample","");
   sample.defineType("signal");
   sample.defineType("bkgd");

   //two datahists, one containing sig + bkgd, one containing bkgd only 
   RooDataHist combData("combData", "combined data", M_inv, Index(sample), Import("signal",s500_dh), Import("bkgd",LBL500_dh));
   RooDataHist combData2("combData2", "combined data2", M_inv, Index(sample), Import("bkgd",LBL500_dh));

I create background and signal pdfs and then create a s+b modelconfig as well as a b_only modelconfig.

//Create bgkd model config
   ModelConfig bmodel("bmodel", wks);
   bmodel.SetPdf(*wks->pdf("model"));
   bmodel.SetObservables(obs);
   bmodel.SetParametersOfInterest(poi);
   wks->var("xsec")->setVal(0.0);
   //bmodel.SetGlobalObservables(global_obs);
   //bmodel.SetNuisanceParameters(nuis);
   bmodel.SetSnapshot(poi);

   //Create sig + bkgd model config
   ModelConfig sbmodel("sbmodel", wks);
   sbmodel.SetPdf(*wks->pdf("model"));
   sbmodel.SetObservables(obs);
   sbmodel.SetParametersOfInterest(poi);
   wks->var("xsec")->setVal(.2);// Does this value matter?
   //bmodel.SetGlobalObservables(global_obs);
   //bmodel.SetNuisanceParameters(nuis);
   sbmodel.SetSnapshot(poi);

Now when I want to go and set an upper limit I first need to create a calculator (either Asymptotic or Frequentist). I input my s+b model as my null, and my b model as my alt. Exactly what data should I input for the data here (as below)?

FrequentistCalculator fc(data, bmodel, sbmodel);

Should it be my simulated background? my simulated signal? signal + background? Or maybe I’m supposed to be generating Asimov data and using that data here? Does it depend on which calculator I want?

I’ve attached my analysis code, it doesn’t contain the root files but those can be easily included. Any help would be much appreciated. Thank you!

moneta · April 23, 2021, 1:11pm

Hi,

If you want to compute an upper limit using the HypoTestInverter and for example the CLs procedure, the best way is to create a Workspace and then use the macro StandardHypoTestInvDemo.C present in the roostats tutorials, `tutorials/roostats/StandardHypoTestInvDemo.C. See also ROOT: tutorials/roostats/StandardHypoTestInvDemo.C File Reference
More info in the slide that I attach below.

If you want to do the procedure yourself, you need to create a calculator (e.g. Asymptotic or Frequentist) and pass in order the data set, the alternative model (that for limit calculation is the B model) and the null model (the S+B model for limit).
You need to use the observed data set, the RooStats tool will compute also the expected limit genereting an Asymov data set from your model.
See the tutorial above how the expected limit is computed.

Cheers

Lorenzo

nastein · April 23, 2021, 1:15pm

@moneta,

Thank you for your reply Lorenzo,

This slide I have seen before and I understand the procedure for constructing my own calculator and feeding it into a HypoTestInverter. What I am asking is, what is the “observed data set” here? Is it the background only “data” from my simulation? Or should I generate binned data from the background pdf after first fitting? This is the part that is unclear to me.

moneta · April 23, 2021, 2:39pm

Hi,

If you want to compute a limit on real data, is the data collected by your experiment, where you want to estimate your limit. Otherwise it could also be a simulated data used to test your procedure

Lorenzo

system · May 7, 2021, 2:40pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.