Roofit Simultaneous Template fit on two different variables

Dear all,

what I want to do is a template fit of a signal (S) from two background (B1 and B2). In order to discriminate the two background from the signal, I must use two different variables (x, y). So I have 6 RooDataHist (x-S, x-B1, x-B2, y-S, y-B1 and y-B2), and I want to perform a simultaneous template fit with the same normalization factors for the distributions on x and y variables.

I think I must use RooSimultaneous as described in rf501_simultaneouspdf.C tutorial. However, I am confused by the syntax, since all examples I found start from a RooDataSet instead of a RooDataHist. I guess this is a stupid or ill-posed question, but I am not able to find a solution.

Could someone help me?

Hi @kormoranos,

thanks for asking!

I guess what confuses you is how the RooHistPdfs for the template need to be created. Indeed, they are created from RooDataHists based on your histogram shapes, but these are not the actual datasets for fitting. Just for creating the templates.

For fitting, you can as usual pass any RooDataSet or RooDataHist to RooAbsPdf::fitTo(), depending on your needs.

Here is how I would implement such a fit (I create the underlying histograms for the templates on the fly, sampling the histograms from some functions):

// templateFitExample.C

const double xmin = 0.;
const double xmax = 10.;
const int nbins = 50;

struct TemplateInfo {
  std::unique_ptr<TH1> histo;
  std::unique_ptr<RooDataHist> dataHist;
  std::unique_ptr<RooHistPdf> histPdf;
};

// Function to quickly fill a RooHistPdf template that is based on a histogram
// sampled from a function with "formulaName". It does all the needed steps and
// returns you all temporary objects too for inspection:
//
// * The TH1 for the underlying shape
// * A RooDataHist that was created from the TH1 (not used for fitting, only to
//   specify the template!)
// * A RooHistPdf based on the RooDataHist that is a binned template PDF
TemplateInfo createTemplate(RooRealVar const& x,
                            const char* name,
                            const char* formulaName,
                            int nSamples) {
  TemplateInfo out;
  out.histo = std::make_unique<TH1D>(name, name, nbins, xmin, xmax);
  out.histo->FillRandom(formulaName, nSamples);
  out.dataHist =
      std::make_unique<RooDataHist>(name, name, x, out.histo.get());
  out.histPdf =
      std::make_unique<RooHistPdf>(name, name, x, *out.dataHist);

  return out;
}

void templateFitExample() {
  using namespace RooFit;

  // Specify the shapes to sample the templates
  TF1 f0("f0", "gaus", xmin, xmax);
  f0.SetParameters(1.0, 5, 2.0);
  TF1 f1("f1", "gaus", xmin, xmax);
  f1.SetParameters(1.0, 5, 1.0);
  TF1 f2("f2", "1", xmin, xmax);
  TF1 f3("f3", "expo", xmin, xmax);
  f3.SetParameters(1.0, -0.2);

  // Specify the model in RooFit and generate some toy data
  RooRealVar x{"x", "x", xmin, xmax};

  // How many toys to sample for creating the templates
  int nSamples = 1000000;

  auto templ1Sig = createTemplate(x, "chan1_sig", "f0", nSamples);
  auto templ1Bg1 = createTemplate(x, "chan1_bg1", "f2", nSamples);
  auto templ1Bg2 = createTemplate(x, "chan1_bg2", "f3", nSamples);

  auto templ2Sig = createTemplate(x, "chan2_sig", "f1", nSamples);
  auto templ2Bg1 = createTemplate(x, "chan2_bg1", "f2", nSamples);
  auto templ2Bg2 = createTemplate(x, "chan2_bg2", "f3", nSamples);

  RooRealVar nSig{"n1_sig", "n1_sig", 10000, 0, 100000};
  RooRealVar nBg1{"n1_bg1", "n1_bg1", 10000, 0, 100000};
  RooRealVar nBg2{"n1_bg2", "n1_bg2", 10000, 0, 100000};

  RooAddPdf model1{
      "model1",
      "model1",
      {*templ1Sig.histPdf, *templ1Bg1.histPdf, *templ1Bg2.histPdf},
      {nSig, nBg1, nBg2}};

  RooAddPdf model2{
      "model2",
      "model2",
      {*templ2Sig.histPdf, *templ2Bg1.histPdf, *templ2Bg2.histPdf},
      {nSig, nBg1, nBg2}};

  RooCategory category("cat", "cat");
  category.defineType("cat1");
  category.defineType("cat2");

  RooSimultaneous model{"model", "model", category};
  model.addPdf(model1, "cat1");
  model.addPdf(model2, "cat2");

  std::unique_ptr<RooDataSet> data1{model1.generate(x)};
  std::unique_ptr<RooDataSet> data2{model2.generate(x)};
  data1->SetName("data1");
  data2->SetName("data2");

  // A bit annoying, but we have to create a weight variable ourselves to
  // transfer the weights from the component datsets
  RooRealVar w("weight", "weight", 1.0);
  RooDataSet data{
      "data",
      "data",
      {x, w},
      Index(category),
      Import({{"cat1", data1.get()}, {"cat2", data2.get()}}),
      WeightVar("weight")};

  data1->Print();
  data2->Print();
  data.Print();

  // Fit!
  std::unique_ptr<RooFitResult> res{
      model.fitTo(data, Save(), PrintLevel(-1))};
  res->Print();

  // Do some plotting
  auto c1 = new TCanvas("c1", "c1", 800, 400);
  c1->Divide(2);

  c1->cd(1);
  auto frame1 = x.frame(Title("channel 1"));
  data1->plotOn(frame1);
  model1.plotOn(frame1);
  frame1->Draw();

  c1->cd(2);
  auto frame2 = x.frame(Title("channel 2"));
  data2->plotOn(frame2);
  model2.plotOn(frame2);
  frame2->Draw();

  c1->Draw();
}

If you change now the TH1s are created (reading from files instead of creating on the fly), you have your analysis I think.

Have a nice weekend!

Cheers,
Jonas

Hi @jonas , thank you very much for the detailed reply! I will try to implement you solution in the next days and let you know if I succeed in my intent.

Hi @jonas,

I tried to adapt your code to my case.

However, I am confused by the fact that in your case you use two different distributions but both on the “x” variable, whereas in my case I have one on “x” and one on “y” variables. I did something like that:

[…]
std::unique_ptr data1{model1.generate(x)};
std::unique_ptr data2{model2.generate(y)};
[…]

[…]
RooDataSet data{
“data”,
“data”,
{x, y, w},
Index(category),
Import({{“cat1”, data1.get()}, {“cat2”, data2.get()}}),
WeightVar(“weight”)};
[…]

The fit is executed, but the result is not reasonable. So I am wondering if my understanding is completely wrong. I am sorry if this is a stupid question.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.