ToyMC for Simultaneous vs Non-simultaneous fit

Hi all,

So I have been trying to implement MCToys to do fit validation on two different fitting regimes, one where we do a ‘global fit’ and another where we do a simultaneous fit on the eta, segregated into eta bins.

I’m coming across a problem in PyROOT where I can’t seem to have two ROOT.RooMCStudy objects in the same script, for one simultaneous and one non-simultaneous fit on the same independent variable. I have been able to do this for two regular (non-sim.) fits so I think the problem lies in the implementation of this.

Here are some relevant code snippets

# ... defining global fit pdf (omitted)
pdf_GF = w.pdf('model_GF')

# .... defining simultaneous fit pdfs (omitted)
w.factory("SUM:modelR1(Sig_yieldR1[{},0,{}]*signalR1, Bck_yieldR1[{},0,{}]*bkg_pdf2)".format(n_sigR1,n_eventsR1,n_bkgR1,n_eventsR1))
w.factory("SUM:modelR2(Sig_yieldR2[{},0,{}]*signalR2, Bck_yieldR2[{},0,{}]*bkg_pdf2)".format(n_sigR2,n_eventsR2,n_bkgR2,n_eventsR2))
w.factory("SUM:modelR3(Sig_yieldR3[{},0,{}]*signalR3, Bck_yieldR3[{},0,{}]*bkg_pdf2)".format(n_sigR3,n_eventsR3,n_bkgR3,n_eventsR3))

sample = ROOT.RooCategory("sample", "sample")
sample.defineType("R1")
sample.defineType("R2")
sample.defineType("R3")

modelR1 = w.pdf('modelR1')
modelR2 = w.pdf('modelR2')
modelR3 = w.pdf('modelR3')

simPdf = ROOT.RooSimultaneous("simPdf", "simultaneous pdf", sample)
simPdf.addPdf(modelR1, "R1")
simPdf.addPdf(modelR2, "R2")
simPdf.addPdf(modelR3, "R3")

B_mass = w.arg('B_mass')

mcstudy_Global = ROOT.RooMCStudy(pdf_GF, ROOT.RooArgSet(B_mass), ROOT.RooFit.Silence(), ROOT.RooFit.FitOptions(ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.PrintEvalErrors(0), ROOT.RooFit.Extended(ROOT.kTRUE) ,ROOT.RooFit.NumCPU(100)))
mcstudy_ERF    = ROOT.RooMCStudy(simPdf, ROOT.RooArgSet(B_mass, sample), ROOT.RooFit.Silence(), ROOT.RooFit.FitOptions(ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.PrintEvalErrors(0), ROOT.RooFit.Extended(ROOT.kTRUE) ,ROOT.RooFit.NumCPU(100)))

And I get a crash with the last error message of

  • RooVectorDataStore::addColumn(RooAbsArg&, Bool_t): Assertion `numEvt != 0’ failed.
    *** Break *** abort

Can anyone provide any insight into this or how I can go about implementing this without having to do the toys manually? To add the toys work individually, I just can’t run them in the same script.

Many thanks,
Nathan

Dear @nheatley ,

I believe @jonas should be able to help.

Cheers,
Vincenzo

Hi @nheatley, welcome to the ROOT forum!

I just tried out the ToyMCStudy with both an extended RooGaussian and also the same pdf wrapped in a RooSimultaneous, and I don’t see the problem. Here is my standalone test.

using namespace RooFit;

RooWorkspace ws;
// The single-channel pdf
ws.factory("Gaussian::rawPdf(x[0, 0, 10], mu[5.0, 0, 10], sigma[2.0, 0.1, 10])");
// Make it extended such that the RooSimultaneous knows how many events to
// create in each channel
ws.factory("ExtendPdf::extPdf(rawPdf, n[10000, 1000, 1000000])");
// Create the RooSimultaneous
ws.factory("SIMUL::simPdf( cat[A=0], A=extPdf)");

auto &x = *ws.var("x");
auto &rawPdf = *ws.pdf("rawPdf");
auto &extPdf = *ws.pdf("extPdf");
auto &simPdf = *ws.pdf("simPdf");
auto &cat = *ws.cat("cat");

// For the single-channel extended pdf
RooMCStudy mcstudy1{extPdf,        {x},       Extended(true),
                    Binned(false), Silence(), FitOptions(Save(true), PrintEvalErrors(0), PrintLevel(-1))};
mcstudy1.generateAndFit(10);

// For the simulatneous pdf
RooMCStudy mcstudy2{simPdf,        {x, cat},  Extended(true),
                    Binned(false), Silence(), FitOptions(Save(true), PrintEvalErrors(0), PrintLevel(-1))};
mcstudy2.generateAndFit(10);

// Both works for me (I tried ROOT 6.28.04)

So to understand your problem, we need more information.

  • Which ROOT version are you using?
  • You told us that the last error message was. But what is the full log?

Cheers,
Jonas

Hi Jonas,

Switching to version 6.28 for ROOT seems to have done the trick, thanks! :slight_smile:

I have another question if you could help? I want to be able to fit the same generated data as simultaneous pdf and a single-channel pdf. So instead of doing

mcstudy_Global = ROOT.RooMCStudy(pdf_GF, ROOT.RooArgSet(B_mass), ROOT.RooFit.Silence(), ROOT.RooFit.FitOptions(ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.PrintEvalErrors(0), ROOT.RooFit.Extended(ROOT.kTRUE) ,ROOT.RooFit.NumCPU(100)))
mcstudy_ERF    = ROOT.RooMCStudy(simPdf, ROOT.RooArgSet(B_mass, sample), ROOT.RooFit.Silence(), ROOT.RooFit.FitOptions(ROOT.RooFit.Save(ROOT.kTRUE), ROOT.RooFit.PrintEvalErrors(0), ROOT.RooFit.Extended(ROOT.kTRUE) ,ROOT.RooFit.NumCPU(100)))
mcstudy_ERF.generate(nToys, n_events, ROOT.kTRUE)
mcstudy_Global.generateAndFit(nToys, n_events, ROOT.kTRUE) 	

I want to do something like

mcstudy_ERF.generate(nToys, n_events, ROOT.kTRUE)
mcstudy_ERF.fit(nToys)
mcstudy_Global.fit(nToys, *Import same data to be fit on as sim. fit*,ROOT.kTRUE) 

Any clue how I would go about doing this? I’m not sure how to fit the different MC studies on the same generated dataset and I feel there will be issues with trying to fit data that has been generated in categories as a single channel fit.

One idea is to wrap my non-sim. fit in a simultaneous wrapper that doesn’t affect the fit itself but I’m not sure if this would affect the results.

Thanks for the speedy responses thus far :slight_smile:
Cheers,
Nathan

You’re almost doing it right there, the annoying part is that you need to clone the datasets because RooMCStudy::fit() takes ownership of them. I have extended my previous example to show how this would work:

void demo2()
{
   using namespace RooFit;

   RooWorkspace ws;
   // The single-channel pdf
   ws.factory("Gaussian::rawPdf(x[0, 0, 10], mu[5.0, 0, 10], sigma[2.0, 0.1, 10])");
   // Make it extended such that the RooSimultaneous knows how many events to
   // create in each channel
   ws.factory("ExtendPdf::extPdf(rawPdf, n[10000, 1000, 1000000])");
   // Create the RooSimultaneous
   ws.factory("SIMUL::simPdf( cat[A=0], A=extPdf)");

   auto &x = *ws.var("x");
   auto &rawPdf = *ws.pdf("rawPdf");
   auto &extPdf = *ws.pdf("extPdf");
   auto &simPdf = *ws.pdf("simPdf");
   auto &cat = *ws.cat("cat");

   int nSamples = 10;

   // For the simulatneous pdf
   RooMCStudy mcstudy2{simPdf,         {x, cat},
                       Extended(true), Binned(false),
                       Silence(),      FitOptions(Save(true), PrintEvalErrors(0), PrintLevel(-1), Optimize(1))};
   mcstudy2.generate(nSamples, 0, true);

   // Get the generated datasets into TLists
   TList toyDatas1;
   TList toyDatas2;
   for (std::size_t i = 0; i < nSamples; ++i) {
      // Need to clone the datasets twice, because RooMCStudy::fit() will take ownership of the data and we'll call this
      // function twice.
      auto data1 = static_cast<RooAbsData *>(mcstudy2.genData(i)->Clone());
      auto data2 = static_cast<RooAbsData *>(mcstudy2.genData(i)->Clone());
      toyDatas1.Add(data1);
      toyDatas2.Add(data2);
   }

   mcstudy2.fit(nSamples, toyDatas2);

   // For the single-channel extended pdf
   RooMCStudy mcstudy1{extPdf,         {x},
                       Extended(true), Binned(false),
                       Silence(),      FitOptions(Save(true), PrintEvalErrors(0), PrintLevel(-1), Optimize(1))};
   mcstudy1.fit(nSamples, toyDatas1);
}

In Python, you don’t need these static_casts of course.

Let me know if you have any further questions!
Jonas

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.