Problem with 2D simultaneus fit

Dear all,
I am trying to perform the 2-dimensional simultaneous fits to extract the signal yield using the data of the Mbc signal and sideband regions. But, I am getting following error messages:

[#0] WARNING:Minization -- RooMinuit::RooMinuit: removing parameter sample from list because it is not of type RooRealVar

[#1] INFO:Minization -- RooMinuit::optimizeConst: activating const optimization

[#0] ERROR:InputArguments -- RooTreeData::split(data) ERROR category sample doesn't depend on any variable in this dataset

[#0] ERROR:Fitting -- RooAbsTestStatistic::initSimMode(nll_simPdf_data) ERROR: index category of simultaneous pdf is missing in dataset, aborting

Exception: RooAbsTestStatistic::initSimMode() ERROR, index category of simultaneous pdf is missing in dataset, aborting

Any idea how to fix it? Please see below code snippet.
Thanks,
V. Prasad

   #ifdef __CINT__
  gROOT->ProcessLineSync(".L  RooRevArgusBG.cxx+") ;
  #endif

  RooRealVar Dikmass("Dikmass","m_{K^{+}K^{-}} (GeV/c^{2})",0.987354,1.05);
  RooRealVar Omegamass("Omegamass","m_{#pi^{+}#pi^{-}#pi^{0}} (GeV/c^{2})", 0.65, 0.85);
  RooRealVar mBC("mBC","m_{#pi^{+}#pi^{-}#gamma #gamma} (GeV/c^{2})",1.84, 1.8865);

  TFile f2("../D0omegaphitrans_.root");
  TTree* truth = (TTree*) gDirectory->Get("truth") ;

  RooDataSet   hist1("hist1","data",Dikmass,Import(*truth));
  RooKeysPdf   signal("signal","kest1",Dikmass,hist1,RooKeysPdf::MirrorBoth) ;

  RooDataSet   hist2("hist2","data",Omegamass,Import(*truth));
  RooKeysPdf   etasignal("etasignal","kest1",Omegamass,hist2,RooKeysPdf::MirrorBoth) ;

  RooRealVar c0("c0","c0",1.05);
  RooRealVar c1("c1","c1", -3.39594e+01,-40.0,40.0);
  RooRealVar c2("c2","c2",1.61023e+00,-8.0,8.0);
  RooRevArgusBG bkg("bkg","bkg", Dikmass, c0, c1, c2);


  RooRealVar p0("p0","p0",0.8453);
  RooRealVar p1("p1","p1", 8.87520e-01,-100.0,100.0);
  RooRealVar p2("p2","p2",1.99977e+00,-2.0,2.0);
  RooArgusBG back("back","Argus1",Omegamass,p0, p1, p2);

  RooProdPdf fx0("fx0", "final signal pdf", RooArgList(signal,etasignal));
  RooProdPdf fx1("fx1", "final signal pdf", RooArgList(signal,back));
  RooProdPdf fx2("fx2", "final signal pdf", RooArgList(bkg,etasignal));
  RooProdPdf fx3("fx3", "final back pdf",   RooArgList(bkg, back));

  RooRealVar nsig("nsig","number of signal events",0,0,1e6) ;
  RooRealVar npeakKK("npeakKK","number of background events",36.225,0,1e6) ;
  RooRealVar npeaketa("npeaketa","number of background events",16.356,0,1e6) ;
  RooRealVar nbkg("nbkg","number of background events",89,0,1e6) ;

  RooAddPdf sum("sum","sig +back",RooArgList(fx0, fx1, fx2, fx3),RooArgList(nsig, npeakKK, npeaketa, nbkg)) ;

  RooRealVar nsigsb("nsigsb","number of signal events",0,0,1e6) ;
  RooRealVar npeakKKsb("npeakKKsb","number of background events",36.225,0,1e6) ;
  RooRealVar npeaketasb("npeaketasb","number of background events",16.356,0,1e6) ;
  RooRealVar nbkgsb("nbkgsb","number of background events",89,0,1e6) ;

  RooAddPdf sumsb("sumsb","sig +back",RooArgList(fx0, fx1, fx2, fx3),RooArgList(nsigsb, npeakKKsb, npeaketasb, nbkgsb)) ;

  TFile f("../data.root");
  TTree* tree = (TTree*) gDirectory->Get("tree") ;

  RooDataSet data("data","data",RooArgSet(Dikmass, Omegamass, mBC),Import(*tree),Cut("mBC > 1.859 && mBC < 1.871"));

  RooDataSet dataSB("dataSB","data",RooArgSet(Dikmass, Omegamass, mBC),Import(*tree),Cut("mBC < 1.857 || mBC > 1.873"));

  RooCategory sample("sample", "sample");
  sample.defineType("mbcSG");
  sample.defineType("mbcSB");

  RooDataSet combData("combData", "combined data", RooArgSet(Dikmass, Omegamass, mBC), 
 Index(sample), Import("mbcSG", data),
                      Import("mbcSB", dataSB));

  RooSimultaneous simPdf("simPdf", "simultaneous pdf", sample);
  simPdf.addPdf(sum,   "mbcSG");
  simPdf.addPdf(sumsb, "mbcSB");
  simPdf.fitTo(data);
1 Like

I think @StephanH will be able to help you

1 Like

I cannot run this myself, but try to add sample to the variables of the dataset
RooArgSet(Dikmass, Omegamass, mBC).

Dear Stephan,
Thanks for your kind suggestion. I have followed your suggestion. But, I am still getting the same kind of message. Please find my code snippet along with the data files in a tar.gz file attached to this mail.

Your help is highly desirable.

Thanks,
V. Prasad

simu.tar.gz (1.9 MB)

Here is what I change to get it running:

--- MLfit2d.C	2020-07-08 11:51:23.000000000 +0200
+++ test/MLfit2d.C	2020-07-08 15:06:18.000000000 +0200
@@ -67,15 +67,16 @@
   sample.defineType("mbcSG");
   sample.defineType("mbcSB");
 
-  RooDataSet combData("combData", "combined data", RooArgSet(Dikmass, Omegamass, mBC), Index(sample), Import("mbcSG", data),
+  RooDataSet combData("combData", "combined data", RooArgSet(Dikmass, Omegamass, mBC, sample), Index(sample), Import("mbcSG", data),
 		      Import("mbcSB", dataSB));
+  combData.Print("V");
 
   RooSimultaneous simPdf("simPdf", "simultaneous pdf", sample);
 
   simPdf.addPdf(sum,   "mbcSG");
   simPdf.addPdf(sumsb, "mbcSB");
 
-  simPdf.fitTo(data);
+  simPdf.fitTo(combData);
 
 
 }

Dear Stephan,
Thanks for your quick reply and help. Your this change works for me. I have one more question.

In conventional 2D fit for 3 variables (x, y, z), I usually extract the signal yields by performing the 2D fit to the y and z distributions in both x signal and sideband regions. Here we compute the background fraction f (= ratio of backgrounds in x signal and sideband regions) in x signal region by performing a 1D fit to the x distribution. The final signal yield is computed as,

Nsig = Nsig_{x signal-region} - f*Nsig_{x sideband region}

where Nsig_{x signal/sideband-region} is the number of signal yield obtained from a 2D in both x signal/sideband regions.

Now I want to do the same thing by performing a 2D simultaneous fit using the data of both x signal and sideband regions. Do I need to take care of background fraction f in this case too? If yes, how to do the same. Kindly clarify me.

Thanks,
V. Prasad

Do I understand correctly that you want to slice the fit space in x, and fit two or three regions in (y, z) simultaneously? You can do that by creating a simultaneous PDF with these three regions.
You can use the parameter f in all regions you want. Just create a RooProduct of any scale factors you need, or create a RooFormulaVar if the formulae are more complicated, e.g. (NSig + (1-f) * NBkg) etc.

Then, use those as scale factors in the RooAddPdf. You can have any scale factors you want for any region, but they can also share the parameter f, for example.

Note that the RooProdPdf is for cases where you want to multiply PDFs. It’s not for multiplicative scale factors.

Dear Stephan,
Thanks for your clarification. Yes, your understanding is correct. I will compute the final Nsig according to your suggested way.

Thanks,
V. Prasad

Hello Stephan,
I have used the following code snippet to produce the projection and pull plots of a particular variable in the simultaneous fit. Please find the corresponding plots from the attachment. While looking at the projection plot, it hard to believe that the pull distribution is correct. Am I doing something wrong? Kindly clarify me.

Thanks,
V. Prasad

  RooPlot* frame1 = Dikmass.frame(Bins(100)) ;
  frame1->SetName("SigPDF");
  combData.plotOn(frame1,Cut("sample==sample::mbcSG")) ;
  simPdf.plotOn(frame1,Slice(sample,"mbcSG"),ProjWData(sample,combData)) ;
simPdf.plotOn(frame1,Slice(sample,"mbcSG"),Components("fx0"),ProjWData(sample,combData),LineStyle(kDashed),LineColor(3)) ;
simPdf.plotOn(frame1,Slice(sample,"mbcSG"),Components("fx1"),ProjWData(sample,combData),LineStyle(kDashed),LineColor(2)) ;
  simPdf.plotOn(frame1,Slice(sample,"mbcSG"),Components("fx2"),ProjWData(sample,combData),LineStyle(kDashed),LineColor(4)) ;
  simPdf.plotOn(frame1,Slice(sample,"mbcSG"),Components("fx3"),ProjWData(sample,combData),LineStyle(kDashed),LineColor(7)) ;
  frame1->Draw() ;

  RooHist* resdkk = frame1->pullHist();
  RooPlot* frame1_ = Dikmass.frame(Title("Pull Distribution")) ;
  frame1_->SetName("resdKK");
  frame1_->addPlotable(resdkk,"P") ;

mkkdist.pdf (26.6 KB) pull.pdf (20.1 KB)

The fit looks ok, but the pulls are not pulls. The distribution looks exactly like data.

Here is an example how to create pulls:
https://root.cern.ch/doc/master/rf109__chi2residpull_8C.html

Yes, I have used the same method to create the pull, as can be seen below.

RooHist* resdkk = frame1->pullHist();
RooPlot* frame1_ = Dikmass.frame(Title(“Pull Distribution”)) ;
frame1_->SetName(“resdKK”);
frame1_->addPlotable(resdkk,“P”) ;

It seems that this code computes the fit residual as combData - fit. I have noticed that the fit does not describe well the data unless I require the following condition:

combData.plotOn(frame1,Cut(“sample==sample::mbcSG”)) ;

So I need to apply a similar cut before creating the pull. But, I don’t find any example for the same.

Thanks,
V. Prasad

1 Like

Yes, you have to cut, because otherwise it’s not clear which of the categories in the simultaneous PDF should be drawn. Can you confirm that the residual histogram does not work correctly when you first apply the cut to the data, then draw data and fit curves, and then create the pull histogram from the frame where you drew data and fit curves?

The pull and residual histograms are created by finding the first curve and the first dataset that have been drawn on the frame, unless you pass the names of the curves explicitly.
https://root.cern.ch/doc/master/classRooPlot.html#a0d3911db1f4e0ba5cfcf9e182eea3ed0

Thanks for your suggestion. The problem is solved after making the following changes:

RooPlot* frame1 = Dikmass.frame(Bins(100)) ;
combData.plotOn(frame1,Cut(“sample==sample::mbcSG”),Name(“datakksig”)) ;

simPdf.plotOn(frame1,Slice(sample,“mbcSG”),ProjWData(sample,combData),Name(“pdfkksig”)) ;

RooHist* resdkk = frame1->pullHist(“datakksig”,“pdfkksig”);
RooPlot* frame1_ = Dikmass.frame(Title(“Pull Distribution”)) ;
frame1_->SetName(“resdKK”);
frame1_->addPlotable(resdkk,“P”) ;

1 Like

:+1: !

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.