Pdf doesn't normalize to number of events in dataset when drawing plots

Dear experts,
I am trying to fit a RooExtendPdf to a dataset with fitTo. After the fit, I draw the dataset and the pdf on the same frame, but the pdf does not seem to be normalized correctly to the number of events in the dataset. Attached plot: angular_reco_cosl_bin3_Index_1.pdf (26.9 KB)

The fit results seem reasonable, so can you tell me what causes this problem?

The following are some segments of my code:

RooRealVar Q2("Q2","q^{2}",1.0,22.);
RooRealVar fh("fh", "F_{H}", Ifh, 0., 3. );
RooRealVar afb("afb", "A_{FB}", Iafb, -1.5, 1.5);
RooRealVar nsig("nsig","nsig",1E6,1E1,1E9);
TFile *KDEfile = new TFile( "./KDE/KDEeff.root","READ" );
RooWorkspace* ws_eff = (RooWorkspace*)KDEfile->Get( Form("ws_eff_bin%i",iBin) );
RooRealVar* ctL = (RooRealVar*)ws_eff->var("ctL");
RooGenericPdf* KDEeff = (RooGenericPdf*)ws_eff->pdf( Form("KDEtotaleff_bin%i",iBin) );
RooArgSet f_ang_argset(*ctL);
f_ang_argset.add(RooArgSet(fh,afb));	
TString f_ang_format;
f_ang_format = "( 0.75*(1-fh)*(1-ctL*ctL) + 0.5*fh + afb*ctL )";
RooGenericPdf* f_ang = new RooGenericPdf("f_ang","angular pdf", f_ang_format,f_ang_argset);
RooGenericPdf* f_sig = new RooGenericPdf("f_sig","signal pdf","@0*@1",RooArgList(*KDEeff,*f_ang));
RooExtendPdf  f("f","", *f_sig, nsig);
RooDataSet *data = new RooDataSet("data","data",RooArgSet(*ctL,Q2));  
double q2=0;
double cosTheta=0;
int n_ch=0;
ch->SetBranchAddress("Q2",&q2);
ch->SetBranchAddress("CosThetaL",&cosTheta);
n_ch=ch->GetEntries();
double q2Low[13] = {1.00, 2.00, 4.30, 8.68, 10.09, 12.86, 14.18, 16.00, 18.00, 1.00, 1.00, 10.09, 14.18};
double q2High[13] = {2.00, 4.30, 8.68, 10.09, 12.86, 14.18, 16.00, 18.00, 22.00, 6.00, 8.68, 12.86, 22};
for(int evt=0; evt<n_ch; evt++){
  ch->GetEntry(evt);
  if( (q2<q2Low[iBin])||(q2>q2High[iBin]) )continue;
  ctL->setVal(cosTheta);
  Q2.setVal(q2);
  data->add(RooArgSet(*ctL,Q2));
}
double ne = data->sumEntries();
cout<<"number of processing entries: "<<ne<<endl;
RooFitResult *f_fitresult = f.fitTo(*data,Extended(kTRUE),Save(kTRUE),Minimizer("Minuit"),Warnings(-1), PrintEvalErrors(-1));
TCanvas* c = new TCanvas("c");	
RooPlot* framecosl = ctL->frame(); 
data->plotOn(framecosl,Binning(100)); 
f.plotOn(framecosl); 
framecosl->SetTitle("");
framecosl->SetMinimum(0);
framecosl->SetTitleOffset(1.1,"Y");
framecosl->SetMaximum(framecosl->GetMaximum() * 1.25);
framecosl->Draw();
c->Print("./plots/cosl.pdf");

The “KDEeff” above is a RooGenericPdf whose functional form is “[0]*[1]/[2]/[3]”, where [0], [1], [2] and [3] are four RooKeysPdfs.

Thank you

Hi @jcq,

is this a multi-dimensional PDF that has to be projected onto the 1-D coordinate system? I could imagine that the projection integrals run over more/less than one would assume.

Yes, I found it.

  • What is the range set for Q2? For projecting onto the cosThetaL axis, Q2 has to be integrated.
  • Does RooFit say something about the plotting? There are some cases that cannot be handled automatically.

Further, you can try one of the following:

  • ProjectionRange. Maybe you need to set a range for Q2
  • ProjWData. Instead of integrating, evaluate the PDF at each point in the dataset.

See here for those two approaches:
https://root.cern.ch/doc/master/classRooAbsPdf.html#a7f01ccfb4f3bc3ad3b2bc1e3f30af535
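
To illustrate the difference between the two options, here is a minimal standalone sketch in plain C++ (no RooFit). The conditional model `condPdf`, its slope `0.04 * q`, and the helper names `projectByIntegration` and `projectWithData` are all assumptions made up for this illustration; they are not the pdf from this thread and not RooFit API. The point is only that a projection onto x depends on which q range is integrated over, or on which q values are present in the data.

```cpp
#include <vector>

// Toy conditional model (an assumption for illustration only):
// p(x | q) = 0.5 * (1 + a(q) * x) on x in [-1, 1], with a(q) = 0.04 * q.
double slope(double q) { return 0.04 * q; }
double condPdf(double x, double q) { return 0.5 * (1.0 + slope(q) * x); }

// Projection by integration: average p(x | q) over a flat q range
// [qLo, qHi] with the midpoint rule (the idea behind a projection range).
double projectByIntegration(double x, double qLo, double qHi, int nSteps = 1000) {
    const double dq = (qHi - qLo) / nSteps;
    double sum = 0.0;
    for (int i = 0; i < nSteps; ++i) {
        sum += condPdf(x, qLo + (i + 0.5) * dq);
    }
    return sum / nSteps;
}

// Projection with data: average p(x | q_i) over the q values that are
// actually in the dataset (the idea behind projecting with data).
double projectWithData(double x, const std::vector<double>& qValues) {
    double sum = 0.0;
    for (double q : qValues) {
        sum += condPdf(x, q);
    }
    return sum / static_cast<double>(qValues.size());
}
```

In this toy, at x = 0.5 the projection integrated over the full range [1, 22] is 0.615, while integrating over [2, 4] or averaging over data with q = 2, 3, 4 both give 0.53. A curve projected over the wrong range therefore does not sit on the data, even if the fit itself is fine.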

Hi StephanH,

Thank you for your reply. As you mention, it should be a 1D pdf fitted to a 1D dataset. I have now corrected the dataset to be one-dimensional and am redoing the fit. The fit is time-consuming, so I am still waiting for the new plot.

Thank you

Hi StephanH,

  • The q2Low and q2High arrays are the ranges used to select data, so that the fit is performed separately in different physics regions.

  • RooFit didn’t say anything about the plotting.

Q2 should not have been added to the dataset; that was my mistake.

By the way, I run the fit with random initial values for afb and fh, and I repeat it 50 times. Because of the “KDEeff” component of the fit function, each fit spends a lot of time on:

[#1] INFO:NumericIntegration -- RooRealIntegral::init(f_sig_Int[ctL]) using numeric integrator RooIntegrator1D to calculate Int(ctL)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(KDEtotaleff_bin3_Int[ctL]) using numeric integrator RooIntegrator1D to calculate Int(ctL)
[#1] INFO:NumericIntegration -- RooRealIntegral::init(f_ang_Int[ctL]) using numeric integrator RooIntegrator1D to calculate Int(ctL)
[#1] INFO:Minization -- RooMinimizer::optimizeConst: activating const optimization
[#1] INFO:NumericIntegration -- RooRealIntegral::init(KDEtotaleff_bin3_Int[ctL]) using numeric integrator RooIntegrator1D to calculate Int(ctL)
[#1] INFO:Minization -- The following expressions have been identified as constant and will be precalculated and cached: (KDEtotaleff_bin3)

Do you know of a way to make the repeated fits more efficient?

Thank you.

So this probably means that an integral over Q2 ran to project out the component, but Q2 is actually a parameter that should have stayed constant when plotting the PDF. Seems to make sense.

Regarding the numeric integrals:
These can be expensive operations, depending on what has to be integrated. There are not many options:

  • You can reduce the accuracy of the integrals to make them stop earlier. Depending on how much the PDF relies on the accuracy of the integrals, that can be dangerous or not a problem. You would have to test …
    See some examples here:
    https://root.cern.ch/doc/master/rf901__numintconfig_8C.html
  • Where possible, you could use RooFit classes that have analytical integration. The KDEs support it, also the RooProduct, but not the divisions. So I guess that you are out of luck here.
  • I worked on a faster integrator, but my prototype only works for a few PDFs for now (the “GenericPdf” is one of them, though). It needs more work and thorough testing.
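
To give a feeling for the accuracy versus cost trade-off in the first option, here is a minimal standalone sketch in plain C++ (no RooFit). It is not RooIntegrator1D; it is a simple trapezoid rule with interval doubling, written for illustration, and the names `integrate`, `IntegralResult` and `angular` are hypothetical. The integrand is the angular expression from the first post, with the range of ctL assumed to be [-1, 1].

```cpp
#include <cmath>
#include <functional>

struct IntegralResult {
    double value;  // estimate of the integral
    long evals;    // number of integrand evaluations used
};

// Angular expression from the first post (ctL range [-1, 1] is assumed).
double angular(double ctL, double fh, double afb) {
    return 0.75 * (1.0 - fh) * (1.0 - ctL * ctL) + 0.5 * fh + afb * ctL;
}

// Trapezoid rule with interval doubling. Stops as soon as the relative
// change between two successive estimates drops below epsRel.
IntegralResult integrate(const std::function<double(double)>& f,
                         double a, double b, double epsRel,
                         int maxDoublings = 24) {
    double h = b - a;                        // current interval width
    long n = 1;                              // current number of intervals
    double prev = 0.5 * h * (f(a) + f(b));   // one-interval estimate
    long evals = 2;
    for (int k = 0; k < maxDoublings; ++k) {
        // Evaluate the integrand at the midpoints of the current intervals.
        double mid = 0.0;
        for (long i = 0; i < n; ++i) {
            mid += f(a + (i + 0.5) * h);
        }
        evals += n;
        // Refined estimate with twice as many intervals.
        const double next = 0.5 * prev + 0.5 * h * mid;
        n *= 2;
        h *= 0.5;
        if (std::fabs(next - prev) <= epsRel * std::fabs(next)) {
            return {next, evals};
        }
        prev = next;
    }
    return {prev, evals};
}
```

For fh = 0.3 and afb = 0.1 the exact integral over [-1, 1] is 1. In this sketch, loosening epsRel from 1e-7 to 1e-3 reduces the number of evaluations by roughly two orders of magnitude, at the price of an error of order 1e-4 in the normalization. In RooFit itself, the linked rf901 tutorial adjusts the corresponding tolerances through RooAbsReal::defaultIntegratorConfig() (setEpsAbs and setEpsRel); whether a looser setting is acceptable has to be tested on the actual fit.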

Your advice is helpful.
Thanks a lot!
