Treatment of “negative” pdf in low-statistics fits

Dear Experts,

I need to calculate pdf integrals in low statistics cases when the pdf seems to go negative or zero in part of the full fit range. When performing a RooChebychev pdf fit on few events, the slope goes negative in the considered fit-range. Whenever the polynomial would go negative (fig 1), it seems to be forced to zero, at least in plotting (fig 2). The fit converges properly with MINIMIZE status 0 and healthy covariance matrix (HESSE status 3).

The negative pdf values project into the pdf-integrals in a sub-range. I attach 4 examples of the this behaviour in 3 selected sub-regions and the full fit range (fig 3).

May you please clarify what happens in these instances and can we rely on such integrals? Could you please also explain what exactly happens to the pdf when we see plots like in fig 2?

I attach also the .cxx file to reproduce this problem:

problem_example.cxx (3.2 KB)

Thanks a lot for the help! Print of the code is below.

Sincerely,

Stanislav Biryukov

#include "TCanvas.h"
#include "TGraph.h"
#include "RooDataSet.h"
#include "RooRealVar.h"
#include "RooChebychev.h"
#include "RooExtendPdf.h"
#include "RooFitResult.h"
#include "RooFormulaVar.h"
#include "RooArgSet.h"
#include "RooPlot.h"
#include "RooHist.h"
#include "TLegend.h"
#include <iostream>
#include <utility>

void problem_example(){
  // Initialising fit parameters                                                                                                               
  RooRealVar n_fit("n_fit", "n_fit", 2, -100000., 100000. );
  RooRealVar c_fit("c_fit", "c_fit", -0.3, -50., 50. );
  RooRealVar x("x", "x", 4766., 5966. );
  // Initialising a test sample                                                                                                                
  RooDataSet set("set","set", RooArgSet(x));
  // Additing the two events to imitate low statistic sample                                                                                   
  x = 5075.;
  set.add(RooArgSet(x));
  x = 5076.;
  set.add(RooArgSet(x));
  // Performing the chebychev extended fit                                                                                                     
  RooChebychev chebychev_fit( "chebychev", "chebychev", x, RooArgList(c_fit));
  RooExtendPdf chebychev_fit_extended( "chebychev_fit_extended", "chebychev_fit_extended", chebychev_fit, n_fit);
  RooFitResult* res = chebychev_fit_extended.fitTo(set, RooFit::Extended(true), RooFit::Save());
  // Confirming the fit has converged                                                                                                          
  std::cout << "Stan: status converge    = " << res->status()  << std::endl;
  std::cout << "Stan: status cov matrix  = " << res->covQual() << std::endl;
  // Plotting the resultant fit and sample                                                                                                     
  TCanvas* c = new TCanvas("canvas","canvas",800,600) ;
  RooPlot* frame1 = x.frame(RooFit::Bins(30), RooFit::Title("Chebychev Fit on Low Statistic Sample"));
  set.plotOn(frame1, RooFit::Name("set"));
  chebychev_fit_extended.plotOn(frame1, RooFit::Name("fit"), RooFit::LineColor(kBlue));
  frame1->SetMinimum(-1.);
  frame1->Draw();
  TLegend *leg1 = new TLegend(0.65, 0.73, 0.86, 0.87);
  leg1->AddEntry("set", "sample", "P");
  leg1->AddEntry("fit", "fit"   , "L");
  leg1->Draw();
  c->SaveAs("fit_plot.pdf");
  // Setting  sub-regions                                                                                                                      
  x.setRange("range_before", 4900., 5100.);
  x.setRange("range_at"    , 5175., 5575.);
  x.setRange("range_after" , 5500., 5700.);
  x.setRange("range_full"  , 4766., 5966.);
  // Calculating pdf integrals in  subregions                                                                                                  
  RooAbsReal* integral_before = chebychev_fit_extended.createIntegral( RooArgSet(x), RooFit::NormSet(x), RooFit::Range("range_before"));
  RooAbsReal* integral_at     = chebychev_fit_extended.createIntegral( RooArgSet(x), RooFit::NormSet(x), RooFit::Range("range_at"    ));
  RooAbsReal* integral_after  = chebychev_fit_extended.createIntegral( RooArgSet(x), RooFit::NormSet(x), RooFit::Range("range_after" ));
  RooAbsReal* integral_full   = chebychev_fit_extended.createIntegral( RooArgSet(x), RooFit::NormSet(x), RooFit::Range("range_full"  ));
  double int_before = integral_before->getVal();
  double int_at     = integral_at->getVal();
  double int_after  = integral_after->getVal();
  double int_full   = integral_full->getVal();
  std::cout << "integral in range x[4900,5100] = " << int_before << std::endl;
  std::cout << "integral in range x[5175,5575] = " << int_at     << std::endl;
  std::cout << "integral in range x[5500,5700] = " << int_after  << std::endl;
  std::cout << "integral in range x[4766,5966] = " << int_full   << std::endl;
  x = 5600. ;
  std::cout << "at x = " << x << " , pdf = " << chebychev_fit.getVal(x) << std::endl;
}

Welcome to the ROOT forum

I think @moneta can help

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.