RooFit const optimization leading to bad fit result?

will_cern · March 8, 2017, 7:24pm

Hello,

I have been developing a new pdf class for RooFit, but in testing have found it gives me unexpected behaviour when I use the default fitting procedure. Below I provide a cut-down version of the class I am developing. In this form, the pdf is configured with a list of ‘observables’ (that actually do nothing here), a number (double), a list of ‘terms’ that are RooAbsReal that get added to the number, and a list of ‘factors’ that multiply the result. I.e. the value of the pdf is: Prod(factors)*(val + Sum(terms))

The pdf also behaves as an extended pdf, and it also uses the RooBinIntegrator for its integration (I’ve simplified the binning setup for this example).

Here’s the code:

#ifndef ROODEMOPDF
#define ROODEMOPDF

#include "RooAbsPdf.h"
#include "RooListProxy.h"
#include "RooNumIntConfig.h"
 
class RooDemoPdf : public RooAbsPdf {
public:
  RooDemoPdf() {} ; 
  
  RooDemoPdf(const char* name, const char* title, 
    const RooArgList& obs, const RooArgList& factors, const RooArgList& terms, double val) :
      RooAbsPdf(name,title), 
   fObs("obs","obs",this),
   fFactors("factors","factors",this),
   fTerms("terms","terms",this),
   fVal(val) 
  { 
  fObs.add(obs);
  fFactors.add(factors);
  fTerms.add(terms);
  specialIntegratorConfig(kTRUE)->method1D().setLabel("RooBinIntegrator");
  }
  
 
  RooDemoPdf(const RooDemoPdf& other, const char* name=0) :
  RooAbsPdf(other,name), 
   fObs("obs",this,other.fObs),
   fFactors("factors",this,other.fFactors),
   fTerms("terms",this,other.fTerms),
   fVal(other.fVal) {}
   
  virtual TObject* clone(const char* newname) const { return new RooDemoPdf(*this,newname); }
  inline virtual ~RooDemoPdf() { }

  virtual ExtendMode extendMode() const { return CanBeExtended ; }
  virtual Double_t expectedEvents(const RooArgSet* nset=0) const { if(nset==0) return getVal(); return getNorm(nset); }
  virtual Double_t expectedEvents(const RooArgSet& nset) const { return expectedEvents(&nset) ; }


  virtual std::list<Double_t>* binBoundaries(RooAbsRealLValue& obs, Double_t xlow, Double_t xhi) const {
    return new std::list<Double_t>({0,1});
  }
  virtual Bool_t isBinnedDistribution(const RooArgSet& obs) const { return true; }

protected:

  RooListProxy fObs; 
  RooListProxy fFactors; 
  RooListProxy fTerms; 
  double fVal = 0;
  
  Double_t evaluate() const {
    
    double out = fVal;
    
    RooFIter itr0(fTerms.fwdIterator());
    while(auto arg = itr0.next()) out += static_cast<RooAbsReal*>(arg)->getVal();
    
    RooFIter itr(fFactors.fwdIterator());
    while(auto arg = itr.next()) out *= static_cast<RooAbsReal*>(arg)->getVal();
    
    return out;
  }


private:

  ClassDef(RooDemoPdf,1) 
};
 
#endif

I want to use my class to construct a pdf that describes an expectation of mu*s + b events, where s=1 and b=3.2 , mu is the parameter to be fit. Here’s the example:

  RooRealVar x("x","x",0.5,0,1);
  RooRealVar mu("mu","mu",1,0,15);

  //define a bkg of 3.2 events
  RooDemoPdf bkg("bkg","bkg",x,RooArgList(),RooArgList(),3.2);
  //define a signal component that equals mu
  RooDemoPdf sig("sig","sig",x,mu,RooArgList(),1);
  //define a combined component
  RooDemoPdf mean("mean","mean",x,RooArgList(),RooArgList(bkg,sig),0);

This has the desired behaviour with respect to the value of the pdf (it equals 4.2 when mu=1, for example) and the normalized value of the pdf (it is 1, i.e. a flat pdf, wrt x). But when I create a dataset of 6 events and fit my model to that data, I do not get mu=2.8 as the result, instead I get mu=5??

  //dataset will be 6 events
  RooDataSet data("data","data",RooArgSet(x));
  for(int i=0;i<6;i++) data.add(RooArgSet(x)); 
 
  //fit to the data .. should give mu = 2.8 .. BUT DOESNT!??
  mean.fitTo(data);

If I plot the negative log likelihood, it is minimized around 2.8 as I would expect:

  nll = mean.createNLL(data);
  fr = mu.frame();
  nll->plotOn(fr);
  fr->Draw();

I then discovered if I disabled the const optimization in the fit, the fit works:

mean.fitTo(data,RooFit::Optimize(false)); //this gives a mu of 2.8

My suspicision then is that when the const optimization decides that the “bkg” term can be evaluated once and cached, it gets evaluated once with value “1” (the uniform pdf value) but then that screws up the “mean” pdf, which now is evaluating “mu + 1” rather than “mu + 3.2”

So my question is: What is it about my pdf that makes it problematic for const optimization? Why did the optimization decide it was const, and how can I alter my pdf evaluate code so that the mean pdf would continue to evaluate properly even if the bkg pdf was cached cached.

For your information, here’s the logging output from the optimization messaging when I run with the default const optimize turned on:

[#1] INFO:Minization -- p.d.f. provides expected number of events, including extended term in likelihood.
[#2] INFO:Optimization -- RooAbsOptTestStatistic::ctor(nll_mean_data) optimizing internal clone of p.d.f for likelihood evaluation.Lazy evaluation and associated change tracking will disabled for all nodes that depend on observables
[#1] INFO:NumericIntegration -- RooRealIntegral::init(mean_Int[x]) using numeric integrator RooBinIntegrator to calculate Int(x)
[#2] INFO:Optimization -- RooAbsArg::optimizeCacheMode(mean) nodes (mean,bkg,sig) depend on observables, changing cache operation mode from change tracking to unconditional evaluation
[#1] INFO:Minization -- RooMinuit::optimizeConst: activating const optimization
[#2] INFO:Optimization -- RooAbsOptTestStatistic::constOptimize(nll_mean_data) optimizing evaluation of test statistic by finding all nodes in p.d.f that depend exclusively on observables and constant parameters and precalculating their values
[#2] INFO:Optimization -- RooAbsArg::findConstantNodes(mean): components (bkg) depend exclusively on constant parameters and will be precalculated and cached
[#1] INFO:NumericIntegration -- RooRealIntegral::init(bkg_Int[x]) using numeric integrator RooBinIntegrator to calculate Int(x)

Any help with this problem would be greatly appreciated. Let me know if you need anything else from me too.

Thanks
Will