Threadsafe likelihood-fit on unequally sized bins

philoktetes · April 18, 2013, 4:15pm

Hey folks,

I want to do threadsafe likelihood-fitting to histograms with unequally sized bins, using integration over the fit-function.

The threadsafeness is given with the exclusion of TH1, but the fit routine itself still does have its quirks. I adapted a TH1-free approach using the ROOT::Minuit2::FCNBase class. It’s working principally, but I get strange “not-a-number” results after a few iterations.

Can anybody give me a hint, what makes TH1::Fit more “stable” than my approach?

Many thanks,

Stephan

My fit-function is a simple gaussian:

double gauss_gsl(double x, void *params){
  double *p = (double *)params;
  double gauss = p[0]*exp(-0.5*(x-p[1])*(x-p[1])/(p[2]*p[2]))/(sqrt(2.*M_PI)*p[2]);
  return gauss;
  }

The FCNBase-class:

class HistoLLFcn : public ROOT::Minuit2::FCNBase {

public:

  HistoLLFcn(const std::vector<int>& bins, const std::vector<double>& cnts, const std::vector<double>& boun) :
	theBins(bins), theCounts(cnts),theBoundaries(boun), theErrorDef(1.) {}

  ~HistoLLFcn() {}

  virtual double Up() const {return theErrorDef;}
  
   //where the work is done
   double operator()(const std::vector<double>& par) const {
   double logl = 0;
 
   //passing the parameter vector to a constant array, that gsl can handle
   double params_gsl[3];
   params_gsl[0] = par[0];
   params_gsl[1] = par[1];
   params_gsl[2] = par[2];

   //invoking gsl function and passing parameters
   gsl_function func_gsl;
   func_gsl.function = &gauss_gsl;
   func_gsl.params = params_gsl;

   //generating integration workspace
   gsl_integration_workspace *w = gsl_integration_workspace_alloc(1e3);
   
   //no gsl_error codes
   gsl_set_error_handler_off();

   //here is the math
   double result, error;
   for(int i=0; i<theBins.size(); i++){

    gsl_integration_qag(&func_gsl,theBoundaries.at(i),theBoundaries.at(i+1),1e-9,1e-6,1e3,4,w,&result,&error);
    logl += result-std::log(result)*theCounts.at(i);
    }

   //free integration workspace
   gsl_integration_workspace_free(w);

   return logl;
   }
 
  std::vector<int> bins() const {return theBins;}
  std::vector<double> counts() const {return theCounts;}
  std::vector<double> boundaries() const {return theBoundaries;}

  void setErrorDef(double def) {theErrorDef = def;}

private:

  std::vector<int> theBins;
  std::vector<double> theCounts;
  std::vector<double> theBoundaries;
  double theErrorDef;

};

The “main”-code:

HistoLLFcn theFCN(bins,cnts,boun); //contains boundaries and weighted entries

//starting values, calculated elsewhere
ROOT::Minuit2::MnUserParameters upar;
upar.Add("cnst",CNST,0.1);
upar.Add("mean",MEAN,0.1);
upar.Add("sigma",RMS,0.1);

//create minimizer (default constructor)
ROOT::Minuit2::MnMinimize migrad(theFCN,upar);

//Minimize
migrad.Fix("cnst");
ROOT::Minuit2::FunctionMinimum min = migrad();
std::cout << min << std::endl;

philoktetes · April 24, 2013, 2:38pm

Hello folks,

I still haven’t figured out, what I’m doing wrong. In concurrence with [url=https://root-forum.cern.ch/t/gslerror-nan-and-minuit/7789/3 thread I checked my fit-function and my integration. After some studies, I must say that I trust this part of my implementation.

Below I show a graph where I calculate the integral over the mentioned normalized gaussian with the mentioned gsl-qag algorithm.

The plot shows the area underneath the gaussian for the bin, where the function is centered in (black), the adjacent bin (red) and the whole histogram (green). During the 1e5 calculated steps the sigma of the gaussian is changed from “1” to “1e-4”, which is rather small. The calculation is running threaded, with OpenMP, and delivers no GSL_error at all.

Does anybody have an idea, what option makes the invocation of TMinuit in TH1::Fit more stable?