Fitting a TGraphAsymmErrors with the integral option

ccorti · October 25, 2013, 1:18am

Dear all,

I need to fit a binned flux with asymmetric errors: as far as I know TGraphAsymmErrors does not allow to fit with the I (integral) option, while TH1 does not support asymmetric errors.
I thought that maybe RooFit could help me, but the RooHist class uses the same fit as TGraphAsymmErrors…
Is there a way to do this?

Thanks.

Claudio Corti

ccorti · October 28, 2013, 8:25pm

Nobody ever faced this problem?

Thanks.

Claudio Corti

ccorti · November 23, 2013, 11:28am

I tried to come up with a solution, using a modified version of the function ROOT::Fit::FitObject.
Basically I changed the way the FillData function works for a TGraphAsymmErrors, using the bin low and up edge instead of the bin center, where the bin is defined by x-exl and x+exh.
See fill_data_graphasymm_bin below; the function fit_graphasymm_bin is basically the same code of ROOT::Fit::Fit, no change in the logic has been done.

void fill_data_graphasymm_bin(ROOT::Fit::BinData &dv, const TGraphAsymmErrors *gr, TF1 *func)
{
   UInt_t nPoints = gr->GetN();

   dv.Initialize(nPoints, 1, ROOT::Fit::BinData::kAsymError);

   Double_t *gx   = gr->GetX();
   Double_t *gy   = gr->GetY();
   Double_t *gxl  = gr->GetEXlow();
   Double_t *gxh  = gr->GetEXhigh();
   Double_t *gyl  = gr->GetEYlow();
   Double_t *gyh  = gr->GetEYhigh();

   const ROOT::Fit::DataRange &range = dv.Range();
   Double_t xmin = 0.;
   Double_t xmax = 0.;
   range.GetRange(xmin, xmax);

   Double_t xl[1];
   Double_t xh[1];

   for (UInt_t i = 0; i < nPoints; ++i)
   {
      xl[0] = gx[i] - gxl[i];
      xh[0] = gx[i] + gxh[i];

      if (xh[0] < xmin || xl[0] > xmax) continue;

      if (func != 0)
      {
         func->RejectPoint(false);
         (*func)(&gx[i]);
         if (func->RejectedPoint()) continue;
      }

      Double_t errorX = 0;

      dv.Add(xl[0], gy[i], errorX, gyl[i], gyh[i]);
      dv.AddBinUpEdge(xh);
   }
}

TFitResultPtr fit_graphasymm_bin(TGraphAsymmErrors *h1, TF1 *f1, Foption_t &fitOption, const ROOT::Math::MinimizerOptions &minOption, const char *goption, ROOT::Fit::DataRange &range)
{
   Int_t iret = 0;

   Int_t special = f1->GetNumber();
   Bool_t linear = f1->IsLinear();
   Int_t npar = f1->GetNpar();
   if (special == 299 + npar) linear = kTRUE; // for polynomial functions
   // do not use linear fitter in these case
   if (fitOption.Bound || fitOption.Like || fitOption.Errors || fitOption.Gradient || fitOption.More || fitOption.User|| fitOption.Integral || fitOption.Minuit) linear = kFALSE;

   // create the fitter
   std::auto_ptr<ROOT::Fit::Fitter> fitter(new ROOT::Fit::Fitter());
   ROOT::Fit::FitConfig &fitConfig = fitter->Config();

   // create options
   ROOT::Fit::DataOptions opt;
   opt.fIntegral = fitOption.Integral;
   opt.fUseRange = fitOption.Range;
   if (fitOption.Like) opt.fUseEmpty = true;  // use empty bins in log-likelihood fits
   if (special == 300) opt.fCoordErrors = false; // no need to use coordinate errors in a pol0 fit
   if (fitOption.NoErrX) opt.fCoordErrors = false;  // do not use coordinate errors when requested
   if (fitOption.W1) opt.fErrors1 = true;
   if (fitOption.W1 > 1) opt.fUseEmpty = true; // use empty bins with weight=1

   if (opt.fUseRange)
   {
      // check if function has range
      Double_t fxmin, fymin, fzmin, fxmax, fymax, fzmax;
      f1->GetRange(fxmin, fymin, fzmin, fxmax, fymax, fzmax);
      // support only one range - so add only if was not set before
      if (range.Size(0) == 0) range.AddRange(0, fxmin, fxmax);
      if (range.Size(1) == 0) range.AddRange(1, fymin, fymax);
      if (range.Size(2) == 0) range.AddRange(2, fzmin, fzmax);
   }

   // fill data
   std::auto_ptr<ROOT::Fit::BinData> fitdata(new ROOT::Fit::BinData(opt, range));
   fill_data_graphasymm_bin(*fitdata, h1, f1);
   if (fitdata->Size() == 0)
   {
      return -1;
   }

   // switch off linear fitting in case data has  coordinate errors
   if (fitdata->GetErrorType() == ROOT::Fit::BinData::kCoordError) linear = false;

   // this functions use the TVirtualFitter
   if (special != 0 && !fitOption.Bound && !linear)
   {
      if (special == 100) ROOT::Fit::InitGaus(*fitdata, f1); // gaussian
      else if (special == 110) ROOT::Fit::Init2DGaus(*fitdata, f1); // 2D gaussian
      else if (special == 400) ROOT::Fit::InitGaus(*fitdata, f1); // landau (use the same)
      else if (special == 410) ROOT::Fit::Init2DGaus(*fitdata, f1); // 2D landau (use the same)
      else if (special == 200) ROOT::Fit::InitExpo(*fitdata, f1); // exponential
   }

   // set the fit function
   // if option grad is specified use gradient
   if ((linear || fitOption.Gradient)) fitter->SetFunction(ROOT::Math::WrappedMultiTF1(*f1));
   else fitter->SetFunction(static_cast<const ROOT::Math::IParamMultiFunction &>(ROOT::Math::WrappedMultiTF1(*f1)));

   // parameter settings and transfer the parameters values, names and limits from the functions
   // is done automatically in the Fitter.cxx
   for (int i = 0; i < npar; ++i)
   {
      ROOT::Fit::ParameterSettings &parSettings = fitConfig.ParSettings(i);

      // check limits
      double plow, pup;
      f1->GetParLimits(i, plow, pup);
      if (plow*pup != 0 && plow >= pup)
      {
         // this is a limitation - cannot fix a parameter to zero value
         parSettings.Fix();
      }
      else if (plow < pup)
      {
         parSettings.SetLimits(plow, pup);
      }

      // set the parameter step size (by default are set to 0.3 of value)
      // if function provides meaningful error values
      double err = f1->GetParError(i);
      if (err > 0) parSettings.SetStepSize(err);
      else if (plow < pup)
      {
         // in case of limits improve step sizes
         double step = 0.1 * (pup - plow);
         // check if value is not too close to limit otherwise trim value
         if (parSettings.Value() < pup && pup - parSettings.Value() < 2*step) step = (pup - parSettings.Value())/2;
         else if (parSettings.Value() > plow && parSettings.Value() - plow < 2*step) step = (parSettings.Value() - plow)/2;

         parSettings.SetStepSize(step);
      }
   }

   // set all default minimizer options (tolerance, max iterations, etc..)
   fitConfig.SetMinimizerOptions(minOption);

   // specific  print level options
   if (fitOption.Verbose) fitConfig.MinimizerOptions().SetPrintLevel(3);
   if (fitOption.Quiet) fitConfig.MinimizerOptions().SetPrintLevel(0);

   // specific minimizer options depending on minimizer
   if (linear)
   {
      if (fitOption.Robust)
      {
         // robust fitting
         std::string type = "Robust";
         // if an h is specified print out the value adding to the type
         if (fitOption.hRobust > 0 && fitOption.hRobust < 1.) type += " (h=" + ROOT::Math::Util::ToString(fitOption.hRobust) + ")";
         fitConfig.SetMinimizer("Linear", type.c_str());
         fitConfig.MinimizerOptions().SetTolerance(fitOption.hRobust); // use tolerance for passing robust parameter
      }
      else fitConfig.SetMinimizer("Linear", "");
   }
   else
   {
      if (fitOption.More) fitConfig.SetMinimizer("Minuit", "MigradImproved");
   }

   // check if Error option (run Hesse and Minos) then
   if (fitOption.Errors)
   {
      // run Hesse and Minos
      fitConfig.SetParabErrors(true);
      fitConfig.SetMinosErrors(true);
   }

   // do fitting
   bool fitok = false;

   // check if can use option user
   //typedef  void (* MinuitFCN_t )(int &npar, double *gin, double &f, double *u, int flag);
   TVirtualFitter::FCNFunc_t userFcn = 0;
   if (fitOption.User && TVirtualFitter::GetFitter())
   {
      userFcn = (TVirtualFitter::GetFitter())->GetFCN();
      (TVirtualFitter::GetFitter())->SetUserFunc(f1);
   }

   // user provided fit objective function
   if (fitOption.User && userFcn) fitok = fitter->FitFCN( userFcn );
   else if (fitOption.Like)
   {
      // likelihood fit
      // perform a weighted likelihood fit by applying weight correction to errors
      bool weight = ((fitOption.Like & 2) == 2);
      fitConfig.SetWeightCorrection(weight);
      bool extended = ((fitOption.Like & 4) != 4);
      fitok = fitter->LikelihoodFit(*fitdata, extended);
   }
   // standard least square fit
   else fitok = fitter->Fit(*fitdata);

   iret |= !fitok;

   const ROOT::Fit::FitResult & fitResult = fitter->Result();
   // one could set directly the fit result in TF1
   iret = fitResult.Status();
   if (!fitResult.IsEmpty())
   {
      // set in f1 the result of the fit
      f1->SetChisquare(fitResult.Chi2());
      f1->SetNDF(fitResult.Ndf());
      f1->SetNumberFitPoints(fitdata->Size());

      f1->SetParameters(&(fitResult.Parameters().front()));
      if (int(fitResult.Errors().size()) >= f1->GetNpar()) f1->SetParErrors(&(fitResult.Errors().front()));
   }

   // print the result
   // if using Fitter class must be done here
   // use old style Minuit for TMinuit and if no corrections have been applied
   if (!fitOption.Quiet)
   {
      if (fitter->GetMinimizer() && fitConfig.MinimizerType() == "Minuit" && !fitConfig.NormalizeErrors() && fitOption.Like <= 1)
      {
         fitter->GetMinimizer()->PrintResults(); // use old style Minuit
      }
      else
      {
         // print using FitResult class
         if (fitOption.Verbose) fitResult.PrintCovMatrix(std::cout);
         fitResult.Print(std::cout);
      }
   }

   // store result in the backward compatible VirtualFitter
   TVirtualFitter * lastFitter = TVirtualFitter::GetFitter();
   // pass ownership of Fitter and Fitdata to TBackCompFitter (fitter pointer cannot be used afterwards)
   // need to get the raw pointer due to the  missing template copy ctor of auto_ptr on solaris
   // reset fitdata(cannot use anymore , ownership is passed)
   TBackCompFitter *bcfitter = new TBackCompFitter(fitter, std::auto_ptr<ROOT::Fit::FitData>(fitdata.release()));
   bcfitter->SetFitOption(fitOption);
   bcfitter->SetObjectFit(h1);
   bcfitter->SetUserFunc(f1);
   bcfitter->SetBit(TBackCompFitter::kCanDeleteLast);
   if (userFcn)
   {
      bcfitter->SetFCN(userFcn);
      // for interpreted FCN functions
      if (lastFitter->GetMethodCall()) bcfitter->SetMethodCall(lastFitter->GetMethodCall());
   }

   // delete last fitter if it has been created here before
   if (lastFitter)
   {
      TBackCompFitter *lastBCFitter = dynamic_cast<TBackCompFitter *> (lastFitter);
      if (lastBCFitter && lastBCFitter->TestBit(TBackCompFitter::kCanDeleteLast)) delete lastBCFitter;
   }
   //N.B=  this might create a memory leak if user does not delete the fitter he creates
   TVirtualFitter::SetFitter(bcfitter);

   if (fitOption.StoreResult)
   {
      TFitResult *fr = new TFitResult(fitResult);
      TString name = "TFitResult-";
      name = name + h1->GetName() + "-" + f1->GetName();
      TString title = "TFitResult-";
      title += h1->GetTitle();
      fr->SetName(name);
      fr->SetTitle(title);
      return TFitResultPtr(fr);
   }
   else return TFitResultPtr(iret);
}

To fit a TGraphAsymmErrors, I do:

Foption_t fopt;
ROOT::Fit::FitOptionsMake("Q0NXISTER", fopt);
ROOT::Fit::DataRange range(enemin, enemax);
ROOT::Math::MinimizerOptions mopt;
mopt.SetMinimizerAlgorithm("Migrad");
TFitResultPtr frp_data = fit_graphasymm_bin(gae_data, f_datafit, fopt, mopt, "", range);

The problem is that, even if I set the option Integral, the fit is done by minimizing the difference between the graph value and the function evaluated at the low edge of the bin.
If in fill_data_graphasymm_bin I change the lines

dv.Initialize(nPoints, 1, ROOT::Fit::BinData::kAsymError)
dv.Add(xl[0], gy[i], errorX, gyl[i], gyh[i]);

with

dv.Initialize(nPoints, 1, ROOT::Fit::BinData::kValueError)
dv.Add(xl[0], gy[i], (gyl[i] + gyh[i])/2);

then the fit is correctly performed by minimizing the difference between the graph value and the integral of the function in the bin.
So, it seems that if the data have asymmetric errors, the fit is never done with the integral, no matter what you do.
Is it possible to circumvent this limitation?

moneta · June 23, 2014, 1:42pm

Hi,

It does not make sense to fit a TGraph with the bin integral option since there is no concept of bin in a TGraph.
What you might want to have is a least square fit of an histogram with asymmetric errors.
This is not implemented, but in general an histogram should be fitted using the likelihood method. You should know what is the statistical distributions of your bin content and then build a likelihood function from that.
Which kind of data re you trying to fit ?

Anyway, if really needed, we could the possibility to store asymmetric errors in Bindata (which can be used vy the histogram) and one this is done, it is straight forward to support fits with asymmetric errors).

At the moment, the only way to do it is to re-implement yourself the evaluation of the chi2 function. This is done here in case of histograms (can use bin integral but not asymmetric errors)

project-mathlibs.web.cern.ch/pro … 890b0d627c

and here in case of TGraph (can use asymmetric errors but not bin integral)

project-mathlibs.web.cern.ch/pro … 47789ccdab

Best Regards

Lorenzo

ccorti · June 23, 2014, 1:58pm

Hi Lorenzo,
thanks for your answer.

[quote]It does not make sense to fit a TGraph with the bin integral option since there is no concept of bin in a TGraph.
What you might want to have is a least square fit of an histogram with asymmetric errors. [/quote]
Indeed, I was trying to use the integral fit with the TGraph only because histograms don’t support asymmetric errors…

Cosmic rays fluxes taken from different articles, so the only informations I have are the errors provided in the papers and sometimes they are asymmetric.

This would be very useful!

moneta · June 23, 2014, 2:08pm

Hi,

Have you checked if these are Poisson errors (i.e. Garwood intervals), as we are computing now in ROOT ?
Because in this case just do a simple likelihood fit (option “L”) of the histogram. It is not correct to fit those error with a least-square

Lorenzo

ccorti · June 23, 2014, 2:16pm

Well, I didn’t check all the articles, but since the likelihood of the systematic errors is never available in the papers, I doubt I can safely assume it is Poissonian.