Is it possible to use the AsymptoticError method in RooAbsPdf::fitTo without parameter limits?

thucking · June 19, 2020, 9:57am

Dear Rooters,

I want to use the AsymptoticError flag in RooAbsPdf::fitTo without parameter limits, because Minuit applies a transformation to parameters that have limits. The Minuit manual states:

Because this transformation is non-linear, it is recommended
to avoid putting limits on parameters where they are not needed.

Parameter limits can lead to wrongly estimated errors and additional numerical inaccuracies (see Sec. 1.3.1 and 5.3.2 in the manual linked above).
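To illustrate why limits make the parameter mapping non-linear, here is a minimal sketch of the sine transformation that the Minuit manual describes for a parameter with both a lower limit a and an upper limit b. The function names are invented for this illustration and are not part of Minuit or RooFit.

```cpp
#include <cassert>
#include <cmath>

// External (user) value -> internal (unbounded) value, for a < ext < b.
double toInternal(double ext, double a, double b) {
    assert(a < b && ext >= a && ext <= b);
    return std::asin(2.0 * (ext - a) / (b - a) - 1.0);
}

// Internal value -> external value; the result always lies in [a, b].
double toExternal(double in, double a, double b) {
    assert(a < b);
    return a + 0.5 * (b - a) * (std::sin(in) + 1.0);
}

// Slope d(ext)/d(int) of the mapping, used when errors are
// propagated linearly from internal to external coordinates.
double dExtDInt(double in, double a, double b) {
    assert(a < b);
    return 0.5 * (b - a) * std::cos(in);
}
```

At the midpoint of the range the slope is (b - a)/2, but it shrinks towards zero close to either limit. That is the region where a linear error propagation through the transformation becomes unreliable.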

The problem can be reproduced with the example provided by Christoph Langenbruch, simply by dropping the range limits.
Here is the slightly modified version: rf611_weightedfits_AsymErrRng.C (9.8 KB)
Run it with .x rf611_weightedfits_AsymErrRng.C(2, kFALSE) at the ROOT prompt. The second argument disables the parameter ranges.

When running without parameter limits one gets tons of warnings like:

[#0] ERROR:Eval -- RooAbsReal::logEvalError(pol) evaluation error, 
 origin       : RooPolynomial::pol[ x=costheta coefList=(c0,c1) ]
 message      : p.d.f value is less than zero (-773502879776060572245688320.000000), forcing value to zero
 server values: x=costheta=0.386751, coefList=(c0 = -2e+27 +/- 0.0183967,c1 = 0.0528082 +/- 0.036377)

Note the crazy value of -2e+27 for c0.
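As a quick cross-check of the logged number, the sketch below evaluates the polynomial at the server values from the message. It assumes that RooPolynomial with its default lowest order computes 1 + c0*x + c1*x^2. polyValue is a hypothetical helper, not a RooFit function.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Assumed functional form: 1 + coef[0]*x + coef[1]*x^2 + ...
// Evaluated with Horner's scheme, starting from the highest order.
double polyValue(double x, const std::vector<double>& coef) {
    double sum = 0.0;
    for (auto it = coef.rbegin(); it != coef.rend(); ++it) {
        sum = (sum + *it) * x;
    }
    return 1.0 + sum;
}
```

With x = 0.386751, c0 = -2e+27 and c1 = 0.0528082 this gives roughly -7.735e+26, which agrees with the value in the error message to within the rounding of the printed x. So the message is consistent with the p.d.f. simply being evaluated while c0 sits at -2e+27.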

As I see it, this does not happen during the fit, but somewhere in the calculation of the corrected error matrix.
So setting a small step size for Minuit would not help, as the parameters are not running away in the fit.
My concern is that parameter limits could lead to wrongly estimated errors from Minuit. The asymptotically correct errors would then be wrong as well, since they are computed from that wrong input.

Is there a good reason why limits have to be set for the parameters when using the AsymptoticError flag?

Setup

Root 6.20.04 compiled with gcc 7.5
Platform: Ubuntu 18.04

Appendix

I cannot find any reason for this in the code I consider relevant. For convenience I quote lines 1695-1746 of RooAbsPdf.cxx from tag v6-20-04:

      if (doAsymptotic==1 && m.getNPar()>0) {
	//Calculated corrected errors for weighted likelihood fits
	std::unique_ptr<RooFitResult> rw(m.save());
	//Weighted inverse Hessian matrix
	const TMatrixDSym& matV = rw->covarianceMatrix();
	coutI(Fitting) << "RooAbsPdf::fitTo(" << GetName() << ") Calculating covariance matrix according to the asymptotically correct approach. If you find this method useful please consider citing https://arxiv.org/abs/1911.01303." << endl;

	//Initialise matrix containing first derivatives
	TMatrixDSym num(rw->floatParsFinal().getSize());
	for (int k=0; k<rw->floatParsFinal().getSize(); k++)
	   for (int l=0; l<rw->floatParsFinal().getSize(); l++)
	      num(k,l) = 0.0;
	RooArgSet* obs = getObservables(data);      
	//Create derivative objects
	std::vector<std::unique_ptr<RooDerivative> > derivatives;
	const RooArgList& floated = rw->floatParsFinal();
	std::unique_ptr<RooArgSet> floatingparams( (RooArgSet*)getParameters(data)->selectByAttrib("Constant", false) );
	for (int k=0; k<floated.getSize(); k++) {	   
	   RooRealVar* paramresult = (RooRealVar*)floated.at(k);
	   RooRealVar* paraminternal = (RooRealVar*)floatingparams->find(paramresult->getTitle());
	   std::unique_ptr<RooDerivative> deriv( derivative(*paraminternal, *obs, 1) );
	   derivatives.push_back(std::move(deriv));
	}
	
	//Loop over data
	for (int j=0; j<data.numEntries(); j++) {
	   //Sets obs to current data point, this is where the pdf will be evaluated
	   *obs = *data.get(j);
	   //Determine first derivatives
	   std::vector<Double_t> diffs(floated.getSize(), 0.0);
	   for (int k=0; k<floated.getSize(); k++) {
	      RooRealVar* paramresult = (RooRealVar*)floated.at(k);
	      RooRealVar* paraminternal = (RooRealVar*)floatingparams->find(paramresult->getTitle());
	      //First derivative to parameter k at best estimate point for this measurement
	      Double_t diff = derivatives.at(k)->getVal();
	      //Need to reset to best fit point after differentiation
	      *paraminternal = paramresult->getVal();
	      diffs.at(k) = diff;
	   }
	   //Fill numerator matrix
	   Double_t prob = getVal(obs);
	   for (int k=0; k<floated.getSize(); k++) {
	      for (int l=0; l<floated.getSize(); l++) {
	         num(k,l) += data.weight()*data.weight()*diffs.at(k)*diffs.at(l)/(prob*prob);
	      }
	   }
	}	
	num.Similarity(matV);

	//Propagate corrected errors to parameters objects
	m.applyCovarianceMatrix(num);
      }
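For readers who want the arithmetic without the RooFit classes, the quoted block can be sketched in plain C++ roughly as follows. This is an illustration under stated assumptions, not the ROOT implementation: Event, accumulateNumerator and sandwich are hypothetical names, and the per-event derivatives are taken as given instead of being computed numerically. The last step mirrors num.Similarity(matV), which forms matV * num * matV^T.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// One data point: its weight, the pdf value there, and the first
// derivatives of the pdf with respect to each floating parameter.
struct Event {
    double weight;
    double prob;
    std::vector<double> dprob;
};

// C_kl = sum over events of w^2 * (dP/dk) * (dP/dl) / P^2
Matrix accumulateNumerator(const std::vector<Event>& events, std::size_t nPar) {
    Matrix num(nPar, std::vector<double>(nPar, 0.0));
    for (const Event& e : events) {
        assert(e.dprob.size() == nPar && e.prob != 0.0);
        for (std::size_t k = 0; k < nPar; ++k)
            for (std::size_t l = 0; l < nPar; ++l)
                num[k][l] += e.weight * e.weight * e.dprob[k] * e.dprob[l]
                             / (e.prob * e.prob);
    }
    return num;
}

// Corrected covariance V * C * V^T, with V the covariance from the fit.
Matrix sandwich(const Matrix& V, const Matrix& C) {
    const std::size_t n = V.size();
    Matrix VC(n, std::vector<double>(n, 0.0));
    Matrix out(n, std::vector<double>(n, 0.0));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                VC[i][j] += V[i][k] * C[k][j];
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            for (std::size_t k = 0; k < n; ++k)
                out[i][j] += VC[i][k] * V[j][k];  // V[j][k] is V^T[k][j]
    return out;
}
```

Nothing in this arithmetic refers to parameter limits. Any dependence on limits would therefore have to enter through the covariance matrix returned by the fit or through the way the derivatives are evaluated.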

StephanH · June 19, 2020, 10:26am

Hi @thucking,

you can surely leave parameters unconstrained, let Minuit estimate the covariance matrix, and then correct it using the asymptotically-correct approach. However, if the fit model is unstable (large correlations / likes to diverge in undefined regions), limiting the ranges (or constraining the parameters) is inevitable.

What you quote (the polynomial) is notoriously unstable, as polynomials like to go negative, which PDFs cannot do. If you replaced this by a “nicer” PDF, or constrained the parameters such that the polynomial stays positive over the entire fit range, you can run this without limiting the parameters.

Maybe RooChebychev is an option, since it’s always positive,
or you could use constraints:
https://root.cern.ch/doc/master/rf604__constraints_8C.html
If you make them weak enough (even zero) as long as the polynomial is way positive, they won’t affect the errors. If, however, the constraint has to work to keep the polynomial positive, since the polynomial is close to zero, you will see it in the errors.
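To make "stays positive over the entire fit range" concrete for the two-coefficient case, here is a small sketch that checks the condition analytically. It assumes the polynomial has the form 1 + c0*x + c1*x^2. positiveOnRange is a hypothetical helper and the fit range is passed in explicitly.

```cpp
#include <algorithm>
#include <cassert>

// True if 1 + c0*x + c1*x^2 is strictly positive on [xmin, xmax].
bool positiveOnRange(double c0, double c1, double xmin, double xmax) {
    assert(xmin < xmax);
    auto f = [=](double x) { return 1.0 + c0 * x + c1 * x * x; };
    // For a linear or downward-opening parabola the minimum is at an endpoint.
    double minVal = std::min(f(xmin), f(xmax));
    // An upward-opening parabola can have its minimum inside the range.
    if (c1 > 0.0) {
        const double xVertex = -c0 / (2.0 * c1);
        if (xVertex > xmin && xVertex < xmax) {
            minVal = std::min(minVal, f(xVertex));
        }
    }
    return minVal > 0.0;
}
```

A check like this can be used to pick parameter ranges or a constraint that only becomes active when the polynomial is about to turn negative.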

thucking · June 19, 2020, 10:34am

Hi @StephanH,

I think that the fit itself is going fine. The error messages appear after the fit is done, specifically after

[#1] INFO:Fitting -- RooAbsPdf::fitTo(pol) Calculating covariance matrix according to the asymptotically correct approach. If you find this method useful please consider citing https://arxiv.org/abs/1911.01303.

has been printed. At this point the fitting is done. So the problem seems to be where the derivatives are calculated.
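One way a value like 2e+27 could arise during numerical differentiation is sketched below. This is an unconfirmed hypothesis, not something established in this thread: it assumes the finite-difference step is chosen as a fraction eps of the parameter's range width, and that a parameter without limits is treated as having a range of -1e30 to +1e30. Both numbers, and the function names, are illustrative assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// ASSUMED rule for the step size: a fixed fraction of the allowed range.
double assumedStep(double min, double max, double eps) {
    assert(max > min && eps > 0.0);
    return eps * (max - min);
}

// Central difference. Note that f is evaluated at x0 + h and x0 - h,
// so the parameter temporarily leaves its best-fit value by h.
double centralDifference(const std::function<double(double)>& f,
                         double x0, double h) {
    return (f(x0 + h) - f(x0 - h)) / (2.0 * h);
}
```

With eps = 0.001 a range of [-1, 1] gives a step of 0.002, whereas a range of [-1e30, 1e30] gives a step of 2e+27. In the second case the p.d.f. would be evaluated with the parameter displaced by about 2e+27 from its best-fit value, which would produce evaluation errors of the kind shown above even though the fit itself converged.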

Originally I found the problem when using a Gaussian. So I think this is not only a problem of a badly chosen PDF, at least if a Gaussian counts as a “nice” PDF.
If necessary I can also provide a macro for a Gaussian showing the same problem. I just thought the RooFit tutorials would be more trustworthy and less likely to contain some other bug of my own.

StephanH · June 19, 2020, 11:05am

Ok, @moneta is working on the derivatives, anyway. Maybe he can check out your problem.

This is now:
https://sft.its.cern.ch/jira/browse/ROOT-10866

thucking · June 19, 2020, 11:53am

Hi @StephanH,

thanks a lot for the fast response.

moneta · June 22, 2020, 9:05am

Hi,
Can you also post the example in which you found the problem using a Gaussian pdf?
Thank you
Lorenzo

thucking · June 22, 2020, 9:14am

Hi,

yes of course. I’ll prepare a reduced macro and post it soon.

thucking · June 22, 2020, 10:07am

Hi,

I have the example ready. A little bit about it:

There are 2 variables:
- Discriminating variable mass (exponential background, Gaussian signal)
- Control variable (Gaussian background, Gaussian signal)
The Configurator class holds the parameter values. The model is then built via the ModelBuilder class. Only the main macro CheckRangeDependAsymErr.C should be of interest, though.
In the main macro:
1. A data set is generated
2. A fit to the discriminating variable is done
3. sWeights are calculated and a weighted data set is produced
4. Fits to the weighted data are performed
The main macro has a flag that controls whether ranges are set or not. If you call it without any argument you should get warnings like:

[#0] ERROR:Eval -- RooAbsReal::logEvalError(pdfVggSig) evaluation error, 
 origin       : RooGaussian::pdfVggSig[ x=vgg mean=vggSigMeanFit sigma=vggSigStdDevFit ]
 message      : p.d.f normalization integral is zero or negative: 0
 server values: x=vgg=105.758, mean=vggSigMeanFit=119.803 +/- 0.0382603, sigma=vggSigStdDevFit=2e+27 +/- 0.0271769

Calling it as root -q "CheckRangeDependAsymErr.C(kTRUE)" should produce no error message. In this case 3 fits are performed:
1. Only one parameter limited
2. Both parameters limited
3. Both parameters limited with a different choice of limits

Here are the files:
CheckRangeDependAsymErr.C (4.0 KB) Config.C (3.9 KB) modelBuilder.C (13.4 KB)

Further Questions?

If you want, I can reproduce the same problem with an exponential pdf.

Thanks a lot for your efforts

Cheers
Tim

moneta · June 22, 2020, 10:44am

Hello Tim,
Thank you very much for your example. It is very useful. I am looking into this problem and the other one with the extended fit when using the new AsymptoticError method.

Cheers

Lorenzo
