Fitting to large datasets using extended PDFs

We are having some trouble minimizing -2logL, an extended log-likelihood statistic, applied to a large dataset. I suspect part of the problem may be that the magnitude of -2logL near the minimum is very large. When HESSE is called, we often get a non-positive-definite covariance matrix because of floating-point rounding issues (MINUIT is looking for small changes in a very large number).
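
For illustration, here is a minimal standalone sketch (not RooFit code; the constant 1e12 and step sizes are made up) of the kind of round-off we suspect: with a huge constant added to the function, a finite-difference second derivative of the sort HESSE relies on loses the curvature entirely.

#include <cstdio>

// f(x) = C + x*x mimics a -2logL whose value near the minimum is
// dominated by a huge constant C; the true second derivative is 2
// regardless of C.
static double f(double x, double c) { return c + x * x; }

int main() {
  const double x = 1e-4, h = 1e-7;
  double d2_big  = (f(x + h, 1e12) - 2 * f(x, 1e12) + f(x - h, 1e12)) / (h * h);
  double d2_zero = (f(x + h, 0.0)  - 2 * f(x, 0.0)  + f(x - h, 0.0))  / (h * h);
  printf("second derivative with C=1e12: %g\n", d2_big);  // curvature lost to round-off (comes out 0 here)
  printf("second derivative with C=0:    %g\n", d2_zero); // close to the true value 2
  return 0;
}
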

We are trying to implement our own version of RooNLLVar in which we subtract an offset, -2logL0/N (the average value of -2logL per event, where L0 is the likelihood evaluated on the first call to MINUIT and N is the number of events being fitted), from each event's contribution. This is an attempt to reduce floating-point round-off errors in MINUIT. To date we have not succeeded in getting this to work; the problem is finding a way to save and retrieve the offset. Our first attempt was a standard singleton class, but this failed even when we ran on a single CPU.
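
For illustration, a standalone sketch of the per-event offset idea (again not RooFit code; the per-event value 13.7 and the event count are made up):

#include <cstdio>

int main() {
  // N events each contributing roughly the same amount to -2logL.
  // Subtracting the average contribution c from every term keeps the
  // accumulated total near zero, so double precision is spent on the
  // part that varies with the fit parameters rather than on a huge
  // constant.
  const long   N    = 10000000;  // 10M events
  const double term = 13.7;      // typical per-event -2*log(L_i)
  const double c    = 13.7;      // offset: average -2logL per event

  double plain = 0.0, shifted = 0.0;
  for (long i = 0; i < N; ++i) {
    plain   += term;             // grows to ~1.4e8; only ~1e-8 absolute resolution left
    shifted += term - c;         // stays at 0; full precision available
  }
  printf("plain = %.6g  shifted = %.6g\n", plain, shifted);
  return 0;
}
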

Has anyone succeeded in implementing an extension to RooNLLVar that includes a constant offset to -2logL?

Thanks,

Bill Lockman

Hi Bill,

You don’t need to write an extension; you can simply do this:

RooAbsReal* nll = pdf.createNLL(data, <same options as fitTo()>) ;

Then you subtract the offset as follows:

RooFormulaVar nll2("nll2", "@0-SomeLargeNumber", *nll) ;

You can then minimize this as follows:

RooMinuit m(nll2) ;
m.migrad() ;
m.hesse() ;
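
For concreteness, here is a minimal self-contained sketch of this recipe, assuming a toy Gaussian model; the variable and parameter names, the event count, and the choice of the NLL value at the starting parameters as the offset are all illustrative:

#include "RooRealVar.h"
#include "RooGaussian.h"
#include "RooDataSet.h"
#include "RooFormulaVar.h"
#include "RooArgList.h"
#include "RooMinuit.h"
#include "RooGlobalFunc.h"

using namespace RooFit ;

void offsetFit() {
  // Toy model and dataset (illustrative names and values)
  RooRealVar x("x", "x", -10, 10) ;
  RooRealVar mean("mean", "mean", 0, -5, 5) ;
  RooRealVar sigma("sigma", "sigma", 2, 0.1, 5) ;
  RooGaussian pdf("pdf", "pdf", x, mean, sigma) ;
  RooDataSet* data = pdf.generate(RooArgSet(x), 1000000) ;

  // Build the NLL with the same options you would pass to fitTo()
  RooAbsReal* nll = pdf.createNLL(*data) ;

  // Use the NLL value at the starting parameters as the offset, so
  // the function MINUIT sees is of order unity near the minimum
  double offset = nll->getVal() ;
  RooFormulaVar nll2("nll2", "@0-@1", RooArgList(*nll, RooConst(offset))) ;

  RooMinuit m(nll2) ;
  m.migrad() ;
  m.hesse() ;
}

Evaluating the NLL once before the fit costs one extra likelihood evaluation, but it removes the large constant from everything MINUIT differentiates.
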

Note that quite often these ‘negative second derivative’ occurrences are due to other causes
(such as round-off errors in numeric normalization integrals).

Wouter