I’ve been trying to check my understanding of the RooNLLVar test statistic variable in RooFit. In particular, I thought that I would be able to ‘emulate’ the result of RooNLLVar with some code like this:
double nll(0);
for(int i=0;i<data.numEntries();i++) {
    obs = *data.get(i); //sets the observables to the value of the ith entry
    nll -= model.getLogVal(&obs); //evaluates log of pdf value
}
nll += model.extendedTerm(data.numEntries(),&obs); //for extended models
However, when I tried this with a RooSimultaneous-based model I was working with, the ‘by hand’ result differed from what I get back from a RooNLLVar object.
I put together a SWAN notebook to demonstrate this difference …
In the case of a RooSimultaneous pdf, there are some extra constants applied; see for example
I think it is difficult to reproduce exactly the same result in terms of absolute value. What you should check is that the differences in Delta Log L (between two different parameter values) are the same. The absolute value does not matter: there can always be some constant term stripped or added.
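To illustrate the point about Delta Log L, here is a minimal standalone sketch in plain C++ (no RooFit). The function names `gaussNll` and `shiftedNll` and the `offset` argument are hypothetical, made up for this example; the offset just stands in for whatever parameter-independent constant an NLL implementation adds or strips. It is not the actual term used by RooNLLVar.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Negative log-likelihood of a unit-width Gaussian with mean mu (toy model).
double gaussNll(const std::vector<double>& data, double mu) {
    assert(!data.empty());
    const double kPi = 3.14159265358979323846;
    double nll = 0.0;
    for (double x : data) {
        nll += 0.5 * (x - mu) * (x - mu) + 0.5 * std::log(2.0 * kPi);
    }
    return nll;
}

// The same NLL plus a constant that does not depend on mu.
// "offset" is a hypothetical stand-in for an implementation-specific constant.
double shiftedNll(const std::vector<double>& data, double mu, double offset) {
    return gaussNll(data, mu) + offset;
}
```

With these definitions, `shiftedNll(data, 0.5, c) - shiftedNll(data, 0.0, c)` equals `gaussNll(data, 0.5) - gaussNll(data, 0.0)` for any `c` (up to floating-point rounding), even though the absolute values differ by `c`. That is why fits and likelihood-ratio scans are unaffected by such a constant, whereas anything that uses the absolute NLL value is not.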
I spotted these extra terms in the code too, and really I should have been a bit more up front about why I was trying to understand this. I was developing a goodness-of-fit test using the Baker-Cousins likelihood ratio. If I compute this test statistic by hand, I get a chi2 distribution, but I was hoping I could keep my code clean and use RooNLLVar to compute the numerator. But if there are these extra terms in the numerator only, then the test statistic is shifted by a constant and I will lose my chi2 distribution.
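For reference, the ‘by hand’ Baker-Cousins statistic for binned Poisson data can be sketched as below. This is a standalone illustration, not the notebook code: `bakerCousinsChi2` is a hypothetical helper name, `observed` holds the bin counts n_i and `expected` the model predictions mu_i (assumed strictly positive). It uses the standard form chi2 = 2 * sum_i [ mu_i - n_i + n_i * ln(n_i / mu_i) ], with the log term dropped for empty bins.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Baker-Cousins (Poisson likelihood-ratio) chi2 for binned data.
// observed[i] = n_i (bin count), expected[i] = mu_i (model prediction, > 0).
double bakerCousinsChi2(const std::vector<double>& observed,
                        const std::vector<double>& expected) {
    assert(observed.size() == expected.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < observed.size(); ++i) {
        const double n = observed[i];
        const double mu = expected[i];
        chi2 += 2.0 * (mu - n);
        if (n > 0.0) {
            // n * ln(n / mu) tends to 0 as n -> 0, so empty bins skip this term.
            chi2 += 2.0 * n * std::log(n / mu);
        }
    }
    return chi2;
}
```

Because the statistic is built from absolute log-likelihood values (the model against the saturated model), any constant present in one NLL but not the other shifts it directly, unlike in a Delta Log L between two parameter points of the same NLL.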
Anyway, I modified my notebook to add this extra term and that ends up giving me agreement. So glad this is understood. Why did RooSimultaneous do this???