Failing chi2 fit

amarcine · September 7, 2023, 7:39pm

Hello,

Attached you can find a macro test.C (4.2 KB) producing pdf test.pdf (35.6 KB) showing the problem:
while the log likelihood fit works fine as expected, the chi2 fit fails even though it starts from the proper solution.

I tested this on ROOT 6.28.06.

Does anybody have any idea why this fails? Some problem with background normalization?

Cheers,
Antoni

mczurylo · September 8, 2023, 8:37am

Hi @amarcine,

thank you for your question. Maybe @jonas could take a look?

Cheers,
Marta

jonas · September 11, 2023, 9:32pm

Hello!

You are getting a nonsensical result for the chi2 because you are fitting a pdf that must be extended, but you didn’t pass the Extended() command arg. That’s because the coefficients of a RooAddPdf are degenerate without the additional constraint, as the RooAbsPdf always self-normalizes itself.

Adding Extended() to your chi2FItTo() will fix this problem. I also opened a PR to avoid this confusing behavior that different from fitTo(), chi2FitTo() will not automatically do an extended fit for extended pdfs:

github.com/root-project/root

[RF] Make extended fit default of RooAbsPdf::fitChi2() same as in fitTo

root-project:master ← guitargeek:chi2_extended

opened 09:28PM - 11 Sep 23 UTC

guitargeek

+46 -21

It is confusing to the users that the chi2 fit is by default not extended, even …if the pdf is extended. Because this is what happens in normal likelihood fits. This commit makes that behavior consistent and explains it in the release notes. Also, a new error message is added for the case you forcing a non extended fit on a pdf that must be extended, because this is guaranteed to give wrong results. This change addresses the confusion on the forum: https://root-forum.cern.ch/t/failing-chi2-fit/56309

That being said, I found another bug that can be reproduced with this modified version of your test:
test_const_opt_bug.C (4.7 KB)

The problem is that if you do a chi2 fit where one of the bins is zero, as in your case, the constant term optimization of the test statistics classes will do nonsense. You can fix the bad fit in my script by either deactivating the const term optimization, or by avoiding empty bins.

I have also opened a PR to fix this problem for the next ROOT release:

github.com/root-project/root

[RF] Fix skipping of zero weights in tests statistic caching

root-project:master ← guitargeek:skipZeroWeights

opened 11:28PM - 11 Sep 23 UTC

guitargeek

+20 -19

There was an evil hack in `RooAbsOptTestStatistic`: ```c++ _dataClone->cache…Args(this,_cachedNodes,_normSet,!_funcClone->getAttribute("BinnedLikelihood")) ; ``` The final parameter determins if zero weights should be skipped when recalculating the caches. Indeed, for the NLL case, this corresponds to the BinnedLikelihood attrribute being present or not, but it broke the other test statistics for which zero weights should not be skipped at all. This commit suggests a safer way to manage this with a new `RooAbOptTestStatistic::_skipZeroWeights` flag. In particular, this change fixed a bug that got uncovered during a forum discussion: https://root-forum.cern.ch/t/failing-chi2-fit/56309/3

Cheers,
Jonas

amarcine · September 12, 2023, 8:38am

Hi @jonas,

Thanks for this! I added a comment to your first PR.

Indeed adding Extended() kind of fixes the chi2FitTo(), but it doesn’t look very good (so I definitely confirm the “another bug” that you mention). Then changing 0 to 1 in the first bin (following comments in your macro) fixes it completely, similarly to turning off the constant term optimization in your macro.

I understand that initially you hoped that I would open a bug report for the second problem, but you ended up creating the PR yourself. Thank you very much! I want to make it clear that I do see the problem and that it is very relevant for me.

I would still like to understand how the above 2 problems affect RooChi2Var? I expect that not passing Extended() also causes problems, but what about zero weights? If RooChi2Var is affected by the above, will the second PR fix it?

When you say “for the next ROOT release”, do you mean v6-28-08 or v6-30? Since this is a bug, I hope it is the former case.

Cheers,
Antoni

jonas · September 12, 2023, 9:31am

Hi!

That’s great you’re also commenting on the PR. And yes, I first thought the problem was more complicated, so I would have appreciated an issue to stay aware of it and fix it later, but it turned out to be a relatively simple problem after all.

These 2 problems affect the RooChi2Var in exactly the same way as they affected the object returned by createChi2() in my script. But note that the test statistic classes like RooNLLVar or RooChi2Var should not be directly used in user code. They are the RooFit-internal implementation of whatever RooAbsReal the createNLL() and createChi2() methods return.

I mean both v6-28-08 or v6-30. But I’m not sure if there will be a v6-28-08 anytime soon. The v6-30-00 is around the corner (1-2 months), and usually patch releases on older release cycles are much rarer when a new cycle is started.

Also a question to you: I’m curious why you’re doing these chi2 fits? After all, they are only an approximation of a likelihood fit that is only correct in the limit of large statistics. And even then, you might as well do a likelihood fit, no?

Cheers,
Jonas

amarcine · September 12, 2023, 10:06am

Good to know v6-30 is coming soon!

Normally we do nll fits. Originally our nll fits were failing because we had a RooSimultaneous fit on high stats data without using Offset(). We didn’t understand what was happening, so we moved to chi2, but that failed in a different way… Once nll problem was solved I came back to chi2 because it bothered me that I don’t understand what is happening.

I would also like to get chi2 measure, but not from RooPlot (it is said to be less accurate than explicitly using RooChi2Var) and I suspected that whatever is breaking chi2 fit might also break that calculation.

I am confused with " test statistic classes like RooNLLVar or RooChi2Var should not be directly used in user code". That seems to be in contradiction with the note:
https://root.cern/doc/master/classRooPlot.html#ae2bbe62ce38dbf09bc894acbb117e68d

What is the difference between using RooChi2Var vs createChi2()? Furthermore, I need the reduced chi2 while RooChi2Var::getVal() seems to give the non reduced value. Do I have to get the number of bins and the number of free parameters and divide the value or is there some way to tell RooChi2Var to do it internally since it anyway has all the info at hand (the histogram and the pdf to read free parameters from)?

amarcine · September 12, 2023, 10:13am

That reminds me of another question: wouldn’t it be reasonable to make Offset() default for fitTo()? Can it break something? We got pretty confused along the way before we moved to the newer ROOT version which included some more-to-the-point messages with which we went through the forum and got pointed to the Offset().

jonas · September 12, 2023, 11:05am

I see, that makes sense. These very-high-stats fits really result in new numerical challenges in both RooFit and Minuit, so it’s a good idea to also try out the chi2 fit

I would also like to get chi2 measure, but not from RooPlot (it is said to be less accurate than explicitly using RooChi2Var) and I suspected that whatever is breaking chi2 fit might also break that calculation.

The reason for the RooPlot implementation being less accurate is different. It’s because the pdf is sampled at discrete points for the plot, resulting in information loss (the values at the histogram bin centers or integrals over bins that are required for the chi2 are now approximations that it gets from interpolation).

I am confused with " test statistic classes like RooNLLVar or RooChi2Var should not be directly used in user code". That seems to be in contradiction with the note: ROOT: RooPlot Class Reference

Thanks a lot for pointing this out! Then, the note is not up to date anymore. I will also change that in my first PR.

What is the difference between using RooChi2Var vs createChi2()?

The createChi2() supports a few more options than the “raw” RooChi2Var, because it is allowed to return an arbitrary RooAbsReal. For example, if you pass a comma-separated list of fit ranges, it will return a RooAddition of RooChi2Vars. The RooChi2Var itself does not support multi-range fits.

And in the future, if these chi2 fits would become performance bottlenecks, we might also introduce something like the BatchMode() option for createNLL(), which triggers a completely different code path and returning a different class.

Furthermore, I need the reduced chi2 while RooChi2Var::getVal() seems to give the non reduced value. Do I have to get the number of bins and the number of free parameters and divide the value or is there some way to tell RooChi2Var to do it internally since it anyway has all the info at hand (the histogram and the pdf to read free parameters from)?

I there is no option to get the reduced chi2 directly. In fact, I would be hesitant to introduce such an option, because then it invalidates the MINOS/HESSE uncertainty estimation with the +1.0 offset from the minimum (because people would inevitably use such reduced chi2 test statistics also for minimization). One would have to change the Minuit error level, and I’m not sure if this added complexity and potential confusion is worth the number of lines you save by not having to do val / (data.numEntries() - std::unique_ptr{getParameters(data)}.size())). So is it okay if we leave this out of RooFit for now?

That reminds me of another question: wouldn’t it be reasonable to make Offset() default for fitTo()? Can it break something? We got pretty confused along the way before we moved to the newer ROOT version which included some more-to-the-point messages with which we went through the forum and got pointed to the Offset().

Yes, this is something we’re considering in the future. It has not been done so far because of a) low priority (you might as well pass the option) and b) it breaks plenty of reference results in unit tests that we will have to update. But so I don’t completely forget about this, I have added this to the “Ideas for RooFit” thread:

github.com/root-project/root

[RF] Ideas for RooFit

opened 08:13PM - 05 Oct 20 UTC

hageboeck

improvement in:RooFit/RooStats

- [ ] Implement batch eval for Chi2 test stat - [x] Implement recovery from dis…allowed regions for batch eval (https://github.com/root-project/root/pull/6401) - [x] Implement getWeightBatch() and getBatches() for RooDataHist - [ ] Implement getBatch for RooTreeDataStore? - [ ] Don't clear all intermediate values in batch fits between fit cycles. Only the ones that changed. - [ ] Disable recalculateCache etc belonging to Lvl2 optimisation. - [ ] Use batch evaluation & inverted CDF for toys - [ ] Continue to improve interface with variadic templates - [ ] Profile and optimise new Batch interface - [ ] Modernise proxyList member of RooSimultaneous - [ ] Investigate if retrieving batch data with category states is better for batch evaluations. (vs. splitting composite datasets into components, and creating one NLL for each.) - [ ] Continue modernisation of RooSimultaneous. Requires rebasing and fixing an index bug in https://github.com/hageboeck/root/tree/updateRooSimultaneous - [ ] Implement analytical integration of RooJohnson. - [ ] Correct interface of RooAbsData and derived classes to use e.g. `std::size_t` for indexing events. `int` doesn't make sense. - [ ] Always have a debug version of RooFit around with `-DROOFIT_CHECK_CACHED_VALUES`. - [ ] Use analytic integrals in RooBinSamplingPdf when available. - [ ] Check that different integrator settings are honoured in RooBinSamplingPdf. - [ ] https://sft.its.cern.ch/jira/browse/ROOT-8304 - [ ] Implement evaluateSpan() in classes relevant for HistFactory fits. - [ ] Throw Gaussian & Poisson constraints into dedicated fast class. - [ ] Switch on FastEvaluations topic in RooFit message streams, and use it to trace down PDFs that don't implement the faster interface. - [ ] #7210, slowly augment PDFs with checks of the definition range of parameters. This prevents evaluation errors and can stabilise fits. - [ ] #7217, pythonisations for RooFit

amarcine · September 12, 2023, 12:21pm

Thanks for the explanation, I’ll use createChi2() from now on.

So is it okay if we leave this out of RooFit for now?

Sure, it is ok. I was just thinking that maybe I missed something given that RooPlot did offer reduced chi2. I didn’t grasp the problem with minimizing the reduced chi2.

I still have a question to:

std::unique_ptr{getParameters(data)}.size()

First there seem to be some typos → model.getParameters(), ->size(), but more importantly shouldn’t I iterate the list to count how many parameters are actually free if some of them are fixed?

Thanks regarding the possible future of the Offset(). I just tried adding it to the fitTo() in the macro I attached here, but now the nll fit fails… This example is cut out of my real fitting problem where there is one difference with the macro: instead of using the default 1D integrator setup, I am using RooAdaptiveGaussKronrodIntegrator1D with some specific parameters. In my real code the fit with Offset works flawlessly. In the macro with default integrator it fails. See the new macro
test_chi2.C (5.3 KB)
where you can uncomment the TuneIntegrator(); line to cure the fit. Do you understand why the fit fails with default integrator with Offset(), but works without Offset() + default integrator and with Offset() + different integrator?

system · September 26, 2023, 12:21pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.