In the attached example, you’ll find a model consisting of double gaussian (RooAddPdf) in variable “x” and gaussian-convolved exponential*erf (RooFFTConvPdf) in variable “y”. There is no cross-talk between x- and y- dependent components, moreover, parameters of the x-dependent double gaussian are fixed. I form a RooProdPdf from the x- and y- dependent components, and generate a toy sample in (x, y). Then I fit it with (1) only the y-dependent RooFFTConvPdf and get sensible result. Next, I restore the parameters and fit the same toy with (2) the product of RooAddPdf and RooFFTConvPdf. This leads to failure. In third step, I form RooProdPdf from only one gaussian (RooGaussian) and the RooFFTConvPdf, generate another toy and fit it with this product (3), which also seems to work fine.
I’m therefore wondering, if there could still be some hostility between RooAddPdf and RooFFTConvPdf when combined in RooProdPdf? I’ve tried this in various versions of root including newest 6.26, so any fixes introduced after the JIRA ticket linked above should be included. Or maybe I’m blaming wrong suspect, and the problem is somewhere else?
On a separate note: in the last fitting step, I try to remove RooFit::Range(“fullRange”) from the fitTo call in the 1-D fit performed in step (1), and the fit still converges but to very different result than (1), even though “fullRange” simply covers the entire range of variables x and y. Could you please explain why that happens? I would expect identical result in this case. example_RooFFTConvPdf_product.cxx (3.1 KB)
Hi! I see several problems with your script, maybe fixing them resolved your problems:
Your RooAddPdf is undefined, as you are using the constructor without coefficients. This constructor skips all non-extendible PDFs, which the Gaussians are, so your RooAddPdf has no summands and that’s why I’m not too surprised there are problems. You already have a coefficient variable defined in the previous line, why not use it to make the PDF valid again:
I think you have an accidental double-negative in your exponential slope parameter. Your formula goes TMath::Exp(-@0/@3) with a minus sign, so the parameter has to be positive. However, when you reset the fit parameters, you set it to a negative value. Probably the fit has a hard time figuring out the sign is wrong, hence the bad convergence.
You should restrict the ranges of your parameters more. For example, your sigma parameters should only be defined in the >= 0.0 regime.
Your observable range for y is too restrictive. Your exp_erf_raw has mean 0.12 and sigma 1.2, so it is nonzero also below zero which is outside your observable definition range of 0 to 12. This is not good for the convolution, because it needs to integrate over the full PDF support range which doesn’t work if it extends beyond the observable range defined in RooFit. I would try to set the y range to -4 to 12. Probably this fixes also the problem you get when fixing the range as a fit parameter, because then the RooFFTConvPDF doesn’t have to sample the PDF outside the official definition range itself, which is probably going wrong in different ways when you have custom fit ranges.
Let me know if things look better after making these changes, if not feel free to re-attach the updated script again so we can investigate!
Regarding 1: You are absolutely right. I tried to make this example by stripping off unnecessary parts of much larger code, and made this mistake. I fixed it in what follows. Now the behaviour is actually closer to what I’ve observed in my larger code: the fit using RooProdPdf converges, but to rather significantly different values than when fitting the RooFFTConvPdf alone, although the likelihoods differ only by a multiplicative constant.
Regarding 2: True, but that only affected the first fit (of RooFFTConvPdf only), which converged properly in the end. The resets already set positive value. I fixed the initialization though, so in what follows, even the first fit will start from positive value.
Regarding 3: I cannot 100% agree here, for several reasons. Ideally, there would be no bounds to any parameters whatsoever, I therefore prefer to keep my variable ranges as wide as possible, including unphysical values. On one hand, that serves as indication that my likelihood and fit procedure is healthy and robust, on the other, I’m avoiding bias introduced by smaller range. Also, fit results with unphysical values can still be statistically relevant (but of course you’re right this is not the case with sigmas < 0). Also, this cannot explain why fits with essentially identical likelihoods give different results, so I didn’t change this in the attached sript, just so it doesn’t outweigh the effect I’m seeing by chance.
Regarding 4: This sounds like a very good point - I could build the convolution in larger range to have it constructed correctly and then only fit it in the desired 0-12 range (thanks for the tip!). But this cannot explain the problem I’m having (as described in point 1): the fit converging to different result when I build the likelihood with RooProdPdf instead of RooFFTConvPdf alone.
In summary: the fit result is different based on whether I fit the RooFFTConvPdf alone, or in product with completely frozen pdf.
Regarding 3: I cannot 100% agree here, for several reasons. Ideally, there would be no bounds to any parameters whatsoever, I therefore prefer to keep my variable ranges as wide as possible, including unphysical values. On one hand, that serves as indication that my likelihood and fit procedure is healthy and robust, on the other, I’m avoiding bias introduced by smaller range.
It’s unreasonable to expect that the fit is robust for any parameter range, especially not the phase space where the PDF in your fit region is zero. Then also the gradient of the likelihood is zero and it should not be expected that the minimization works. So the rule of thumb is: the parameter ranges should be chosen at least as restrictive as necessary for the PDF to not become zero in the whole fit range.
The parameter range thing is however not the only issue that keeps your fit from converging. And it needs to converge before one can claim that there is a RooFit class that is not working, because in a non-convergent fit the results are arbitrary anyway (or, from a different angle, it’s not a good time investment to fix bugs that only affect non-convergent fits). The other problem in the model is that your parameters erf_mean and res_mean are completely correlated: both are shifting the PDF on the x-axis. The fit result of models with 100 % correlated parameters is not deterministic, so this needs to be fixed too. As the name “resolution” suggests that the Gaussian res is for smearing with a sigma that corresponds to the resolution, I would fix the mean to zero and absorb the former res_mean in the erf_mean parameter (i.e. adding its former value to the initial value of erf_mean).
Finally, there was another problem that I forgot to mention in my first post. When you reset the fit parameters to get the same initial state for the fits, you also have to reset the parameter errors to zero because the errors are used for the initial step size in the minimization.
In summary, when I do these 3 things with your script, the results result_1 and result_2 are identical:
Restrict parameters such that PDF can’t become zero in the fit region
thank you for your answer. You’re right about the correlated parameters.
However, when I try to run the updated version 3 of the script you’ve enclosed, I’m still getting very different behaviour in the two cases: the product fit does not converge (and the fit results are thus very different). The only thing I’ve added to your version was #include RooConstVar.h, so that I can run it compiled (however, both compiled and interpreted way gives the same behaviour). I enclose screenshot of the two RooFitResult printouts.
I’ve tried this in both my favourtie root 6.18 and the currently recommended 6.26, both on lxplus, and the behaviour I see is the same in both versions.
In what root version did you get identical behaviour, please?
I don’t do anything except for connecting to lxplus, sourcing a given ROOT release and run the script I updated. Maybe there are some other things in your environment that could have an influence on the fit? Did you customize your .rootrc or .root_logon files, and you’re sure that 6.26 is sourced correctly? I hope we can still figure this out!
Ah, actually I get the same fit failure when I source 6.26.04 from cvmfs!
So here is what I observer:
pre 6.24.00: fit failure with wrong fit result
6.24.00 to 6.26.02: fit succeeds
6.26.04: fit failure, but this time a different failure where also the NLL is zero for some reason
6.26.06: works again
The second problem that you observe in 6.26.04 is related to the RooAddPdf, because I fixed another bug related to it but the fix was wrong, hence the new bug in 6.26.04. Someone else already reported it then, so I could fix it in 6.26.06 again. So the bug you see is actually not related to the RoFFTConvPdf, but to the RooAddPdf.
So you should be able to use all ROOT releases including and after 6.24.00, except for 6.26.04.
Sorry for the inconvenience! I hope it’s not too much of a problem for you that you can’t use 6.26.04.