Long fit time for a convolved PDF

hym · July 2, 2020, 5:47pm

Hello,
I have a question regarding convolution and fitting. I have a RooAddPdf made of 3 RooGenericPdf components. When I convolve it with another pdf and fit it to the data, the fit takes around 2-3 hours to converge, while if I fit the RooAddPdf to the same data without the convolution, it converges almost instantly.
Could you suggest a possible solution to this problem?

thanks
Himanshu

jblomer · July 3, 2020, 8:13am

@StephanH Can you give a suggestion?

StephanH · July 3, 2020, 9:02am

How do you run the convolution? If you use
Conv(a, b) + Conv(c, b) + Conv(d, b), use the distributivity of the convolution to reorder it to
Conv(a + c + d, b).
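
To make the reordering concrete, here is a minimal sketch in plain Python (not RooFit code). The binned shapes `a`, `c`, `d` and the resolution `b` are made-up numbers, and `convolve` is a hypothetical helper written only for this illustration. It checks that three separate convolutions summed together give the same result as a single convolution of the summed components.

```python
def convolve(f, g):
    """Full discrete convolution of two sequences of bin contents."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out

# Hypothetical binned shapes standing in for the three components
# and for the resolution function b.
a = [0.0, 1.0, 3.0, 1.0, 0.0]
c = [2.0, 1.0, 0.5, 0.25, 0.0]
d = [0.0, 0.0, 1.0, 2.0, 4.0]
b = [0.25, 0.5, 0.25]

# Three convolutions, then a sum.
separate = [x + y + z
            for x, y, z in zip(convolve(a, b), convolve(c, b), convolve(d, b))]

# One sum, then a single convolution.
total = [x + y + z for x, y, z in zip(a, c, d)]
combined = convolve(total, b)

max_diff = max(abs(s - t) for s, t in zip(separate, combined))
print("largest difference between the two orderings:", max_diff)
```

The same linearity holds when each component carries a coefficient, so a weighted sum can be convolved once as well.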

Further, if you use RooGenericPdf, each PDF has to run a numeric integral to normalise its function. Do the components need to be normalised before they are summed? If not, we can optimise quite a bit.
The computation that runs is something like:

for term in sum:
  for bin in term:
    - evaluate function
      for function value in range of function:
        - eval function
        - run numeric integral of function to normalise
          - (evaluate function at multiple points to get integral)
    - do Fourier transformation of function
    - multiply functions in Fourier space
    - do backtransformation

You can see that this can quickly get out of hand.

So, if your components don’t need to be normalised before summing them, convert the RooGenericPdf objects to RooFormulaVar. These do not run normalisations; they only evaluate your function.
Then, use a RooRealSumPdf to sum them. In contrast to RooAddPdf, it accepts any function as a term, not only PDFs. That means that the terms don’t need to be normalised.
After summing all the terms with the RooRealSumPdf (which will run only one numeric integral to normalise the sum of your three functions), do the convolution.

This way, you run only one numeric integral and one numeric convolution per step instead of three of each.
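
One thing to keep in mind when dropping the per-component normalisation: the coefficients of an unnormalised sum are not the same numbers as the fractions of individually normalised components. The following plain-Python sketch (not RooFit code; the shapes, coefficients, and the `integrate` helper are hypothetical) normalises a weighted sum with a single numeric integral and then works out which fractions the same shape would correspond to if each component were normalised on its own.

```python
import math

def integrate(f, lo, hi, n=2000):
    """Midpoint-rule integral, standing in for a numeric integrator."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

lo, hi = 0.0, 10.0

# Hypothetical unnormalised component shapes and their coefficients.
shapes = [
    lambda x: math.exp(-0.3 * x),
    lambda x: 1.0 + 0.1 * x,
    lambda x: math.exp(-0.5 * (x - 5.0) ** 2),
]
coefs = [2.0, 0.5, 3.0]

def raw_sum(x):
    """Coefficient-weighted sum of the unnormalised shapes."""
    return sum(c * f(x) for c, f in zip(coefs, shapes))

# A single numeric integral normalises the whole sum.
norm = integrate(raw_sum, lo, hi)

def sum_pdf(x):
    return raw_sum(x) / norm

# For comparison only: the fractions that individually normalised
# components would need in order to describe the same total shape.
integrals = [integrate(f, lo, hi) for f in shapes]
fractions = [c * i / norm for c, i in zip(coefs, integrals)]

print("integral of the summed pdf:", integrate(sum_pdf, lo, hi))
print("equivalent fractions:", fractions, "sum:", sum(fractions))
```

If the fit result is interpreted in terms of component fractions, the fitted coefficients have to be converted in this way after the fit.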

Also, do you use the FFT convolution? With how many bins? More bins make it slower but more accurate. Read the documentation on the FFT and the number of bins to fine-tune this. But do it after you have reordered the computations, because that might make the fit so fast that the number of bins doesn’t matter.
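
As a rough illustration of the accuracy side of that trade-off, the sketch below uses plain Python and a direct binned sum rather than an FFT, so it is not what RooFit does internally. It smears a hypothetical box shape with a Gaussian and compares the binned result at one point with the exact value, for the bin counts mentioned in this thread. All constants are assumptions chosen for the example.

```python
import math

def gauss(x, sigma):
    """Normalised Gaussian centred at zero."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical setup: a box on [2, 6] smeared by a Gaussian of width 0.5,
# evaluated at x0 = 2.3, with the observable ranging over [0, 10].
LO, HI = 0.0, 10.0
BOX_LO, BOX_HI = 2.0, 6.0
SIGMA, X0 = 0.5, 2.3

def exact():
    """Closed-form value of the box convolved with the Gaussian at X0."""
    s = SIGMA * math.sqrt(2.0)
    return 0.5 * (math.erf((X0 - BOX_LO) / s) - math.erf((X0 - BOX_HI) / s))

def binned(n_bins):
    """Same convolution, approximated by sampling at bin centres."""
    h = (HI - LO) / n_bins
    first = round((BOX_LO - LO) / h)  # first bin inside the box
    last = round((BOX_HI - LO) / h)   # one past the last bin inside the box
    return h * sum(gauss(X0 - (LO + (i + 0.5) * h), SIGMA)
                   for i in range(first, last))

errors = {n: abs(binned(n) - exact()) for n in (10, 100, 500)}
for n, err in errors.items():
    print(f"{n:4d} bins: absolute error {err:.2e}")
```

The error shrinks as the number of bins grows, while the work per evaluation grows with it.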

hym · July 6, 2020, 2:38pm

Thank you @StephanH for the nice and informative answer.
I tried the RooFormulaVar and the distributive property of the convolution, but when I chi2-fit it to the data, it runs for a long time and repeatedly prints the following errors:

[#0] ERROR:Eval -- RooChi2Var::RooChi2Var(chi2_convpdf_xbkg_Xbkg) INFINITY ERROR: bin 0 has zero error
[#0] ERROR:Eval -- RooMath::interpolate ERROR: zero distance between points not allowed

If I draw the functions without fitting, they look as in the image, and they look fine to me:

Red - sum (a + b + c)
Blue - Resolution function ( R )
Green - sum * Resolution function ( (a+b+c) * R ) = convoluted function

When fitting to the distribution I use the setBins() option, and I have tried 10, 100, and 500 bins.

Are there any suggestions I should try?

Thanks

StephanH · July 7, 2020, 7:20am

I can imagine what the zero error is:
Your fit functions may fall to zero, and therefore the error may also be zero. Have you tried fitTo instead of chi2FitTo or have you tried to limit the range of the fit variable? After all, it seems to be only in bin 0.
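
To see why a single bin can break a chi-square fit, here is a minimal plain-Python sketch (not RooFit code). The bin contents are invented, and the variance is taken from the expected yield, which is an assumption made for this illustration. A bin whose error is zero makes the chi-square infinite, and leaving that bin out of the fit range gives a finite value again.

```python
import math

# Hypothetical bin contents: the model falls to zero in bin 0.
observed = [0, 4, 9, 16, 7]
expected = [0.0, 5.0, 10.0, 14.0, 6.0]

def chi2(obs, exp, first_bin=0):
    """Chi-square with the per-bin variance taken from the expected yield."""
    total = 0.0
    for n, mu in list(zip(obs, exp))[first_bin:]:
        variance = mu  # assumption: Poisson variance from the expectation
        if variance == 0.0:
            # Zero error: the term (n - mu)^2 / variance is undefined.
            return math.inf
        total += (n - mu) ** 2 / variance
    return total

full_range = chi2(observed, expected)
restricted = chi2(observed, expected, first_bin=1)

print("chi2 over all bins:     ", full_range)
print("chi2 without bin 0:     ", restricted)
```

A likelihood fit does not divide by a per-bin error, which is why it does not run into this particular problem.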
