Roofit: PDFs with different domain in the same model (e.g. gauss + lognormal)

Hi all,

I would like to ask how to make (and then fit to data) a Roofit model that combines model with different domain for the independent variable. For example a simple RooAddPdf between a RooLognormal, which is defined only for positive x, and a gaussian that is defined in the whole axis. Of course I am looking for the case when the gaussian is basically around x=0 and extends to negative x:

As you see I have no problem to make gauss + gauss, but as soon as I try to model the red component with the Roofit log-normal pdf (which should better describe the physical process in theory), I start to get complains that this component is attempted to be evaluated below zero, even if I restrict the fit to positive value (you notice in the picture that the blue line starts from 1.0 also in the 2 gaussian example).

I have no problem (well, after some learning to figure out some reasonable choice of the median and scale parameter) to plot, generate and fit a RooLognormal alone, it is really when I start to mix up with other distributions.

Looking at the code ( ROOT: roofit/roofit/src/RooLognormal.cxx Source File ) Roofit relies for the pdf evaluation on ROOT::Math::log_normal_pdf, which puts to zero the pdf for x<0 ( ROOT: math/mathcore/inc/Math/PdfFuncMathCore.h Source File ), but maybe Roofit intercepts this, because I understood that for the logL calculation it does not want zero probability.

So how do you deal with such cases? It seems fairly common to me.

I read that for the FFTConvPdf Roofit implements a shift of a pdf w.r.t. to the other. Can I try something on the same track? Maybe the gaussian depends on a different variable that is just shifted or so?

Any suggestion is really appreciated!

Hi @malfonsi79 , thanks for reporting your issue on the forum! Maybe @jonas can maybe help?

Hi all,

Meanwhile I discovered that what is giving problem is the analytical integration method provided, which does not support negative domain even though the evaluate method supports it.

In fact if I modify it with this workaround:

class AdHocLognormal : public RooLognormal
{
public:
  AdHocLognormal( const char *name, const char *title, 
                  RooAbsReal &_x, RooAbsReal &_m0, RooAbsReal &_k)
  : RooLognormal{name, title, _x, _m0, _k}
  { }
  AdHocLognormal(const AdHocLognormal &other, const char *name = nullptr)
  : RooLognormal{other, name}
  { }
  TObject *clone(const char *newname) const override { return new AdHocLognormal(*this, newname); }

  // Correct this calculation to allow for negative range extension
  double analyticalIntegral(Int_t code, const char *rangeName = nullptr) const override
  {
    static const double root2 = std::sqrt(2.);
    bool _useStandardParametrization = false;
    
    double ln_k = std::abs(_useStandardParametrization ? k : std::log(k));
    double scaledMax = _useStandardParametrization ? std::log(x.max(rangeName)) - m0 : std::log(x.max(rangeName) / m0);
    if ( x.min(rangeName) <= 0. )  // <--- here is the modification
      return 0.5 * (RooMath::erf(scaledMax / (root2 * ln_k)) + 1.);
    double scaledMin = _useStandardParametrization ? std::log(x.min(rangeName)) - m0 : std::log(x.min(rangeName) / m0);
    return 0.5 * (RooMath::erf(scaledMax / (root2 * ln_k)) - RooMath::erf(scaledMin / (root2 * ln_k)));
  }
};

I can have a functional lognormal distribution valid on the whole real axis, probability density is just zero in the negative half of the axis. Likewise, as soon as the range is at least partially positive, also the normalization can be calculated analytically, the negative fraction just adds 0 to the integral.

I verified that the code can (see picture below) plot the lognormal with a negative range (dashed blue), convolve (blue) it with the (dash green) gaussian, convolve it with a clone of itself (dash violet) and then also with the gaussian (violet). I can also combine all of them in a RooAddPdf (not in the plot, I can add it).

So now the question is: was there an oversight in the analytical integral code, or is there a specific theoretical/statistics reason why you want to forbid the definition over the full real axis? (You know, you get the warning when you define the Pdf that the “safe” range is ]0;+inf[ )

Thanks again for any feedback,
Matteo