RooFit: PDFs with different domains in the same model (e.g. Gaussian + lognormal)

Hi all,

I would like to ask how to make (and then fit to data) a RooFit model that combines PDFs with different domains for the independent variable: for example, a simple RooAddPdf between a RooLognormal, which is defined only for positive x, and a Gaussian that is defined on the whole real axis. Of course I am interested in the case where the Gaussian sits basically around x=0 and extends to negative x:

As you see, I have no problem making gauss + gauss, but as soon as I try to model the red component with the RooFit log-normal PDF (which in theory should better describe the physical process), I start to get complaints that this component is being evaluated below zero, even if I restrict the fit to positive values (notice in the picture that the blue line starts from 1.0 also in the two-Gaussian example).

I have no problem plotting, generating and fitting a RooLognormal alone (well, after some learning to figure out reasonable choices for the median and scale parameters); the trouble really starts when I mix it with other distributions.

Looking at the code ( ROOT: roofit/roofit/src/RooLognormal.cxx Source File ), for the PDF evaluation RooFit relies on ROOT::Math::log_normal_pdf, which sets the PDF to zero for x<0 ( ROOT: math/mathcore/inc/Math/PdfFuncMathCore.h Source File ), but maybe RooFit intercepts this, because I understood that for the logL calculation it does not want zero probability.

So how do you deal with such cases? It seems fairly common to me.

I read that for RooFFTConvPdf RooFit implements a shift of one PDF w.r.t. the other. Can I try something along the same lines? Maybe the Gaussian could depend on a different variable that is just shifted, or something similar?

Any suggestion is really appreciated!

Hi @malfonsi79 , thanks for reporting your issue on the forum! Maybe @jonas can help?

Hi all,

Meanwhile I discovered that what is causing the problem is the provided analytical integration method, which does not support a negative domain even though the evaluate method does.

In fact, if I modify it with this workaround:

#include <cmath>

#include "RooLognormal.h"
#include "RooMath.h"

class AdHocLognormal : public RooLognormal
{
public:
  AdHocLognormal( const char *name, const char *title, 
                  RooAbsReal &_x, RooAbsReal &_m0, RooAbsReal &_k)
  : RooLognormal{name, title, _x, _m0, _k}
  { }
  AdHocLognormal(const AdHocLognormal &other, const char *name = nullptr)
  : RooLognormal{other, name}
  { }
  TObject *clone(const char *newname) const override { return new AdHocLognormal(*this, newname); }

  // Correct this calculation to allow for negative range extension
  double analyticalIntegral(Int_t code, const char *rangeName = nullptr) const override
  {
    static const double root2 = std::sqrt(2.);
        
    double ln_k = std::abs(_useStandardParametrization ? k : std::log(k));
    double scaledMax = _useStandardParametrization ? std::log(x.max(rangeName)) - m0 : std::log(x.max(rangeName) / m0);
    if ( x.min(rangeName) <= 0. )  // <--- here is the modification
      return 0.5 * (RooMath::erf(scaledMax / (root2 * ln_k)) + 1.);
    double scaledMin = _useStandardParametrization ? std::log(x.min(rangeName)) - m0 : std::log(x.min(rangeName) / m0);
    return 0.5 * (RooMath::erf(scaledMax / (root2 * ln_k)) - RooMath::erf(scaledMin / (root2 * ln_k)));
  }
};

I can have a functional lognormal distribution valid on the whole real axis: the probability density is just zero on the negative half-axis. Likewise, as soon as the range is at least partially positive, the normalization can still be calculated analytically; the negative fraction just adds 0 to the integral.

I verified that the code can (see picture below) plot the lognormal over a partially negative range (dashed blue), convolve it with the Gaussian (dashed green) giving the blue curve, convolve it with a clone of itself (dashed violet) and then also with the Gaussian (violet). I can also combine all of them in a RooAddPdf (not in the plot, but I can add it).

So now the question is: was there an oversight in the analytical integral code, or is there a specific theoretical/statistical reason to forbid the definition over the full real axis? (You know, you get a warning when you define the PDF that the “safe” range is ]0;+inf[.)

Thanks again for any feedback,
Matteo

Thanks a lot for the investigation! No, there is no good reason for that. In fact, it is inconsistent with other PDFs where the situation is similar, like RooLandau.

I’m fixing it and will make sure that the fix also lands in the 6.38.02 patch release:

Cheers,
Jonas

Ok, great, I am happy to have been useful.

Strictly related: RooGamma also complains about negative x, although it returns 0 anyway. The source is TMath::GammaDist, which does that.

However, I would like to ask for more help, because a similar problem seems to occur again during the fitting procedure. This is a long post summarizing my attempts and investigations of the last weeks, but coming just before the Christmas holiday it is meant to be discussed (hopefully) only next year, when I am also back from vacation (I will make sure that it does not get closed in the first week of January).

I have built up a model composing either a set of log-normal distributions or a set of gamma distributions (convolved with a resolution Gaussian, etc.):

// Model for the case of log-normal peaks model
RooRealVar nbkg("nbkg","background events",300000,1.,1000000.) ;
RooRealVar nsig("nsig","signal events",15000,1.,100000.) ;
RooRealVar fdouble("fdouble","fraction of double to single events",0.02,0.001,0.1) ;
RooFormulaVar ndouble("ndouble","nsig*fdouble",{nsig,fdouble}) ;
RooRealVar fd1miss("fd1miss","fraction of d1 miss to single events",0.3,0.001,0.8) ;
RooFormulaVar nd1miss("nd1miss","nsig*fd1miss",{nsig,fd1miss}) ;
RooAddPdf model("model","model",
                RooArgList(gauss,lnxg,lnxlnxg,lognormal_u),
                RooArgList(nbkg,nsig,ndouble,nd1miss)) ;

From the “old 2009” RooFit manual, I understand that when the number of coefficients is the same as the number of component PDFs, the model is assumed to be an extended PDF model, with the coefficients used as event counts rather than probability fractions. Therefore an extended likelihood is built for fitting, which is fine: it is what I want.

I believe that I have set reasonable starting values:

and I have set small errors on the parameters to ensure that Minuit does not immediately jump away from this point.

However any fit attempt

std::unique_ptr<RooFitResult> roofitres {
    model.fitTo(spe, RooFit::Save() , RooFit::Range( -0.5,5.5) )
};

fails with a long list of:

Returning maximum FCN so far (-inf) to force MIGRAD to back out of this region. Error log follows.
Parameter values: 	fd1miss=0.7	fdouble=0.05	medianln=1.35	nbkg=300000	nsig=50000	ratio_expectval_ln=0.5	sg=0.03	shapeln=1.3	shapeln_u=1.8
RooAbsPdf::lognormal_u_over_lognormal_u_Int[x][ numerator=lognormal_u denominator=lognormal_u_Int[x] ]
     p.d.f value is Not-a-Number @ numerator=lognormal_u=7.61465e-06, denominator=lognormal_u_Int[x]=0.999996   
     [ ... same lines ... ]
RooAddPdf::model[ nbkg * gauss_over_gauss_Int[x] + nsig * lnxg_over_lnxg_Int[x] + ndouble * lnxlnxg_over_lnxlnxg_Int[x] + nd1miss * lognormal_u_over_lognormal_u_Int[x] ]
     getLogVal() top-level p.d.f evaluates to NaN @ !refCoefNorm=(x = 7.5), !pdfs=(gauss_over_gauss_Int[x] = 0,lnxg_over_lnxg_Int[x] = 1.07732e-10,lnxlnxg_over_lnxlnxg_Int[x] = 1.8821e-07,lognormal_u_over_lognormal_u_Int[x] = 7.61469e-06), !coefficients=(nbkg = 300000 +/- 1000,nsig = 50000 +/- 1000,ndouble = 2500,nd1miss = 35000)
    [ ... same lines ... ]

My understanding of this log is that already at the starting parameter values the NLL cannot be calculated, or maybe it is log(0). Maybe lines 5-8 tell that lognormal_u is the culprit, but I do not understand what it means with the numerator and denominator values (they seem valid numbers, so the ratio should be valid too). Also !refCoefNorm=(x = 7.5) seems to suggest that the calculations are made over the full range, despite the requested fit range being up to 5.5 …

To understand what is happening, I build and inspect the constructed NLL directly:

auto model_nll = model.createNLL(spe, RooFit::Range(-0.5,5.5), RooFit::Extended(true));
//I tried various ranges and the Extended option true and false, although from the manual I got that the default is Extended because the composite pdf is extended
model_nll->getVal()

The last line, which should give me the NLL at the current (starting) values of the parameters, produces the same errors.

However, the model has valid and non-zero values over the full range used:

for (double xval = -1. ; xval < 8. ; xval += 0.05) {
    x = xval;
    RooArgSet norm_set{x}; // not sure how to indicate a different x range here
    cout << model.getVal({x}) << "\t" << model.getLogVal(&norm_set) << endl;
}

So I have to understand how this binned log likelihood is calculated (continued in the next post).

(continued from the previous post)

It is difficult to track down the piece of code that actually calculates the NLL for my case because, in the end, the classes delegate the effective calculation to free functions in some implementation namespace, and Doxygen is not so good at creating the cross-links. There are classes named RooNLLVar and RooNLLVarNew, but in the end they were not the ones used (they seem to serve some other purpose).

My understanding is that createNLL() calls the specific createNLLImpl() of the PDF class. RooAddPdf does not override the basic behavior of RooAbsPdf, which calls the free function RooFit::FitHelpers::createNLL(). [Note: older ROOT versions - like the one on my old laptop, which was driving me crazy - were doing this differently, but let's focus on the latest version.]

Inspecting the FitHelpers.cxx file (here is where Doxygen fails at cross-linking), you find that this call builds a RooFit::TestStatistics::RooRealL object, which embeds (and whose evaluate() uses) a RooFit::TestStatistics::RooBinnedL instance. This is, finally, the function where the logs of the probabilities are accumulated, see ROOT: roofit/roofitcore/src/TestStatistics/RooBinnedL.cxx Source File

Here the only special error message is NOT the message that I receive, so the error should really come from the evaluation of the PDF - note that the PDF is not evaluated in log space. But this contradicts my last check in the previous post, where I verified that the composite PDF is positive over the full range (I repeated the test with a step smaller than the histogram bin width).

After the holidays I will also investigate RooAddPdf on my own; I am not sure I have fully understood how these coefficients are managed.

Merry Christmas and Happy discussion on Roofit in the New Year :wink:

Matteo