How to handle negative pdf

Dear experts,

In our search, the signal shape resembles a sine function (see attachment-I), so the pdf can be negative. We noticed that when converting a RooDataHist to a RooHistPdf, all the negative bins are set to zero. We wrote a small macro to prove this, creating a simple histogram with some negative bins:
double content[10] = {10, 4, -0.5, -3.5, -3, -2.8, -1, 0, 3, 0};
The output pdf shape is shown in attachment-II; clearly all negative bins are set to 0. May I ask how to make the pdf support negative values?
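The effect described above can be reproduced outside RooFit. The following is a minimal sketch in plain C++, not the RooHistPdf implementation: the helper name `clampAndNormalise` is hypothetical, and unit bin widths and normalisation to a unit sum are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper, not RooFit code: set negative bins to zero, then
// normalise so that the bin contents sum to one (unit bin width assumed).
std::vector<double> clampAndNormalise(const std::vector<double>& bins) {
    std::vector<double> out(bins.size());
    std::transform(bins.begin(), bins.end(), out.begin(),
                   [](double c) { return std::max(c, 0.0); });

    const double total = std::accumulate(out.begin(), out.end(), 0.0);
    assert(total > 0.0 && "at least one bin must be positive");

    for (double& c : out) c /= total;
    return out;
}
```

With the ten bin contents listed above, only the bins holding 10, 4 and 3 survive the clamping, so the negative lobe of the shape is lost entirely.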

Thanks a lot!
Javier

Hi,

Your total pdf cannot be negative. Maybe you should just do the modelling using a function (a RooAbsReal), which can of course be negative.

Lorenzo

Dear Lorenzo

We are using HistFactory. Would you know how to give a RooAbsReal to HistFactory to describe the signal?

Thanks a lot!
Javier

Hi Javier,

I am sorry, I don’t know this. You would probably need to modify the HistFactory code yourself.

Lorenzo

Dear Lorenzo

I see. Thanks a lot!

Javier

Dear experts,

I am attempting the same type of fit described here, where I am using a sum of RooHistPdf templates within a RooRealSumPdf.

It seems that RooRealSumPdf can handle negative coefficient values, but not negative bin content in the histogram templates - the fit breaks down when I introduce templates which are negative in certain bins, whereas it works fine with templates that are positive everywhere.

My histograms parameterise sine and cosine terms in an angular distribution, and can thus be negative in certain regions. I use histograms rather than a RooGenericPdf function in order to model the angular resolution.

Is there any way to fit using histogram templates with negative bin content in ROOT? The sum of all of my histograms is positive everywhere, so my hope was that something like RooRealSumPdf would work.

Cheers,
Donal

Hello Donal,

the RooHistPdf forces the bin content to zero when it is negative, as @Javier found out. This happens because a probability obviously cannot be negative. I understand that when you add the HistPdf to other PDFs, the total would be a positive PDF, so this can be considered a limitation of the PDF implementation. Disabling this, however, could break things in RooFit that I cannot foresee at the moment.

Edit:
The solution could be this: the RooHistFunc can be used as a replacement for the HistPdf. The difference is that a PDF cannot be negative and is normalised, whereas a function doesn’t need to satisfy either of the two conditions.
You can then combine the HistFunctions using the RooRealSumPdf, yielding a (non-negative, normalised) PDF.

Edit 2:
Note that you only need N-1 coefficients for N functions due to the normalisation constraints. The last coefficient is automatically calculated by RooFit.
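To make the arithmetic concrete, here is a minimal sketch in plain C++ of summing signed templates with N-1 coefficients. It is not RooFit code: `Template` and `sumTemplates` are hypothetical names, unit bin widths are assumed, and the real RooRealSumPdf may organise the calculation differently.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

using Template = std::vector<double>;  // signed bin contents, unit bin width assumed

// Hypothetical helper, not RooFit code: combine N signed templates using
// N-1 coefficients. The last coefficient is taken as 1 - sum(others), and
// the summed shape is normalised to unit integral.
std::vector<double> sumTemplates(const std::vector<Template>& templates,
                                 const std::vector<double>& coeffs) {
    assert(!templates.empty());
    assert(coeffs.size() + 1 == templates.size());
    const std::size_t nBins = templates.front().size();

    std::vector<double> allCoeffs(coeffs);
    allCoeffs.push_back(1.0 - std::accumulate(coeffs.begin(), coeffs.end(), 0.0));

    std::vector<double> sum(nBins, 0.0);
    for (std::size_t i = 0; i < templates.size(); ++i) {
        assert(templates[i].size() == nBins);
        for (std::size_t b = 0; b < nBins; ++b) {
            sum[b] += allCoeffs[i] * templates[i][b];
        }
    }

    const double integral = std::accumulate(sum.begin(), sum.end(), 0.0);
    assert(integral > 0.0);
    for (double& v : sum) {
        // Individual templates may be negative, but the total must not be.
        assert(v >= 0.0 && "combined shape must be non-negative in every bin");
        v /= integral;
    }
    return sum;
}
```

For example, a flat template {1, 1, 1, 1} with coefficient 0.75 and an oscillating template {1, -1, 1, -1} with the implied coefficient 0.25 sum to {1, 0.5, 1, 0.5}, which is positive everywhere even though one component is not.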

Hi Stephan,

Thanks a lot for the suggestion, it sounds promising! I’ll give it a try and report back.

Cheers,
Donal

Hi again,

@StephanH you were correct, RooHistFunc has done exactly what I needed! The fit successfully runs with component histograms which have negative bin content.

Also worth pointing out that RooRealSumPdf seems to work well for me with N input functions with (N-1) coefficients, whereas it doesn’t behave so well with N functions and N coefficients.

Cheers,
Donal

Hi Donal,

Yes, that makes sense. I actually touched the documentation yesterday, when I looked it up.
The reason for N-1 coefficients is that due to the normalisation condition (this is a constraint for the coefficients), the last coefficient will be calculated from the other coefficients, such that the overall PDF can be normalised by rescaling the coefficients.
I don’t know what the PDF does if you provide N coefficients. Could you elaborate? Maybe I should change it such that it issues a warning or refuses to use N coefficients.

Hi Stephan,

The most stable approach I have found is to manually specify the Nth coefficient as (1 - all others) within a RooFormulaVar. This doesn’t give me the “[#0] WARNING:Eval – RooRealSumPdf::evaluate(pdf WARNING: sum of FUNC coefficients not in range [0-1], value=1.0001” warning that using N functions and N-1 coefficients gives. The warning doesn’t always prevent the fit from converging, but sometimes it does. I don’t have any such issues when using the RooFormulaVar as my Nth coefficient.

When I used N functions and N RooRealVar coefficients, the coefficient parameters would reach limits and the fit would not converge. I guess the RooRealSumPdf just wasn’t properly normalised in such an instance?

Cheers,
Donal

Yes, when the coefficients go out of control, it’s not being normalised properly. That, however, should not happen with N-1 coefficients, because 1-sum(coeff) is precisely what should be done internally.
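The 1 - sum(coeff) rule, together with the range check suggested by the wording of the warning quoted earlier, can be sketched in plain C++ as follows. This is an illustration only: `LastCoeff` and `lastCoefficient` are hypothetical names, and the exact condition RooFit applies internally is an assumption based on the message text.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical result type: the implied last coefficient and whether the
// sum of the explicit coefficients lies in [0, 1].
struct LastCoeff {
    double value;
    bool inRange;
};

// Hypothetical helper, not RooFit code: derive the Nth coefficient from the
// N-1 explicit ones and flag sums outside [0, 1] (assumed warning condition).
LastCoeff lastCoefficient(const std::vector<double>& coeffs) {
    assert(!coeffs.empty());
    const double sum = std::accumulate(coeffs.begin(), coeffs.end(), 0.0);
    return {1.0 - sum, sum >= 0.0 && sum <= 1.0};
}
```

A sum of 1.0001, as in the quoted message, gives a slightly negative last coefficient and fails the check, which is consistent with the fit occasionally straying just past the boundary.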

Hi @StephanH,

Thanks again for your help with RooRealSumPdf. I have since been trying to use 4D RooHistFuncs as input to RooRealSumPdf (extending my work from 3D), but I am having issues with the PDF normalisation. I have attached a short test script which:

  1. Reads in 12 component RooHistFuncs from ROOT files
  2. Adds them together in a RooRealSumPdf using N - 1 coefficients (12th coeff is required to be 1 - all others)
  3. Generates a toy dataset from the RooRealSumPdf
  4. Fits the toy data with the RooRealSumPdf and plots the results

I see that the plotted PDF is not correctly normalised, even though the RooFitResult itself doesn’t look so bad. I also saw the same issue with a 3D RooRealSumPdf, but in that case I fixed it by doing:

pdf.forceNumInt()

This is the same logic that HistFactory uses internally. Unfortunately using this line leads to errors when I use a 4D RooRealSumPdf. Any advice as to what might be going wrong would be very helpful!

Cheers,
Donal

4D_RooRealSumPdf_Test.zip (180.4 KB)

Hello Donal,

I guess the problem is the RooBinIntegrator. It looks like it only handles 1, 2, and 3-dimensional distributions.

  • You could try other integrators (that are probably much slower, because they don’t understand that the distribution is binned). Possible options (untested):
    – RooAdaptiveIntegratorND
    – RooMCIntegrator
  • You could try to extend the bin integrator to N dimensions, and maybe create a pull request if you succeed. It looks like you “only” need to reproduce the code that is already there for 1 to 3 dimensions.
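As a rough picture of what a dimension-independent bin integration involves, here is a minimal sketch in plain C++. It is not the RooBinIntegrator code: `Axis` and `binIntegral` are hypothetical names, and uniform binning with one stored value per bin is assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical description of one uniformly binned axis.
struct Axis {
    double lo;
    double hi;
    std::size_t nBins;
};

// Hypothetical helper: integral of a piecewise-constant function on a
// regular grid in any number of dimensions. `contents` holds one value per
// bin in flattened order; with uniform binning every bin has the same
// volume, so the ordering does not affect the result.
double binIntegral(const std::vector<Axis>& axes,
                   const std::vector<double>& contents) {
    std::size_t nTotal = 1;
    double binVolume = 1.0;
    for (const Axis& a : axes) {
        assert(a.nBins > 0 && a.hi > a.lo);
        nTotal *= a.nBins;
        binVolume *= (a.hi - a.lo) / static_cast<double>(a.nBins);
    }
    assert(contents.size() == nTotal);
    return binVolume * std::accumulate(contents.begin(), contents.end(), 0.0);
}
```

For a 4D grid with two bins per axis on [0, 1], sixteen bins of content 1 give an integral of 1. Negative bin contents are handled without special treatment, since only the total has to be positive.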

I created a JIRA ticket about the bin integrator:
https://sft.its.cern.ch/jira/browse/ROOT-9996

I cannot dedicate any time to this before the end of March, unfortunately.

Hi @StephanH,

Thanks a lot for the reply! I tried RooAdaptiveIntegratorND and RooMCIntegrator, but I see the same behaviour when fitting and plotting with those. They also give the same errors as RooBinIntegrator when I use pdf.forceNumInt(). So the problem is two-fold:

  1. Without pdf.forceNumInt(), the RooRealSumPdf doesn’t look correctly normalised.
  2. With pdf.forceNumInt(), it returns errors in 4D.

To reiterate, a 3D RooRealSumPdf also requires forceNumInt() in order to display correctly, and there it works. It seems that the extension to 4D using forceNumInt() is the source of the issue.

I will keep an eye on the JIRA - thanks for setting it up!

Cheers,
Donal