Plotting sum of extended pdfs

Dear all,

I have been trying to plot the sum of individual extended pdfs. The fit has converged successfully, and the individual pdfs also give a nice plot on their separate datasets, but during plotting I am unable to get a clear plot.
log.txt (256.9 KB)
Please find the attached log file, the images, and the minimal code.
minimal_code.txt (952 Bytes)

The images are in this order: only the pdfs, only the dataset points, both together.

Any help would be really appreciated!
Thanks in advance!

tagging @jonas

Hello,

Thanks for the post.
We are maybe still missing a minimal standalone reproducer: could you post it?

Best,
D

Hello, thank you for the reply!

Please find the sample.root file (a RooWorkspace) and the minimal Python code attached here in a zip file
sample_1.zip (527.0 KB)

thank you!

typing here to keep this space alive

Hello @Vikas_Raj,

@jonas has been very busy for the last two weeks, so let me ask what you want to do. I’m not sure what “during plotting I am unable to get a clear plot” means.
I see that you successfully created some plots, but I don’t understand what’s missing to complete the task. Could you try to explain again in different words?

@StephanH
Thank you for the reply.
What I am trying to do is overlay the sum of two extended pdfs, for B0 and B0bar, on their combined datasets, using:

# Plot B0 tag (index 0)
datasets[0].plotOn(frame, R.RooFit.Binning(nbins), R.RooFit.MarkerColor(R.kRed))
pdfs[0].plotOn(frame, R.RooFit.ProjWData(R.RooArgSet(*cond_vars), datasets[0]), R.RooFit.LineColor(R.kRed))

# Plot B0bar tag (index 1)
datasets[1].plotOn(frame, R.RooFit.Binning(nbins), R.RooFit.MarkerColor(R.kBlue))
pdfs[1].plotOn(frame, R.RooFit.ProjWData(R.RooArgSet(*cond_vars), datasets[1]), R.RooFit.LineColor(R.kBlue))

Now, if I plot only pdfs[0] and pdfs[1], I get the first plot.
If I plot only datasets[0] and datasets[1], I get the second plot.
But if I plot them together, I get the error (in the .zip file, along with the minimal example) and the third plot, even though the fit has converged and the covariance matrix is reported as fully accurate!

So I am guessing it is a plotting issue with RooFit.

Also, I tried plotting the signal-only and background-only pdfs on their respective datasets, and they are fine (not shown here).

I have tried
normalization using

datasets[0].plotOn(frame, R.RooFit.Binning(nbins), R.RooFit.MarkerColor(R.kRed), R.RooFit.Normalization(1, R.RooAbsReal.NumEvent))

or scaling the pdf using

pdfs[0].plotOn(frame,
    R.RooFit.ProjWData(R.RooArgSet(*cond_vars), datasets[0]),
    R.RooFit.Normalization(datasets[0].sumEntries(), R.RooAbsReal.NumEvent),
    R.RooFit.LineColor(R.kRed))

but to no avail.

OK, I see now what you mean. To normalise the PDFs to the data in the plots, RooFit needs to integrate them. It looks like in doing that, it reached an unphysical region where the PDF is not defined, and therefore results in not-a-number.
Could it be that choosing a different range for deltat helps? Or is one of the parameters in an invalid region? You can try to evaluate the PDF before plotting it, or you can try to print it using Print("V") or Print("t") to see what’s going on.

Another way is to reparametrise the PDFs if there is an unstable term that might send them to NaN or infinity.
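To illustrate why a single ill-defined point is enough to turn the normalisation integral into not-a-number, here is a minimal pure-Python sketch. `toy_shape` and `trapezoid` are hypothetical stand-ins invented for this example; they are not RooFit code, and RooFit's own integrators work differently in detail.

```python
import math

def toy_shape(t, slope):
    """Hypothetical unnormalised shape: exp(-|t|) * sqrt(1 + slope * t).

    The square root is undefined for 1 + slope * t < 0. Like a C++ sqrt,
    this returns NaN there instead of raising an exception.
    """
    arg = 1.0 + slope * t
    root = math.sqrt(arg) if arg >= 0.0 else float("nan")
    return math.exp(-abs(t)) * root

def trapezoid(func, lo, hi, n_steps=300):
    """Plain trapezoidal rule on n_steps equal intervals."""
    step = (hi - lo) / n_steps
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n_steps):
        total += func(lo + i * step)
    return total * step

# slope = 0.05 keeps 1 + slope * t >= 0.25 on [-15, 15]: well-defined everywhere.
good = trapezoid(lambda t: toy_shape(t, 0.05), -15.0, 15.0)
# slope = 0.1 makes the shape undefined for t < -10: some sample points are NaN.
bad = trapezoid(lambda t: toy_shape(t, 0.1), -15.0, 15.0)

print(f"integral with slope=0.05: {good:.4f}")
print(f"integral with slope=0.10: {bad}")
```

The second integral comes out as NaN even though the shape is perfectly fine on most of the range, because any sum that includes a NaN term is itself NaN.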

Could this be the reason for NaN?

print(f"deltat: [{reso_ws.var('deltat').getMin()}, {reso_ws.var('deltat').getMax()}], value = {reso_ws.var('deltat').getVal()}")

deltat: [-15.0, 15.0], value = 0.0
reso_ws.var('deltat').Print('v')

deltat_verbose.txt (31.5 KB)
This seems fine.

Hello,
deltat being 0 currently and having a range from -15 to 15 looks fine. The question is whether the PDF is well-defined on that entire range for the current values of the parameters. When it’s integrated (numerically), the integrator will test several points in that range. So if for some parameter values and some value of deltat the PDF is ill-defined, the integral cannot be performed.

Did you also try the tree printing? If I’m not mistaken, this also prints the current values.
See for example the RooFit tutorial rf206 (rf206_treevistools). When you print the expression as a tree, you can see the values of the sub-expressions. We need to check if one of them is ill-defined for the values that the integrator was evaluating.

Hello
Thank you for pointing it out. Indeed, some NaN values show up in the output:

    [0][1]0x561fd83f4740 RooRealIntegral::sc_pdf_bartag_Int[deltat,deltaterr] = -nan [ADIRTY] 
    [1][1]0x561fd83f4740 RooRealIntegral::sc_pdf_bartag_Int[deltat,deltaterr] = -nan [ADIRTY] 
    [1]0x561fd83f4740 RooRealIntegral::sc_pdf_bartag_Int[deltat,deltaterr] = -nan [ADIRTY] ...

The full output is attached:
pdf_full_td_bartag.txt (579.1 KB)
The btag case is similar.

The pdfs in question are:

# define signal conditional pdf
reso_ws.factory("""
    RooBDecay::sc_pdf_btag(
        deltat,
        B_tau[1.534],
        zero[0],
        coshcb,
        zero,
        coscb,
        sincb,
        bmix_dm[0.507],
        res_core,
        DoubleSided
    )

""")

reso_ws.factory("""
    RooBDecay::sc_pdf_bartag(
        deltat,
        B_tau,
        zero,
        coshcbbar,
        zero,
        coscbbar,
        sincbbar,
        bmix_dm,
        res_core,
        DoubleSided
    )

""")

Hello @Vikas_Raj,

OK, we are one step closer to the solution, but it’s not yet the solution:
In this snippet, only the integrals are nan, but we already knew that. The PDFs seem to be OK. But it looks like when passing one of the deltat values from the range [-15,15], the PDFs might be ill-defined. Could you check manually, by setting deltat to a few values and evaluating the PDFs?
This could show which sub-expression of those PDFs is ill-defined.
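The idea of checking sub-expressions one by one can be sketched in plain Python. Everything here is a hypothetical toy: the `components` dictionary and `find_bad_components` are invented for illustration, and the numbers 1.534 and 0.507 are only borrowed from the factory strings above for flavour. In the real workspace each entry would be one node of the PDF's expression tree.

```python
import math

def safe_sqrt(x):
    """Return NaN for negative input, as a C++ sqrt would, instead of raising."""
    return math.sqrt(x) if x >= 0.0 else float("nan")

# Hypothetical sub-expressions of a decay-like shape, each a function of t.
components = {
    "decay": lambda t: math.exp(-abs(t) / 1.534),
    "cos_term": lambda t: math.cos(0.507 * t),
    "bad_coeff": lambda t: safe_sqrt(-1.0),  # ill-defined regardless of t
}

def find_bad_components(components, t_values):
    """Return {name: [t values where that component is not finite]}."""
    report = {}
    for name, func in components.items():
        bad_points = [t for t in t_values if not math.isfinite(func(t))]
        if bad_points:
            report[name] = bad_points
    return report

t_values = [-15.0, -5.0, 0.0, 5.0, 15.0]
report = find_bad_components(components, t_values)
for name, bad_points in report.items():
    print(f"{name} is not finite at deltat = {bad_points}")
```

A component that is non-finite at every scanned value points to a bad parameter or coefficient rather than to the range of deltat; one that fails only at some values points to the range.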

Hello

I have tried what you suggested (I hope…)

import numpy as np
import ROOT as R

# reso_ws is the RooWorkspace used in the fit (loaded elsewhere)
deltat      = reso_ws.var("deltat")
deltaterr   = reso_ws.var("deltaterr")
r           = reso_ws.var("r")
mod_mbc     = reso_ws.var("mod_mbc")
csobdtprime = reso_ws.var("csobdtprime")
deltae      = reso_ws.var("deltae")

pdf = reso_ws.pdf("pdf_full_td_btag")

# Scan deltat over its full range and log every point where the pdf is not finite
with open("scan_log.txt", "w") as logfile:
    scan_vals = np.linspace(-15, 15, 300)
    for val in scan_vals:
        deltat.setVal(val)
        # Evaluate the pdf normalised over the listed observables
        y = pdf.getVal(R.RooArgSet(deltat, deltaterr, r, mod_mbc, csobdtprime, deltae))
        if not np.isfinite(y):
            logfile.write(f"WARNING: NaN or Inf at deltat = {val}\n")

and it seems to fail for all values of deltat!

WARNING: NaN or Inf at deltat = -15.0
WARNING: NaN or Inf at deltat = -14.899665551839465
...
WARNING: NaN or Inf at deltat = 14.899665551839465
WARNING: NaN or Inf at deltat = 15.0

scan_log.txt (14.9 KB)

That’s because getVal(setToNormaliseOver) still has to run the integral, so you will get the NaNs during the integration.
Try getVal(), because this doesn’t normalise (i.e. integrate).
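A toy sketch of why the normalised value is NaN at every scanned point even when the raw function is finite: the normalised value is the raw value divided by the integral, so one NaN integral poisons all of them. `raw_shape` and `normalised` are hypothetical helpers written for this illustration, not RooFit calls.

```python
import math

def raw_shape(t):
    """Hypothetical unnormalised shape, finite for every t."""
    return math.exp(-abs(t))

def normalised(t, integral):
    """Toy normalised value: raw value divided by the normalisation integral."""
    return raw_shape(t) / integral

nan_integral = float("nan")  # what a failed numerical integration yields
points = [-15.0, 0.0, 15.0]

raw_ok = [math.isfinite(raw_shape(t)) for t in points]
norm_ok = [math.isfinite(normalised(t, nan_integral)) for t in points]

print(raw_ok)   # [True, True, True]
print(norm_ok)  # [False, False, False]
```

So a scan of normalised values cannot locate the problematic point; only the unnormalised evaluation can.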

I am not sure whether you mean replacing

y = pdf.getVal(R.RooArgSet(deltat, deltaterr, r, mod_mbc, csobdtprime, deltae))

with only

y = pdf.getVal()

It still shows the same errors that I’ve shared in the log file.

Hmm, that’s interesting. I think this is the point where we need an example that can be run. Can you share a code snippet that declares the PDF that’s used in the fit?
We don’t need any data to test it.

Edit: I see that you attached it in a zip file. We will have a look at it next week, since @jonas is not available today.

Typing here to keep this space alive

I guess @jonas might have a look.

Hello, I have slightly adjusted my code for better reproducibility!
sample_2v.zip (530.4 KB)