RooHistPdf Normalization Different than Source TH1D/RooDataHist

GabrielM · May 8, 2021, 12:54am

Dear RooFit experts,

I’m fairly new to using RooFit, and I had a question about the normalization of a RooHistPdf object vs. a TH1 – sorry if this is obvious!

I’m trying to create a RooHistPdf object to use in a morphing procedure. I’ve been creating a normalized TH1D object, which I use to create a RooDataHist data set, from which I then get the pdf. I expected the RooHistPdf to match the RooDataHist data set exactly, and consequently the TH1D, since I’m explicitly defining the pdf from it. For whatever reason, though, the normalization of the two is way off. I’m not sure why this is happening or how to fix it, and I was wondering if someone with a better understanding of RooFit could explain. I’m attaching my PyROOT code here, as well as a plot of the mismatch between the TH1D and the RooHistPdf. My ROOT version is 6.22.

import ROOT

selection = "((Jet1_M>50 || Jet2_M>50) && Jet1_pT>500 && mVH>1300 && d2V<2.0)"

#Create histogram with mY = 1800 GeV, mX = 110 GeV

file_110 = ROOT.TFile(sample, "read")
tree_110 = file_110.Get("Nominal")               
hist_110 = ROOT.TH1D("hist_110","", 50, 0, 0)
tree_110.Project("hist_110", "mV", selection)
hist_110.Scale(1/hist_110.Integral())

mV = ROOT.RooRealVar("mV","mV", 0, 500, "GeV")

#Here "hd" means histogram data: this creates a data set
#from the histogram above

hd_110 = ROOT.RooDataHist("hd_110", "hd_110",
        ROOT.RooArgList(mV), hist_110)

#This makes the pdf from the RooDataHist

pdf_110 = ROOT.RooHistPdf("pdf_110","pdf_110",ROOT.RooArgSet(mV),
        hd_110, 0)

#Plots all of these
frame = mV.frame()
pdf_110.plotOn(frame)

c = ROOT.TCanvas("c","c", 600, 400)

hist_110.SetLineColor(ROOT.kBlack)
hist_110.Draw()
frame.Draw("SAME")

c.Draw()

Here’s the plot
Norm_question.pdf (14.2 KB)
with blue being the RooHistPdf while black being the TH1. I was expecting both of them to basically overlap. Is there a simple fix/something simple I’m missing? Thank you!

bellenot · May 8, 2021, 10:11am

Welcome to the ROOT Forum! It looks like one is normalized and not the other one. But I’m sure @moneta can give an explanation.

moneta · May 10, 2021, 9:56am

Hi,
Your code looks fine to me. I cannot reproduce this with some similar code.
Please post your full running code, including the input histogram or tree, so I can reproduce the issue.

Lorenzo

GabrielM · May 10, 2021, 6:57pm

Hi Lorenzo,

Thanks for the reply! The ROOT file that I’m making the histogram from is too large to add as an attachment, so I’ve put it in my CERNBox. You can access it with this link. The python script is there as well, but I’m also attaching it here:

forMoneta.py (1.3 KB)

Thanks once again for looking into this.


moneta · May 26, 2021, 4:31pm

Hi,
Sorry for my late reply. From your code I could see the problem. The reason for the difference is the following. When a pdf is plotted in RooFit, it is normalised taking into account the bin width of the frame it is drawn on. The frame you are using has 100 bins by default, and therefore uses a bin width of 5 GeV. The histogram instead has 50 bins (i.e. a bin width of 10 GeV). This explains the difference. The solution is to set the number of bins to 50, by doing:

mV.setBins(50)
frame = mV.frame()

Also, if you plot the RooDataHist before the RooHistPdf, the normalisation will be done correctly, because a pdf in RooFit is automatically normalised to the data plotted before it.
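
To make the factor of two concrete, here is a minimal arithmetic sketch in plain Python (no ROOT needed). It assumes, following the explanation above, that the curve drawn on an otherwise empty frame is the pdf density multiplied by the frame bin width, and that mV spans 0 to 500 GeV. The bin content 0.08 and the helper names are made up for illustration; they are not part of RooFit.

```python
# Arithmetic sketch only: it mimics the scaling described above, not RooFit itself.
X_MIN, X_MAX = 0.0, 500.0    # range of mV in GeV, as in the thread
HIST_BINS = 50               # binning of the TH1D
DEFAULT_FRAME_BINS = 100     # default frame binning mentioned above


def bin_width(n_bins):
    """Width in GeV of one bin when [X_MIN, X_MAX] is split into n_bins."""
    return (X_MAX - X_MIN) / n_bins


def curve_height(bin_content, n_frame_bins, n_events=1.0):
    """Height of the plotted pdf curve above a histogram bin.

    Assumption: curve = density * n_events * frame bin width, where the
    density is the unit-normalised bin content divided by the histogram
    bin width.
    """
    density = bin_content / bin_width(HIST_BINS)
    return density * n_events * bin_width(n_frame_bins)


content = 0.08  # hypothetical content of one bin of the normalised TH1D

h_default = curve_height(content, DEFAULT_FRAME_BINS)  # frame with 100 bins
h_matched = curve_height(content, HIST_BINS)           # after mV.setBins(50)

print("histogram bin content:     ", content)
print("curve height, 100-bin frame:", h_default)
print("curve height, 50-bin frame: ", h_matched)
```

Under these assumptions the curve on the default 100-bin frame sits at half the histogram bin content, and matching the frame binning to the histogram removes the mismatch.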

Best

Lorenzo