DataHist does not match original TH1D

The RooDataHist filled from a TH1 has a different distribution from the original TH1.
Also, when looking at specific values using .GetBinContent() for the TH1 and .weight() for the RooDataHist, the output is different. The binning is the same.

What am I doing wrong? They are supposed to be the same, right?

Thanks,
Imanol

Hi,

Normally the RooDataHist normalises the histogram bin content by the bin width to get a density. This might explain the difference.

Lorenzo
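To illustrate what normalising by the bin width would do to bin contents, here is a minimal pure-Python sketch. It does not call ROOT; the function name `to_density`, the contents and the bin edges are hypothetical values chosen only for illustration.

```python
def to_density(contents, edges):
    """Divide each bin content by its bin width (hypothetical helper, not a ROOT API)."""
    if len(edges) != len(contents) + 1:
        raise ValueError("need exactly one more edge than bin contents")
    return [c / (hi - lo) for c, lo, hi in zip(contents, edges[:-1], edges[1:])]


# Three bins with equal contents but unequal widths (10, 20 and 50).
contents = [100.0, 100.0, 100.0]
edges = [0.0, 10.0, 30.0, 80.0]
print(to_density(contents, edges))  # [10.0, 5.0, 2.0]
```

With uniform bins this only rescales every bin by the same factor; with non-uniform bins the shape of the distribution changes, which is one way two histograms with identical contents can look different.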

Neither the TH1 nor the DataHist is normalised. The binning is the same.

OK, then can you please post the code showing the problem?

Lorenzo

It is basically something like this:

import ROOT

# Book a TH1D from the RDataFrame
pt_h1D = rdf.Histo1D(h1D_pt_model, "prompt_mcp_pt")

# Build the RooRealVar
pT = ROOT.RooRealVar("pT", "pT", 0, 8000)

# Build the RooDataHist (second Import argument is importDensity)
dh = ROOT.RooDataHist("dh", "dh", [pT], Import=(pt_h1D.GetPtr(), False))

# Plotting
frame = pT.frame()
dh.plotOn(frame)

# This gives different results
pt_h1D.Draw("SAME")
frame.Draw("SAME")

# Checking values: this gives different results
print("------------GET-VALUE-----------", dh.weight(5), pt_h1D.GetBinContent(5))

Hi,
What is pT_array[5]? Note that the bin indexing of RooDataHist starts from zero, while for ROOT histograms the first regular bin has index 1. In a ROOT histogram, the bin with index 0 is the underflow bin.

Lorenzo
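A bin-by-bin comparison that accounts for this offset could be sketched as follows. The helper name `compare_bins` is hypothetical; it only assumes that the first argument has `GetBinContent(i)` (as a TH1 does) and the second has `weight(i)` (the call used earlier in this thread).

```python
def compare_bins(th1, data_hist, n_bins):
    """Pair RooDataHist bin i with TH1 bin i + 1 (TH1 bin 0 is the underflow).

    Returns a list of (roo_index, th1_content, datahist_weight) tuples.
    """
    rows = []
    for roo_index in range(n_bins):
        th1_index = roo_index + 1  # skip the TH1 underflow bin
        rows.append(
            (roo_index, th1.GetBinContent(th1_index), data_hist.weight(roo_index))
        )
    return rows
```

With the objects from the snippet above, this would be called as `compare_bins(pt_h1D.GetPtr(), dh, pt_h1D.GetNbinsX())`, so that `dh.weight(5)` is compared against `GetBinContent(6)` rather than `GetBinContent(5)`.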


Sorry, I mean

pt_h1D.GetBinContent(5)

I see. But the weird thing is that when I plot the DataHist with plotOn, it shows different values than the .weight() method returns. Can you explain why? Also, I would like to know how to disable the normalization in the plotOn method.

From the weight method I get:

228489, 219803, 156365, 100668, 57666, 28198, 11791, 2536, 2433, 887, 373, 166

And you can see the plot here:
Hagedorm.pdf (14.5 KB)