ROOT calculates histogram statistics in different ways depending on the context even though the function calls are the same.
As noted in this JIRA task, after a histogram is first filled, its statistics are those of the dataset, not of the binned histogram. Once one zooms in on an axis, however, the same calls return the statistics of the binned values. There are a few problems with this behavior:
- It is not documented. (This was noted on the JIRA task, which was nonetheless marked as resolved…)
- It is inconsistent. Zoomed histograms get histogram stats (even if the entire axis is in range), while unzoomed ones get dataset stats.
- It is wrong. The mean of the histogram is the binned mean, not the mean of the dataset, no matter when it was filled.
Regarding the third point, one gets the correct behavior by calling TH1::ResetStats after filling, but a user would only know about this if they specifically looked up ResetStats: it is not mentioned in the documentation for TH1::GetMean, TH1::GetStdDev, or any other function, nor in the Users Guide chapter on histograms.
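To make the difference between the two kinds of statistics concrete without needing ROOT, here is a minimal pure-Python sketch. The helper `binned_mean` is a hypothetical name of mine, and it is only my understanding of what a "binned mean" is (bin centers weighted by bin contents), not ROOT's actual implementation. With a single bin on [0, 100) the binned mean collapses to the bin center, 50, whatever the data look like; with finer bins it approaches the dataset mean.

```python
import random


def binned_mean(values, nbins, lo, hi):
    """Mean from bin centers weighted by bin contents (hypothetical helper)."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        if lo <= v < hi:
            # Clamp to guard against floating-point rounding at the upper edge.
            idx = min(int((v - lo) / width), nbins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        return 0.0  # assumption: an empty histogram reports a mean of 0
    centers = [lo + (i + 0.5) * width for i in range(nbins)]
    return sum(c * n for c, n in zip(centers, counts)) / total


random.seed(1)
data = [random.gauss(20, 2) for _ in range(1000)]

dataset_mean = sum(data) / len(data)          # mean of the raw values
one_bin_mean = binned_mean(data, 1, 0.0, 100.0)    # single bin: the bin center
fine_bin_mean = binned_mean(data, 100, 0.0, 100.0)  # bins of width 1

print(dataset_mean, one_bin_mean, fine_bin_mean)
```

The one-bin case mirrors the reproducer at the end of this post, where the mean jumps from about 20 to 50 as soon as an axis range is set.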
One can recover the dataset statistics by calling TAxis::UnZoom, but this only works if the histogram has been drawn; otherwise, the UnZoom call does nothing. This behavior is also unintuitive, since one would naively expect histogram calculations to be independent of whether the histogram has been drawn on a canvas. This behavior is not documented, and understanding it requires examining the source code. (Without a canvas, one can call TAxis::SetRange(0, 0) instead.)
Changing the default behavior would obviously break backward compatibility, so I suggest two changes:
- Better documentation of this behavior, including at the top of the TH1 class reference, in the Users Guide, and for all functions that return histogram stats.
- TAxis::UnZoom should work even when there is no canvas, or at least print a warning when it does nothing, and the documentation should be updated accordingly…
I’m happy to submit a PR with these changes if desired. Reproducer below.
import ROOT as r
h = r.TH1I('h', 'h', 1, 0, 100) # histogram with just 1 bin
for i in range(1000):
    h.Fill(r.gRandom.Gaus(20, 2))
# stats of dataset:
print(h.GetMean(), h.GetStdDev()) # (20.053602822081633, 2.075704758987478)
# stats of histogram:
h.GetXaxis().SetRangeUser(0, 1)
print(h.GetMean(), h.GetStdDev()) # (50.0, 0.0)
# still stats of histogram despite being all the way zoomed out:
h.GetXaxis().SetRangeUser(0, 100)
print(h.GetMean(), h.GetStdDev()) # (50.0, 0.0)
# UnZoom does nothing:
h.GetXaxis().UnZoom()
print(h.GetMean(), h.GetStdDev()) # (50.0, 0.0)
# unless h is drawn:
r.gROOT.SetBatch()
c = r.TCanvas()
h.Draw()
h.GetXaxis().UnZoom()
print(h.GetMean(), h.GetStdDev()) # (20.053602822081633, 2.075704758987478)
ROOT Version: 6.20/04
Platform: Not Provided
Compiler: Not Provided