ROOT calculates histogram statistics in different ways depending on the context even though the function calls are the same.
As noted in this JIRA task, immediately after a histogram is filled, its statistics are those of the dataset, not of the binned histogram. If one zooms in on an axis, however, the same calls return statistics computed from the binned values. There are a few problems with this behavior:
- It is not documented. (This was noted on the JIRA task, which was nonetheless marked as resolved…)
- It is inconsistent. Zoomed histograms get histogram stats (even if the entire axis is in range), while unzoomed ones get dataset stats.
- It is wrong. The mean of the histogram is the binned mean, not the mean of the dataset, no matter when it was filled.
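
To make the distinction concrete, here is a minimal pure-Python sketch (no ROOT involved) of the two kinds of statistics. The helper names `dataset_stats` and `binned_stats` are made up for illustration, and the binned version assumes the usual convention of weighting each bin center by its content and ignoring underflow/overflow; it is not a copy of ROOT's implementation.

```python
import math
import random


def dataset_stats(values):
    """Mean and (population) standard deviation of the raw values."""
    n = len(values)
    mean = sum(values) / n
    var = sum(v * v for v in values) / n - mean * mean
    return mean, math.sqrt(max(var, 0.0))


def binned_stats(values, nbins, lo, hi):
    """Mean and standard deviation from bin centers weighted by bin contents."""
    width = (hi - lo) / nbins
    counts = [0] * nbins
    for v in values:
        if lo <= v < hi:  # values outside [lo, hi) are ignored
            idx = min(int((v - lo) / width), nbins - 1)
            counts[idx] += 1
    n = sum(counts)
    if n == 0:
        raise ValueError("no values inside the histogram range")
    centers = [lo + (i + 0.5) * width for i in range(nbins)]
    mean = sum(c * x for c, x in zip(counts, centers)) / n
    var = sum(c * x * x for c, x in zip(counts, centers)) / n - mean * mean
    return mean, math.sqrt(max(var, 0.0))


rng = random.Random(1)
values = [rng.gauss(20, 2) for _ in range(1000)]

print(dataset_stats(values))              # close to (20, 2)
print(binned_stats(values, 1, 0, 100))    # one bin: everything sits at the center, 50
print(binned_stats(values, 100, 0, 100))  # finer binning approaches the dataset stats
```

With a single bin the binned mean is the bin center and the spread is zero, which is the extreme case used in the reproducer at the end of this post.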
Regarding the third point, one gets the correct behavior if one calls `TH1::ResetStats` after filling, but a user would only know about this if they specifically looked up `ResetStats`: it is not mentioned in the documentation for `TH1::GetStdDev`, nor for any other function call, nor is it mentioned in the Users Guide chapter on histograms.
One can recover the dataset statistics by calling `TAxis::UnZoom`, but this only works if the histogram has been drawn; otherwise, the `UnZoom` call does nothing. This is also unintuitive, since one would naively expect histogram calculations to be independent of whether the histogram is shown on a canvas. This behavior is not documented, and understanding it requires examining the source code. (Without a canvas, one can call `TAxis::SetRange(0, 0)` instead.)
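
For reference, the two workarounds mentioned above can be sketched in PyROOT as follows. This is only an illustration under the assumptions described in this post: the function name `compare_workarounds` and the dictionary keys are made up, and the comments describe the behavior I observed in 6.20/04, which may differ in other versions. ROOT is imported inside the function so that merely defining it does not require a ROOT installation.

```python
def compare_workarounds():
    """Return (mean, std dev) at each step of the workaround; requires PyROOT."""
    import ROOT as r  # imported lazily; calling this function needs ROOT

    h = r.TH1I("h_workaround", "h", 1, 0, 100)  # single bin, center at 50
    for _ in range(1000):
        h.Fill(r.gRandom.Gaus(20, 2))

    def stats():
        return (h.GetMean(), h.GetStdDev())

    results = {"after_fill": stats()}        # dataset statistics

    h.GetXaxis().SetRangeUser(0, 1)          # zooming switches to binned statistics
    results["zoomed"] = stats()

    h.GetXaxis().SetRange(0, 0)              # resets the axis range, no canvas needed
    results["after_SetRange_0_0"] = stats()  # dataset statistics again

    h.ResetStats()                           # recompute stored sums from bin contents
    results["after_ResetStats"] = stats()    # binned statistics from here on
    return results
```

Calling `compare_workarounds()` in a session with ROOT available returns the four pairs, so the effect of `SetRange(0, 0)` and `ResetStats` can be compared side by side. Note that `ResetStats` discards the dataset statistics, so it should come last if both are wanted.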
Changing the default behavior would obviously break backward compatibility, so I suggest two changes:
- Better documentation of this behavior, including at the top of the `TH1` class reference, in the Users Guide, and in all functions that get histogram stats.
- `TAxis::UnZoom` should work even when there is no canvas, or at least print a warning when it does nothing, and the documentation should be updated accordingly.
I’m happy to submit a PR with these changes if desired. Reproducer below.
```python
import ROOT as r

h = r.TH1I('h', 'h', 1, 0, 100)  # histogram with just 1 bin
for i in range(1000):
    h.Fill(r.gRandom.Gaus(20, 2))

# stats of dataset:
print(h.GetMean(), h.GetStdDev())  # (20.053602822081633, 2.075704758987478)

# stats of histogram:
h.GetXaxis().SetRangeUser(0, 1)
print(h.GetMean(), h.GetStdDev())  # (50.0, 0.0)

# still stats of histogram despite being all the way zoomed out:
h.GetXaxis().SetRangeUser(0, 100)
print(h.GetMean(), h.GetStdDev())  # (50.0, 0.0)

# UnZoom does nothing:
h.GetXaxis().UnZoom()
print(h.GetMean(), h.GetStdDev())  # (50.0, 0.0)

# unless h is drawn:
r.gROOT.SetBatch()
c = r.TCanvas()
h.Draw()
h.GetXaxis().UnZoom()
print(h.GetMean(), h.GetStdDev())  # (20.053602822081633, 2.075704758987478)
```
ROOT Version: 6.20/04
Platform: Not Provided
Compiler: Not Provided