
Different ways of normalizing histograms

Hi everyone,

I’m a new user of both ROOT and RootTalk so please pardon my ignorance. I’ve been trying to normalize several histograms, and when I search in RootTalk, a multitude of ways comes up. How do I know which one is “right”? What is the difference between the following methods?

Here are the 4 methods I tried:

Method 1:

for (Int_t i = 1; i <= h->GetNbinsX(); i++)   // regular bins only
  {
     Double_t num = h->GetBinContent(i);
     Double_t den = h->GetBinWidth(i);
     if (den != 0)
       {
          Double_t value = num/den;
          h->SetBinContent(i, value);
       }
  }

Method 2:

Double_t norm = h->GetEntries();
h->Scale(1/norm);

Method 3:

Double_t scale = h->GetXaxis()->GetBinWidth(1)/(h->Integral());
h->Scale(scale);

Method 4:

Double_t norm = 1;
Double_t scale = norm/(h->Integral());
h->Scale(scale);

You can also have:

Method 5:

Double_t norm = 1;
h->Scale(norm, "width");

Method 6:

Double_t norm = 1;
h->Scale(norm/h->Integral(), "width");

Method 7:

Double_t norm = 1;
h->Scale(norm/h->Integral("width"));

See: TH1::Scale (Double_t c1 = 1, Option_t* option = "") and TH1::Integral (Option_t* option = "")
Note that TH1::SetBinContent changes the content of the given bin only (and it increments the histogram's number of entries), so you should then explicitly call TH1::SetBinError as well.
Also, remember that a histogram may have fixed or variable bin sizes (in the latter case, “Method 6” and “Method 7”, for example, will give different results).
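
To see why “Method 6” and “Method 7” part ways once the bin sizes vary, here is a minimal plain C++ sketch that does not use ROOT at all. The names Hist, integral, integralWidth, method6 and method7 are hypothetical helpers; they only mimic, as far as I understand it, the arithmetic of Integral(), Integral("width") and Scale(c, "width") on the regular bins.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram: contents[i] is the content of
// bin i and widths[i] its width (underflow / overflow bins are left out).
struct Hist {
    std::vector<double> contents;
    std::vector<double> widths;
};

// Mimics h->Integral(): plain sum of the bin contents.
double integral(const Hist& h) {
    double sum = 0.0;
    for (double c : h.contents) sum += c;
    return sum;
}

// Mimics h->Integral("width"): sum of content * bin width.
double integralWidth(const Hist& h) {
    assert(h.contents.size() == h.widths.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < h.contents.size(); ++i)
        sum += h.contents[i] * h.widths[i];
    return sum;
}

// "Method 6": scale by norm / Integral(), then divide each bin by its width.
Hist method6(Hist h, double norm = 1.0) {
    assert(h.contents.size() == h.widths.size());
    const double total = integral(h);
    assert(total != 0.0);
    for (std::size_t i = 0; i < h.contents.size(); ++i)
        h.contents[i] *= norm / total / h.widths[i];
    return h;
}

// "Method 7": scale every bin by norm / Integral("width").
Hist method7(Hist h, double norm = 1.0) {
    const double total = integralWidth(h);
    assert(total != 0.0);
    for (double& c : h.contents) c *= norm / total;
    return h;
}
```

With contents {10, 20, 10} and widths {0.5, 0.5, 0.5} both helpers give {0.5, 1, 0.5}. With widths {1, 2, 1} method6 gives a flat 0.25 in every bin, while method7 gives roughly {0.167, 0.333, 0.167}. Both results have a width-weighted integral of 1, yet the shapes differ.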

Try:

root [0] TH1F *h = new TH1F("h","a trial histogram", 100, -1.5, 1.5)
root [1] h->Sumw2()
root [2] for (Int_t i = 0; i < 10000; i++) h->Fill(gRandom->Gaus(0, 1))
root [3] h->Draw()
root [4] h->GetEntries()
root [5] h->Integral()
root [6] h->Integral("width")

Then try to:

TH1F *h1 = (TH1F*)(h->Clone("h1"));
// ... one "Clone" per "Method"
TH1F *h7 = (TH1F*)(h->Clone("h7"));

and then apply each of your “Methods” (1 … 7) to these clones (“h1” … “h7”) and compare the results.
Note: “h” is a “fixed bin size” histogram, so some “Methods” that use the bin width may give the same results, while for a “variable bin size” histogram you would get different results.

Of all these methods, the two that use TH1::Scale and TH1::Integral are the most relevant:

  • normalize by the “integral” to show the relative frequency (probability) in each bin,
  • normalize by the “integral * bin width” to show the estimated probability density function.
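
Written out as formulas, the two normalizations look roughly like this. The notation is mine, not ROOT's: c_i is the content of regular bin i, and a fixed bin width w is assumed.

```latex
\[
  p_i = \frac{c_i}{\sum_j c_j}, \qquad \sum_i p_i = 1
\]
\[
  f_i = \frac{c_i}{w \sum_j c_j}, \qquad \sum_i f_i \, w = 1
\]
```

The first form is what “Method 4” computes. For a fixed bin width, the second is what “Method 6” and “Method 7” both reduce to.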

There is also the question of underflow / overflow bins.
When normalizing by the “integral”, we could also include underflow / overflow bins in the “integral”, while we cannot do it in the second case, because we don’t know the underflow / overflow “bin width”.
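
As a rough illustration of the underflow / overflow point, the plain C++ sketch below stores the two flow bins next to the regular ones. It uses no ROOT, and FlowHist, inRangeSum, fullSum and normalized are hypothetical names. It follows ROOT's numbering, where bin 0 is the underflow and bin N+1 is the overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical histogram with ROOT-like bin numbering:
// bins[0] = underflow, bins[1..n] = regular bins, bins[n+1] = overflow.
struct FlowHist {
    std::vector<double> bins;
};

// Sum of the regular bins only (bins 1..n).
double inRangeSum(const FlowHist& h) {
    assert(h.bins.size() >= 2);
    double sum = 0.0;
    for (std::size_t i = 1; i + 1 < h.bins.size(); ++i) sum += h.bins[i];
    return sum;
}

// Sum over every bin, flow bins included (bins 0..n+1).
double fullSum(const FlowHist& h) {
    double sum = 0.0;
    for (double c : h.bins) sum += c;
    return sum;
}

// Divide every bin by the chosen total, so that the bins entering
// that total add up to 1 afterwards.
FlowHist normalized(FlowHist h, bool includeFlow) {
    const double total = includeFlow ? fullSum(h) : inRangeSum(h);
    assert(total != 0.0);
    for (double& c : h.bins) c /= total;
    return h;
}
```

For bins {5, 20, 50, 20, 5} the in-range sum is 90 and the full sum is 100, so the central bin becomes 0.5 when the flow bins are included and about 0.556 when they are not. In ROOT itself, h->Integral() should correspond to the in-range sum and h->Integral(0, h->GetNbinsX() + 1) to the full one.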

BTW. In order to make sure that the errors are properly handled, first (i.e. before calling TH1::Scale) execute:

if (h->GetSumw2N() == 0) h->Sumw2(kTRUE);
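
Why this matters can be sketched in plain C++. No ROOT is used here, and Bin, binError and scaleBin are hypothetical names. The sketch assumes that each bin keeps its sum of squared weights, that the bin error is the square root of that sum, and that scaling by c multiplies the content by c and the sum of squared weights by c*c.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical per-bin bookkeeping when the sum of squared weights is stored.
struct Bin {
    double content; // sum of weights
    double sumw2;   // sum of squared weights
};

// Bin error taken as sqrt(sum of squared weights).
double binError(const Bin& b) {
    assert(b.sumw2 >= 0.0);
    return std::sqrt(b.sumw2);
}

// Scaling by c: content scales with c, sum of squared weights with c*c,
// so the error scales with |c| and the relative error is unchanged.
Bin scaleBin(Bin b, double c) {
    b.content *= c;
    b.sumw2 *= c * c;
    return b;
}
```

A bin with 100 unweighted entries has content 100 and error 10. After scaling by 0.01 it has content 1 and error 0.1, so the relative error is still 10%. An error recomputed as the square root of the scaled content would instead come out as 1.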

Thank you, I’ll try these methods and see the differences.

Should we not redraw the histogram after normalizing?

After

myHist->Scale(/*any number here even 1*/);

the draw style changes: error bars appear instead of simple lines. How can I avoid that?

myHist->Draw("HIST");

You can check out the 1D histogram draw options here: https://root.cern.ch/doc/master/classTHistPainter.html#HP01a

Thank you. It worked.