Different ways of normalizing histograms

Hi everyone,

I’m a new user of both ROOT and RootTalk so please pardon my ignorance. I’ve been trying to normalize several histograms, and when I search in RootTalk, a multitude of ways comes up. How do I know which one is “right”? What is the difference between the following methods?

Here are the 4 methods I tried:

Method 1:

for (Int_t i = 1; i <= h->GetNbinsX(); i++)
  {
     Double_t num = h->GetBinContent(i);
     Double_t den = h->GetBinWidth(i);
     Double_t value = 0;
     if (den!=0)
       {
          value = num/den;
          h->SetBinContent(i,value);
       }
  }

Method 2:

Double_t factor = 1.;
h->Scale(factor/h->GetEntries());

Method 3:

Double_t scale = h->GetXaxis()->GetBinWidth(1)/(h->Integral());
h->Scale(scale);

Method 4:

Double_t factor = 1.;
h->Scale(factor/h->Integral());

You can also have:

Method 5:

Double_t factor = 1.;
h->Scale(factor, "width");

Method 6:

Double_t factor = 1.;
h->Scale(factor/h->Integral(), "width");

Method 7:

Double_t factor = 1.;
h->Scale(factor/h->Integral("width"));

See: TH1::Scale(Double_t c1 = 1, Option_t* option = "") and TH1::Integral(Option_t* option = "")
Note that TH1::SetBinContent changes the content of the given bin only (and it increments the number of entries of the histogram), so you should then explicitly call TH1::SetBinError as well.
Also, remember that a histogram can have a fixed or a variable bin size (and in the latter case “Method 6” and “Method 7”, for example, will give different results).
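
To see why “Method 6” and “Method 7” differ for a variable bin size, here is a minimal sketch in plain C++ (no ROOT needed). The Hist struct and the helpers integral, scaled, method6 and method7 are hypothetical stand-ins that only mirror the arithmetic of TH1::Integral and TH1::Scale with and without the "width" option as their documentation describes it; they are not ROOT's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Plain C++ stand-in for a 1D histogram: n bin contents and n+1 bin edges.
// This is NOT ROOT code; it only mirrors the arithmetic.
struct Hist {
    std::vector<double> edges;    // size n+1, strictly increasing
    std::vector<double> content;  // size n
    double width(std::size_t i) const { return edges[i + 1] - edges[i]; }
};

// Mirrors h->Integral() (sum of contents) or h->Integral("width")
// (sum of content * bin width) over the regular bins.
double integral(const Hist& h, bool useWidth = false) {
    assert(h.edges.size() == h.content.size() + 1);
    double sum = 0.0;
    for (std::size_t i = 0; i < h.content.size(); ++i)
        sum += h.content[i] * (useWidth ? h.width(i) : 1.0);
    return sum;
}

// Mirrors h->Scale(c) or h->Scale(c, "width"): multiply each bin by c and,
// with the width option, also divide by that bin's width.
Hist scaled(Hist h, double c, bool divideByWidth = false) {
    assert(h.edges.size() == h.content.size() + 1);
    for (std::size_t i = 0; i < h.content.size(); ++i)
        h.content[i] *= c / (divideByWidth ? h.width(i) : 1.0);
    return h;
}

// "Method 6": h->Scale(1./h->Integral(), "width")
Hist method6(const Hist& h) { return scaled(h, 1.0 / integral(h), true); }

// "Method 7": h->Scale(1./h->Integral("width"))
Hist method7(const Hist& h) { return scaled(h, 1.0 / integral(h, true)); }
```

For bin edges {0, 1, 3} and contents {1, 3}, method6 gives {0.25, 0.375} while method7 gives {1/7, 3/7}. With equal bin widths the two results agree.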

Try:

root [0] TH1F *h = new TH1F("h","a trial histogram", 100, -1.5, 1.5);
root [1] h->Sumw2();
root [2] for (Int_t i = 0; i < 10000; i++) h->Fill(gRandom->Gaus(0, 1));
root [3] h->Draw();
root [4] h->GetEntries()
root [5] h->Integral()
root [6] h->Integral("width")

Then try to:

TH1F *h1 = (TH1F*)(h->Clone("h1"));
// ... one "Clone" per "Method"
TH1F *h7 = (TH1F*)(h->Clone("h7"));

and then apply each of your “Methods” (1 … 7) to these clones (“h1” … “h7”) and compare the results.
Note: “h” is a “fixed bin size” histogram, so some “Methods” that use the bin width may give the same results, while for a “variable bin size” histogram you would get different results.

Of all these different methods, two that use TH1::Scale and TH1::Integral are usually the most relevant:

  • normalize by the “integral” to show the probability (relative frequency) in each bin (“Method 4”),
  • normalize by the “integral * bin width” to show the estimated probability density function (“Method 6”).
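
Written out as formulas, the two normalizations in the list above look as follows. The symbols are my own notation, not ROOT's: n_i is the content of bin i, w_i its width, and N the sum of the contents of the regular bins.

```latex
% Assumed notation: n_i = content of bin i, w_i = width of bin i,
% N = \sum_j n_j over the regular bins (no underflow / overflow).
\begin{align}
  p_i &= \frac{n_i}{N},      & \sum_i p_i      &= 1 && \text{(Method 4)} \\
  f_i &= \frac{n_i}{N\,w_i}, & \sum_i f_i\,w_i &= 1 && \text{(Method 6)}
\end{align}
```

For example, a bin with n_i = 20 out of N = 100 and w_i = 0.5 gets p_i = 0.2 and f_i = 0.4.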

There is also the question of underflow / overflow bins.
When normalizing by the “integral”, we could also include underflow / overflow bins in the “integral”, while we cannot do it in the second case, because we don’t know the underflow / overflow “bin width”.
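
As a rough illustration of the underflow / overflow choice, here is a plain C++ sketch (no ROOT). The Counts struct and the functions total and probabilities are hypothetical names of mine. In ROOT itself the underflow and overflow are stored in bins 0 and GetNbinsX()+1, so I would expect h->Integral(0, h->GetNbinsX() + 1) to play the role of the "include" case.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Plain C++ stand-in (not ROOT): counts for the regular bins plus the
// two out-of-range counters.
struct Counts {
    double underflow;
    std::vector<double> bins;
    double overflow;
};

// Sum of the regular bins only, or of everything that was filled.
double total(const Counts& c, bool includeFlows) {
    double sum = std::accumulate(c.bins.begin(), c.bins.end(), 0.0);
    return includeFlows ? sum + c.underflow + c.overflow : sum;
}

// Per-bin probabilities for the regular bins ("Method 4" style),
// normalized with or without the underflow / overflow counts.
std::vector<double> probabilities(const Counts& c, bool includeFlows) {
    const double norm = total(c, includeFlows);
    assert(norm > 0.0);
    std::vector<double> p;
    for (double n : c.bins) p.push_back(n / norm);
    return p;
}
```

With 2 underflow entries, regular bins {3, 5} and 6 overflow entries, the regular bins sum to 1 when the flows are excluded and to 0.5 when they are included, i.e. the visible part of the plot then shows only the fraction of entries that landed in range.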

BTW. In order to make sure that the errors are properly handled, first (i.e. before calling TH1::Scale) execute:

if (h->GetSumw2N() == 0) h->Sumw2(kTRUE);
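
Why this matters can be sketched in plain C++ (no ROOT). The Bin struct and the functions scaled, trackedError and naiveError are hypothetical names; the sketch assumes that the per-bin sum of squared weights is what Sumw2 makes the histogram store, and that without it the error would be taken as sqrt(content).

```cpp
#include <cassert>
#include <cmath>

// Plain C++ stand-in for one histogram bin (not ROOT code).
struct Bin {
    double content;  // sum of weights
    double sumw2;    // sum of squared weights (assumed to be what Sumw2 tracks)
};

// Scaling by c multiplies the content by c and the variance by c^2.
Bin scaled(Bin b, double c) {
    b.content *= c;
    b.sumw2 *= c * c;
    return b;
}

// Error from the tracked sum of squared weights.
double trackedError(const Bin& b) {
    assert(b.sumw2 >= 0.0);
    return std::sqrt(b.sumw2);
}

// Error if only the content were known: sqrt(content),
// which is valid for raw, unweighted counts only.
double naiveError(const Bin& b) {
    assert(b.content >= 0.0);
    return std::sqrt(b.content);
}
```

A bin with 100 unweighted entries scaled by 0.25 has content 25 and a tracked error of 2.5 (the original error of 10 times 0.25), whereas sqrt(25) would give 5.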

Thank you, I’ll try these methods and see the differences.

Should we not redraw the histogram after normalizing?

After

myHist->Scale(/*any number here even 1*/);

the draw style changes: error bars appear instead of simple lines. How can I avoid that?

myHist->Draw("HIST");

You can check out the 1D histogram draw options here: https://root.cern.ch/doc/master/classTHistPainter.html#HP01a


Thank you. It worked.

Thanks a lot. You rock… :grinning:

How does the Scale function work when drawing the histogram as a line instead of as a histogram in ROOT 6.10? In ROOT 5.34 it worked. I use the following commands in my code:
h11->Scale(1.2); // 1.2 is the scaling factor
h11->Draw("same:l"); // does not work in version 6
If I use h11->Draw("histo"); instead, it works in version 6, while in version 5 both work.
Please give suggestions.

Hi, I come here because I am having troubles with normalisation of 3 histograms. I am using the following method:

h_Signal->Scale(1./h_Signal->Integral());
h_Lambda->Scale(1./h_Lambda->Integral());
h_Ks0->Scale(1./h_Ks0->Integral());

but when I then plot the histograms, I see that the areas under them are not the same. You can find the plot attached. Has anyone run into this before?

Thanks in advance.

Cheers,

Diego

Hi,

something else is probably going on here, it shouldn’t be like this. Do you mind sharing your entire code (including the files with the input histograms) so that we could have a look at this particular case?

Sure, the code and the root file I use to read the histograms are attached. Apologies for the code format, I work in SWAN so I use the notebooks instead of running macros.

Thanks!

work_with_hists.txt (2.8 KB)
PrCheckerPlots_newconds.root (1.3 MB)

hs->Draw("NOSTACK");