Different ways of normalizing histograms

aafshari · February 3, 2013, 1:37am

Hi everyone,

I’m a new user of both ROOT and RootTalk so please pardon my ignorance. I’ve been trying to normalize several histograms, and when I search in RootTalk, a multitude of ways comes up. How do I know which one is “right”? What is the difference between the following methods?

Here are the 4 methods I tried:

Method 1:

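// loop over the regular bins 1..N (bin 0 and bin N+1 are underflow / overflow)
for (Int_t i = 1; i <= h->GetNbinsX(); i++)
  {
     Double_t num = h->GetBinContent(i);
     Double_t den = h->GetBinWidth(i);
     if (den != 0)
       {
          // divide the bin content by the bin width
          h->SetBinContent(i, num/den);
       }
  }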

Method 2:

Double_t factor = 1.;
h->Scale(factor/h->GetEntries());

Method 3:

Double_t scale = h->GetXaxis()->GetBinWidth(1)/(h->Integral());
h->Scale(scale);

Method 4:

Double_t factor = 1.;
h->Scale(factor/h->Integral());

Wile_E_Coyote · February 3, 2013, 8:22am

You can also have:

Method 5:

Double_t factor = 1.;
h->Scale(factor, "width");

Method 6:

Double_t factor = 1.;
h->Scale(factor/h->Integral(), "width");

Method 7:

Double_t factor = 1.;
h->Scale(factor/h->Integral("width"));

See: TH1::Scale (Double_t c1 = 1, Option_t* option = "") and TH1::Integral (Option_t* option = "")
Note that TH1::SetBinContent changes the content of the given bin only (and increments the number of entries of the histogram), so you should then also call TH1::SetBinError explicitly.
Also, remember that a histogram may have a “fixed or variable bin size” (in the latter case, “Method 6” and “Method 7”, for example, will give different results).
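
To make the difference between “Method 6” and “Method 7” concrete, here is a minimal ROOT-free sketch of the arithmetic. The `Hist` struct and the helper names are hypothetical stand-ins (not ROOT classes); they only assume that `Scale(c, "width")` multiplies each bin by `c` and divides it by its bin width, and that `Integral("width")` sums content times bin width.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 1D histogram (not a ROOT class):
// edges holds one more entry than contents.
struct Hist {
    std::vector<double> edges;
    std::vector<double> contents;
    double width(std::size_t i) const { return edges[i + 1] - edges[i]; }
};

// Plays the role of h->Integral(): plain sum of the bin contents.
double integral(const Hist& h) {
    double sum = 0.0;
    for (double c : h.contents) sum += c;
    return sum;
}

// Plays the role of h->Integral("width"): sum of content * bin width.
double integralWidth(const Hist& h) {
    double sum = 0.0;
    for (std::size_t i = 0; i < h.contents.size(); ++i) sum += h.contents[i] * h.width(i);
    return sum;
}

// Plays the role of h->Scale(c) or, with divideByWidth, h->Scale(c, "width").
Hist scaled(Hist h, double c, bool divideByWidth) {
    assert(h.edges.size() == h.contents.size() + 1);
    for (std::size_t i = 0; i < h.contents.size(); ++i) {
        h.contents[i] *= c;
        if (divideByWidth) h.contents[i] /= h.width(i);
    }
    return h;
}

Hist method6(const Hist& h) { return scaled(h, 1.0 / integral(h), true); }
Hist method7(const Hist& h) { return scaled(h, 1.0 / integralWidth(h), false); }
```

With edges {0, 1, 3} and contents {10, 10}, “Method 6” gives 0.5 and 0.25, while “Method 7” gives 1/3 in both bins. Both results have a width-weighted integral of 1, but only the first divides each bin by its own width. With equal bin widths the two methods agree.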

Try:

root [0] TH1F *h = new TH1F("h","a trial histogram", 100, -1.5, 1.5);
root [1] h->Sumw2();
root [2] for (Int_t i = 0; i < 10000; i++) h->Fill(gRandom->Gaus(0, 1));
root [3] h->Draw();
root [4] h->GetEntries()
root [5] h->Integral()
root [6] h->Integral("width")

Then try to:

TH1F *h1 = (TH1F*)(h->Clone("h1"));
// ... one "Clone" per "Method"
TH1F *h7 = (TH1F*)(h->Clone("h7"));

and then apply each of your “Methods” (1 … 7) to these clones (“h1” … “h7”) and compare the results.
Note: “h” is a “fixed bin size” histogram, so some “Methods” that use the “bin width” may give the same results, while for a “variable bin size” histogram you would get different results.

Of all these methods, the two that combine TH1::Scale and TH1::Integral are usually the most relevant:

normalize by the “integral” to show the relative frequency (probability) in each bin (“Method 4”),
normalize by the “integral * bin width” to show the estimated probability density function (“Method 6”).
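
Written out, the two normalizations above look as follows. The symbols are my own notation, not from ROOT: n_i is the content of bin i, w_i is its width, and the sums run over the regular bins only.

```latex
\begin{align}
  p_i &= \frac{n_i}{\sum_j n_j},       & \sum_i p_i      &= 1 && \text{(“Method 4”)} \\
  f_i &= \frac{n_i}{w_i \sum_j n_j},   & \sum_i f_i\,w_i &= 1 && \text{(“Method 6”)}
\end{align}
```

In the first case the bin contents themselves add up to 1. In the second case it is the area under the histogram (content times width) that equals 1.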

There is also the question of underflow / overflow bins.
When normalizing by the “integral”, we could also include underflow / overflow bins in the “integral”, while we cannot do it in the second case, because we don’t know the underflow / overflow “bin width”.
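
As a rough sketch of the underflow / overflow point, the snippet below borrows ROOT's bin numbering (index 0 is underflow, 1..N are the regular bins, N+1 is overflow) but uses a plain `std::vector`. The function names are hypothetical. In ROOT itself the corresponding sums would be `h->Integral()` and `h->Integral(0, h->GetNbinsX() + 1)`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// bins[0] = underflow, bins[1..n] = regular bins, bins[n+1] = overflow.
double sumBins(const std::vector<double>& bins, std::size_t first, std::size_t last) {
    assert(first <= last && last < bins.size());
    double sum = 0.0;
    for (std::size_t i = first; i <= last; ++i) sum += bins[i];
    return sum;
}

// Divide every bin by the sum of the regular bins only.
std::vector<double> normalizeInRange(std::vector<double> bins) {
    assert(bins.size() >= 3);
    const double norm = sumBins(bins, 1, bins.size() - 2);
    for (double& b : bins) b /= norm;
    return bins;
}

// Divide every bin by the sum of all bins, underflow and overflow included.
std::vector<double> normalizeAll(std::vector<double> bins) {
    assert(bins.size() >= 3);
    const double norm = sumBins(bins, 0, bins.size() - 1);
    for (double& b : bins) b /= norm;
    return bins;
}
```

For bins {5, 10, 30, 5} (two regular bins), the in-range version gives 0.25 and 0.75 for the regular bins, while the version that includes underflow / overflow gives 0.2 and 0.6, i.e. fractions of all filled entries.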

BTW. In order to make sure that the errors are properly handled, first (i.e. before calling TH1::Scale) execute:

if (h->GetSumw2N() == 0) h->Sumw2(kTRUE);
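
Why this matters can be sketched with a single bin. The `Bin` struct below is a hypothetical illustration, not ROOT code; it only assumes that a histogram with stored errors keeps the sum of squared weights per bin and that scaling by c multiplies that sum by c*c.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical single bin: content plus the stored sum of squared weights.
struct Bin {
    double content;
    double sumw2;
};

// Error taken from the stored sum of squared weights.
double storedError(const Bin& b) {
    assert(b.sumw2 >= 0.0);
    return std::sqrt(b.sumw2);
}

// Scaling by c multiplies the content by c and the squared error by c*c.
Bin scaled(Bin b, double c) {
    b.content *= c;
    b.sumw2 *= c * c;
    return b;
}

// What one would get by (wrongly) treating the scaled content as a raw count.
double naiveError(const Bin& b) {
    assert(b.content >= 0.0);
    return std::sqrt(b.content);
}
```

A bin with 100 unweighted entries has an error of 10 (10%). After scaling by 0.01 the stored error becomes 0.1, so the relative error is still 10%, whereas the square root of the scaled content would give 1.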

aafshari · February 5, 2013, 6:26am

Thank you, I’ll try these methods and see the differences.

HIMANSHU_SHARMA · July 19, 2017, 5:54am

Should we not redraw the histogram after normalizing?

p73 · December 4, 2017, 8:27am

After

myHist->Scale(/*any number here even 1*/);

the draw style changes. Error bars appear instead of simple lines. How can I avoid that?

ksmith · December 4, 2017, 3:14pm

myHist->Draw("HIST");

You can check out the 1D histogram draw options here: https://root.cern.ch/doc/master/classTHistPainter.html#HP01a

p73 · December 6, 2017, 4:01pm

Thank you. It worked.

wajahat · August 7, 2020, 10:39am

Thanks a lot. You rock…

Gauri_Devi · January 8, 2022, 10:02am

How does the Scale function work when drawing a line instead of a histogram in ROOT version 6.10? In ROOT version 5.34 it worked. I use the following commands in my code:
h11->Scale(1.2); here 1.2 is the scaling factor.
h11->Draw("same:l"); this does not work in version 6. If I use h11->Draw("histo"); instead, it works in version 6, while in version 5 both worked.
Please give a suggestion.

Wile_E_Coyote · January 8, 2022, 11:15am

mendoza · August 10, 2023, 3:23pm

Hi, I came here because I am having trouble with the normalisation of 3 histograms. I am using the following method:

h_Signal->Scale(1./h_Signal->Integral());
h_Lambda->Scale(1./h_Lambda->Integral());
h_Ks0-> Scale(1./h_Ks0-> Integral());

but then when printing the histograms I see that the areas under each are not the same. You can find the plot attached here. Has anyone run into the same issue before?

Thanks in advance.

Cheers,

Diego

yus · August 10, 2023, 3:43pm

Hi,

something else is probably going on here, it shouldn’t be like this. Do you mind sharing your entire code (including the files with the input histograms) so that we could have a look at this particular case?

mendoza · August 10, 2023, 3:49pm

Sure, the code and the root file I use to read the histograms are attached. Apologies for the code format, I work in SWAN so I use the notebooks instead of running macros.

Thanks!

work_with_hists.txt (2.8 KB)
PrCheckerPlots_newconds.root (1.3 MB)

Wile_E_Coyote · August 10, 2023, 3:57pm

hs->Draw("NOSTACK");
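
A possible reading of this answer, assuming `hs` is a THStack (the attached code is not shown in the thread): when histograms are drawn stacked, each one is drawn on top of the sum of the previous ones, so unit-area histograms no longer look like they have equal areas. The ROOT-free sketch below, with hypothetical names, shows that arithmetic.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Bins = std::vector<double>;

// Area of a histogram with unit bin width: plain sum of the bin heights.
double area(const Bins& bins) {
    double sum = 0.0;
    for (double b : bins) sum += b;
    return sum;
}

// Heights seen when histograms are stacked:
// each histogram sits on the running sum of those before it.
std::vector<Bins> stackedHeights(const std::vector<Bins>& hists) {
    std::vector<Bins> out;
    Bins running;
    for (const Bins& h : hists) {
        if (running.empty()) running.assign(h.size(), 0.0);
        assert(h.size() == running.size());
        for (std::size_t i = 0; i < h.size(); ++i) running[i] += h[i];
        out.push_back(running);
    }
    return out;
}
```

For two histograms that each have area 1, the stacked outlines have areas 1 and 2, while drawing them overlaid (which is what the "NOSTACK" option does for a THStack) shows two outlines of area 1.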