Mean n-D Histogram per event - possible?

Hello,

I make a histogram for each event in my analysis - say a TH1D. For one event it gets maybe 1000 entries in various bins. I want the mean of this histogram across all events.

  1. Is this possible in ROOT? I can’t find anything clearly about this.
  2. If it is possible, is there documentation on how the errors are calculated (ie event to event variance in each bin)?
  3. Is the algorithm numerically stable? I will likely analyse 10^6 to 10^8 events.
  4. Is it possible to combine 2 (or more) mean histograms (say from 2 analysis jobs) to get the overall mean as if I’d analysed all the data at once?

I’ve got my own code to do this for TH1D, TH2D and TH3Ds per event where I have only 3 histograms at once (one for rolling mean, one for rolling variance and the current event histogram), so the memory usage is independent of the number of events. I would however like to get rid of this and make use of ROOT functionality if possible - most likely this would be better optimised than my own.

I had a look at TProfile, which seems close, but it seems to fill individual values of data - so it’s not clear how to denote where you’ve stopped filling from one event and start from the next. I generally can’t find any documentation about the algorithms for the uncertainties in the bins in ROOT and if they’re numerically stable.

Any help would be greatly appreciated,
Laurie

Is GetMean() what you are looking for ? may be not … I am sure @moneta can help you.

Hello, thank you but I believe this is just the mean along one axis of a histogram. I want an average TH1D across all events - so it has the same structure (binning and number of dimensions) as a TH1D histogram that I would fill for each event. In the result histogram each bin would be the mean of that bin across all events and the error based on the event-to-event variance.

Create a “global” histogram right in the beginning (make sure you call TH1::Sumw2 before filling). Fill it in a loop over all events. Afterwards, TH1::Scale it by 1/total_number_of_events.

This will give the mean value ok, but the errors will be wrong - proportional to 1/sqrt(N entries in that bin) not 1 / sqrt(N events). Also, the variance wouldn’t be per event but just the variance in that bin across all events, no? Without filling per-event, there’s no way I can see to get the event-to-event variance correctly.

As an example, if one bin has the value (count) 650, 632, 702 in events 0,1,2, I’d expect the mean histogram bin value to be 661.33 ± 17.13 (mean ± std / sqrt(3) ). If I did this in one histogram I’d get 1984 ± 1.0/sqrt(1984) -> scaled by 1/(N events) = 1/3 -> correct mean but wrong error. Assuming no weights for simplicity.

I have no idea what you’re trying to achieve.
Try:

{
  TH1D *h0 = new TH1D("h0", "h0", 1, -0.5, 0.5);
  h0->Sumw2(kTRUE);
  h0->Fill(0., 650);
  h0->Fill(0., 632);
  h0->Fill(0., 702);
  h0->Print("range");
  h0->Scale(1./3.);
  h0->Print("range");
  TH1D *h1 = new TH1D("h1", "h1", 1, -0.5, 0.5);
  h1->Sumw2(kTRUE);
  for (int i = 0; i < 650; i++) h1->Fill(0.);
  for (int i = 0; i < 632; i++) h1->Fill(0.);
  for (int i = 0; i < 702; i++) h1->Fill(0.);
  h1->Print("range");
  h1->Scale(1./3.);
  h1->Print("range");
}

Hi,

I guess what you are looking for is the TProfile class, which computes the averages and their variance (std) for each bin

See https://root.cern.ch/doc/master/classTProfile.html

Lorenzo

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.