Mean of histograms in loop PyROOT

Hi, sorry for the simple question, but I haven’t been able to figure out the solution to this issue yet.
I have some data from ROOT files that I have split up and grouped into several different TChains using dictionaries in Python. I want to find the mean of a variable in each of those TChains, for example by doing:

his = ROOT.TH1F()           

for i in range(1,4):
    chain_dict["chain_%s"% (i)].Draw("variable>>his") 
    mean_value = his.GetMean()

where each object in chain_dict is a TChain of several chained ROOT files, and variable is the variable I want to get the mean of (with the name changed on here for brevity).

However, instead of the mean I expect, 0 is printed. This also happens if one of the chains is examined outside the loop, even though the Draw call works fine and produces a canvas with the correct plot and histogram.

A way I tried to fix this was by changing the code as below

for i in range(1,4):
    chain_dict["chain_%s"% (i)].Draw("variable>>his")
    his = ROOT.gDirectory.Get("his")
    mean_value = his.GetMean()
    del his

However, in this case, instead of printing 0s, the first mean printed is correct, the second is too small, and all subsequent means are printed as 0. The correct mean is printed for every chain if the process is done individually outside the loop. The same behaviour shows up for any value taken from the histogram, for example his.Integral().

I assume there is something simple I have missed that is causing this issue, but since I can’t work out what it is, any help would be much appreciated.

Hi @RJN,
hard to say what’s wrong, you could e.g. draw the histograms to check what values they were filled with. Also I’m not sure it’s ok to Draw a histogram with the same name like that.

Maybe I can suggest a simpler way to get the mean of the values of your variable, however. With RDataFrame it should be just:

for i in range(1,4):
    mean_value = ROOT.RDataFrame(chain_dict["chain_%s" % i]).Mean("variable").GetValue()
or simple variations thereof.

Hi Enrico,

Thanks for the response! I fixed the issue yesterday evening but forgot to update this topic. What seemed to be happening was that once the histogram had been retrieved the first time with his = ROOT.gDirectory.Get("his"), it became fixed in some way, and neither del his nor ROOT.his.Reset() seemed to get rid of it. As a result, on the following iterations of the loop the histogram wasn’t filled correctly.
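One plausible reading of this behaviour, which is an assumption and not something confirmed in this thread: when a histogram named "his" already exists, Draw refills it but keeps the axis range chosen for the first chain, and entries that land in underflow or overflow are left out of the mean by default. The pure-Python sketch below only models that effect with made-up numbers; it does not use ROOT, and the helper name and axis range are hypothetical.

```python
def in_range_mean(values, low, high):
    """Mean of the values inside [low, high); 0.0 if none fall inside.

    This mimics a histogram whose statistics ignore underflow and overflow.
    """
    kept = [v for v in values if low <= v < high]
    return sum(kept) / len(kept) if kept else 0.0


# Made-up data standing in for "variable" in three chains.
chains = {
    "chain_1": [1.0, 2.0, 3.0],
    "chain_2": [2.0, 4.0, 6.0],
    "chain_3": [10.0, 12.0],
}

# Hypothetical axis range, assumed to be frozen after the first chain is drawn.
low, high = 0.0, 3.5

means = {name: in_range_mean(vals, low, high) for name, vals in chains.items()}
print(means)
```

With these numbers the first mean is right (2.0), the second comes out as 2.0 although the true mean is 4.0, and the third is 0.0, which matches the pattern described above.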

I fixed it by creating a list of strings of the form "variable>>his1" etc., called hist_string_list, and an empty dictionary called hist_dict, and then using the code below:

for i in range(1,4):
    chain_dict["chain_%s"% (i)].Draw(hist_string_list[i-1])
    hist_dict["his%s"% (i)] = ROOT.gDirectory.Get("his"+str(i))
    mean_value = hist_dict["his%s"% (i)].GetMean()

This may not be the optimised way of doing it (in fact I’d be amazed if it was) and perhaps using RDataFrame would be better, but for the time being this method works for me so I’ll keep using it.
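For reference, the same approach can be wrapped in a small helper. This is only a sketch under a few assumptions: chain_means is a hypothetical name, the histogram names his1, his2, ... are assumed not to exist yet in the current directory, and the "goff" option of Draw is used so that no canvas is produced.

```python
def chain_means(chain_dict, variable, keys):
    """Return {key: mean of `variable`} using one uniquely named histogram per chain."""
    # Imported inside the function so the helper can be defined before ROOT is loaded.
    import ROOT

    means = {}
    for n, key in enumerate(keys, start=1):
        hist_name = "his%d" % n
        # Fill a histogram called hist_name; "goff" skips the graphics output.
        chain_dict[key].Draw("%s>>%s" % (variable, hist_name), "", "goff")
        # Fetch the histogram that Draw filled from the current directory.
        hist = ROOT.gDirectory.Get(hist_name)
        means[key] = hist.GetMean()
    return means
```

It could then be called as chain_means(chain_dict, "variable", ["chain_%s" % i for i in range(1, 4)]).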
