Calculating interval with x% of events


Dear experts,

Given a distribution, I want to find the smallest interval containing 68% of events.
I start with creating a Cumulative distribution from the existing TH1.

inFile = 'newCatalog_30Sep2019/hadd_tree/signal_m_60.root'
chain = ROOT.TChain('h4gCandidateDumper/trees/SUSYGluGluToHToAA_AToGG_M_60_TuneCUETP8M1_13TeV_pythia8_13TeV_4photons')
chain.Add(inFile)

h = ROOT.TH1F("h",";Avg. Diphoton Mass[GeV];# of events",100, 0,100 )
chain.Draw('avg_dp_mass >> h')
h_cumulative = h.GetCumulative()

Each bin in h_cumulative is a sum of the number of entries in that bin and all the previous bins. I was thinking that by looping over all bins, I can find the difference between bin contents of consecutive bins until the difference is close to 0.68. But I think there is some fallacy in this logic. Could some one please tell me how I can proceed to solve this problem?

Thanks in advance,
Tanvi


Hi,
the problem of your method is that you you took the difference between two consecutive bins
If you want to have a precise estimation of 68% CL, you need a very large number of bins, but in this way this difference between two consecutive bins will approach to zero.

For what you need the thing is more complex, for each bin i of the cumulative histogram you have to compute the difference between the bin i and all the next bins, starting from the bin 0. In this way you will find, for each bin of the cumulative distribution, which is the index jassociate to iso that the the area between bin_ i and bin j is the 68%.
Then you should find the minimum between all the differences and this is what you want.

I hope you understand it.

Cheers,
Stefano

PS
Maybe if you want to boost the program, instead of a simple loop, is better to start the search where the steepness of the cumulative distribution is larger.

Hi Stefano,

Thanks for the suggestion.
Do you mean implementing something like this:

inFile = '/newCatalog_30Sep2019/hadd_tree/signal_m_60.root'
chain = ROOT.TChain('h4gCandidateDumper/trees/SUSYGluGluToHToAA_AToGG_M_60_TuneCUETP8M1_13TeV_pythia8_13TeV_4photons')
chain.Add(inFile)

h = ROOT.TH1F("h",";Avg. Diphoton Mass[GeV];# of events",100, 0,100 )
chain.Draw('avg_dp_mass >> h')

h_cumulative = h.GetCumulative()

n_bins = h_cumulative.GetNbinsX()
diff_values = []
for i in range(0,n_bins):
    for j in range(i,n_bins):
        diff =  h_cumulative.GetBinContent(i)-h_cumulative.GetBinContent(j)
        if (abs(diff) < 0.68):
            diff_values.append(abs(diff))

and then finding the min of diff_values and finding the corresponding x-values?

Yes the idea is this, everything seems more or less correct to me.
Pay attention to value of i close n_bins there every interval is less than 0.68, and the risk is to take an interval with a probability way smaller than 0.68 and a very small interval.

S

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.