TEfficiency or Divide(..."B") for SF files

adrobac · September 9, 2021, 2:22pm

Hi there,

I’m making efficiency scale factors (SF), where the SF is the ratio of the data efficiency to the simulated efficiency, and I need to calculate the overall stat uncertainty on these SFs. The workflow (in pseudocode) right now is

data_efficiency_TH2 = data_matches_TH2.Divide(data_total_TH2)
mc_efficiency_TH2 = mc_matches_TH2.Divide(mc_total_TH2)
SF_TH2 = data_efficiency_TH2.Divide(mc_efficiency_TH2)

My understanding, though, is that plain Divide() does not compute binomial errors, and that I should either use Divide(hist1, hist2, 1, 1, "B") or create TEfficiency objects and use those (though I am unfamiliar with TEfficiency and am not sure how to access the bin errors once the objects are created). It’s unclear to me from previous forum threads which approach is recommended. Any suggestions?
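To make the difference concrete, here is a minimal pure-Python sketch (no ROOT needed) for a single unweighted bin. It assumes the default Divide() behaves like first-order propagation with the numerator and denominator treated as independent counts with sqrt(N) errors, and that the "B" option reduces to the usual binomial formula for unweighted histograms. The function names and the counts are made up for illustration.

```python
import math

def divide_default_error(k, n):
    """Error on k/n when k and n are treated as independent counts
    with errors sqrt(k) and sqrt(n) (first-order propagation)."""
    return math.sqrt(k * n**2 + n * k**2) / n**2

def divide_binomial_error(k, n):
    """Binomial error on eff = k/n, where the k entries are a subset of the n."""
    eff = k / n
    return math.sqrt(eff * (1.0 - eff) / n)

# Hypothetical matched / total counts in one bin
k, n = 90, 100
print("independent-count error:", divide_default_error(k, n))
print("binomial error:         ", divide_binomial_error(k, n))
```

Note that the simple binomial formula gives exactly zero error when the efficiency is 0 or 1, which is rarely a sensible estimate for low-statistics bins.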

Thanks!

ROOT version: 6.22.00

jalopezg · September 9, 2021, 10:47pm

Hi @adrobac; I am sure @couet can help you with this.

Cheers,
J.

couet · September 10, 2021, 7:01am

I think @moneta will know better than me.

adrobac · September 16, 2021, 3:39pm

Thanks @jalopezg and @couet for your replies. I’m just replying as well to refresh this question in the topics list.

adrobac · September 23, 2021, 1:29pm

@moneta ? Sorry to bother you again, but I could really use some help with this.

moneta · September 23, 2021, 2:23pm

Hi,
For proper error estimation on the efficiency, you should use the TEfficiency class, passing it the passed and total histograms. The class computes the efficiency errors correctly and offers several statistical methods for doing so, including the one used by TH1::Divide(.."B").
See the reference documentation of TEfficiency; it should contain enough information to learn how to use the class.
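As an illustration of one such statistical method, below is a small pure-Python sketch of a Clopper-Pearson interval, which is the default option in TEfficiency (with a default confidence level of about 68.3%). This is not ROOT's implementation: it solves for the bounds by plain bisection on the binomial CDF, and all function names here are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

def bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def clopper_pearson(k, n, cl=0.683):
    """Central Clopper-Pearson interval for k passed out of n total."""
    alpha = 1.0 - cl
    # Lower bound: the p for which P(X >= k) equals alpha/2
    low = 0.0 if k == 0 else bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p) - alpha / 2)
    # Upper bound: the p for which P(X <= k) equals alpha/2
    up = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) - alpha / 2)
    return low, up

# Hypothetical bin: 9 passed out of 10
low, up = clopper_pearson(9, 10)
print("eff = 0.9, interval = [%.4f, %.4f]" % (low, up))
```

The interval is asymmetric around k/n, which is why TEfficiency reports separate lower and upper errors.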
Apologies for the late reply.

Lorenzo

adrobac · September 23, 2021, 3:37pm

Thank you for the reply @moneta! I think I understand what I should do now, but I just wanted to confirm that the default options are sufficient:

First, if I do
dataHist = TEfficiency(dataMatchHists,dataProbeHist).GetCopyTotalHisto(),
and likewise for MC, I get TH2s with properly calculated binomial errors.

Then, for the SF, which is data eff / MC eff, I believe that since the two errors are uncorrelated I can simply take the hists from above and do
SFHist = dataHist.Divide(MCHist),
which will propagate the data and MC hist errors in quadrature.
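For reference, the quadrature propagation described here can be sketched for a single bin as follows. This is a first-order approximation that assumes symmetric, uncorrelated errors on the two efficiencies; the function name and the numbers are purely illustrative.

```python
import math

def scale_factor(eff_data, err_data, eff_mc, err_mc):
    """SF = eff_data / eff_mc, with relative errors added in quadrature."""
    sf = eff_data / eff_mc
    rel_err = math.hypot(err_data / eff_data, err_mc / eff_mc)
    return sf, sf * rel_err

# Hypothetical per-bin efficiencies and their errors
sf, sf_err = scale_factor(0.90, 0.03, 0.95, 0.02)
print("SF = %.4f +/- %.4f" % (sf, sf_err))
```

With asymmetric efficiency errors, one rough option is to apply the same formula to the lower and upper errors separately, though that is only an approximation.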

Is this the correct way to go about things, to your understanding?

moneta · September 23, 2021, 4:36pm

Hi,

To get properly calculated binomial errors, you need to use the TEfficiency class directly.
You can draw it, or get the efficiency and its errors for each bin by calling
TEfficiency::GetEfficiency(bin), TEfficiency::GetEfficiencyErrorLow(bin), and TEfficiency::GetEfficiencyErrorUp(bin).
After drawing it (by calling TEfficiency::Draw()), you can get the corresponding graph with TEfficiency::GetPaintedGraph() (1D case) or the corresponding histogram with TEfficiency::GetPaintedHistogram() (2D case, as in yours).
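A minimal sketch of reading the per-bin values from a 2D TEfficiency in Python could look like the following. The helper name and its arguments are hypothetical; the only ROOT calls it relies on are the accessors named above plus TEfficiency::GetGlobalBin, which converts (x, y) bin indices into the global bin number those accessors expect.

```python
def read_efficiencies(eff, n_bins_x, n_bins_y):
    """Collect (ix, iy, efficiency, err_low, err_up) for every in-range bin.

    `eff` is expected to behave like a ROOT TEfficiency built from 2D histograms.
    """
    rows = []
    # ROOT bin indices start at 1; 0 and n+1 are underflow and overflow
    for ix in range(1, n_bins_x + 1):
        for iy in range(1, n_bins_y + 1):
            global_bin = eff.GetGlobalBin(ix, iy)
            rows.append((
                ix,
                iy,
                eff.GetEfficiency(global_bin),
                eff.GetEfficiencyErrorLow(global_bin),
                eff.GetEfficiencyErrorUp(global_bin),
            ))
    return rows
```

With PyROOT, one would presumably build the object as ROOT.TEfficiency(passed_hist, total_hist) and pass total_hist.GetNbinsX() and total_hist.GetNbinsY() as the bin counts, where passed_hist and total_hist stand in for your own histograms.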

For uncorrelated histograms you can call TH1::Divide, but if you want proper errors on a ratio of Poisson counts, you can use (for the 1D case only) TGraphAsymmErrors::Divide(h1, h2, "pois"); see the ROOT TGraphAsymmErrors class reference.
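For intuition about what a Poisson ratio interval involves, here is a pure-Python sketch of one standard technique: condition on the sum n1 + n2, so that n1 is binomial with p = mu1 / (mu1 + mu2), take a Clopper-Pearson interval on p, and map it to the ratio via r = p / (1 - p). This may well be the idea behind the "pois" option, but that is an assumption to check against the class reference. The function names and the counts are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p)**(n - i) for i in range(k + 1))

def bisect(f, lo=0.0, hi=1.0, iters=100):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    lo_positive = f(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_ratio_interval(n1, n2, cl=0.683):
    """Ratio n1/n2 of two independent Poisson counts (n2 > 0) with an interval."""
    n = n1 + n2
    alpha = 1.0 - cl
    # Clopper-Pearson bounds on p = mu1 / (mu1 + mu2), given n1 out of n
    p_low = 0.0 if n1 == 0 else bisect(lambda p: 1.0 - binom_cdf(n1 - 1, n, p) - alpha / 2)
    p_up = bisect(lambda p: binom_cdf(n1, n, p) - alpha / 2)
    # Map the bounds on p to bounds on the ratio r = p / (1 - p)
    return n1 / n2, p_low / (1.0 - p_low), p_up / (1.0 - p_up)

# Hypothetical counts in one bin of each histogram
ratio, low, up = poisson_ratio_interval(50, 100)
print("ratio = %.3f, interval = [%.3f, %.3f]" % (ratio, low, up))
```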

adrobac · September 24, 2021, 3:35pm

I think this is all that I need; thank you for the help!