Any idea to allow TEfficiency to accept bins where pass > total?

yumengcao4888 · July 17, 2021, 4:21pm

Dear ROOTers,

Recently I’ve run into an issue with TEfficiency. Our team would like to accept the case where some bins of the “pass” histogram have a larger content than the corresponding bins of the “total” histogram. (See TGraphAsymmErrors::Divide() here: ROOT: hist/hist/src/TGraphAsymmErrors.cxx Source File) There is a check called from this function that forbids this case. (ROOT: hist/hist/src/TEfficiency.cxx Source File)

Does anyone know how to turn off this check?

Thanks,

ROOT Version: 6.22/08
Platform: lxplus
Compiler: gcc I believe

jalopezg · July 18, 2021, 10:35pm

Hi @yumengcao4888,

As you mentioned in your post, that check is hardcoded. Therefore, the only thing I can suggest is building ROOT from source, e.g. for the latest stable version GitHub - root-project/root at latest-stable.
Instructions on how to build from source are available here: Building ROOT from source - ROOT.

Be sure to remove the lines corresponding to the check before building.

Cheers,
J.

yumengcao4888 · July 19, 2021, 5:05pm

Thanks @jalopezg. Actually, I’m running this with athena release 21.9.16,Athena on lxplus, so I guess rebuilding ROOT from source would not work in that setup…

eguiraud · July 19, 2021, 8:33pm

Hi @yumengcao4888 ,
and welcome to the ROOT forum.

Maybe you can manually set the overflowing bins in the “passed” histo to the max bin content in the “total” histo before computing the efficiencies? @moneta might also be able to suggest a workaround.
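For illustration, here is a minimal sketch of that capping step. It works on plain vectors of bin contents rather than on ROOT histograms, and the function name `clampPassedToTotal` is made up for this example. With real histograms the same loop would presumably go through `TH1::GetBinContent` and `TH1::SetBinContent`.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of ROOT): cap every "passed" bin at the
// content of the corresponding "total" bin, so that passed <= total holds
// bin by bin. Returns the number of bins that had to be modified.
std::size_t clampPassedToTotal(std::vector<double>& passed,
                               const std::vector<double>& total)
{
    if (passed.size() != total.size()) {
        throw std::invalid_argument("passed and total must have the same binning");
    }
    std::size_t nClamped = 0;
    for (std::size_t i = 0; i < passed.size(); ++i) {
        if (passed[i] > total[i]) {
            passed[i] = total[i];  // overflowing bin: efficiency becomes exactly 1
            ++nClamped;
        }
    }
    return nClamped;
}
```

Returning the number of modified bins makes it easy to log how often the cap was applied. Note that capping discards information, so whether it is statistically acceptable depends on the analysis.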

yumengcao4888 · July 19, 2021, 8:47pm

Thanks @eguiraud,

Nice thought! That is exactly what I did earlier. But my team leader doesn’t seem to like this idea that much.

eguiraud · July 20, 2021, 7:40am

I think at this point it’s really a question of why you have more “passed” than “total” (to a first approximation, that does not make sense) and of what makes sense in terms of statistical analysis in your case. TEfficiency has that check because passed > total typically means there is a bug somewhere.

yumengcao4888 · July 20, 2021, 9:44pm

No, there’s no bug. In our case, we have 3 types of information: truth level, offline reconstructed level, and runtime reconstructed level. We loop over the truth level to see whether there’s a match at the offline and runtime reconstructed levels, respectively.

In these 2 cases (offline/truth and runtime/truth), all “passed” are no larger than total. But we are considering a 3rd efficiency (runtime/offline). In this case, most “passed” are no larger than total, but there might be some bins where the info gets reconstructed at runtime but not offline.

eguiraud · July 21, 2021, 7:44am

Sorry I did not mean to imply that your code has a bug, just that that’s the assumption behind TEfficiency.

moneta · July 21, 2021, 1:33pm

Hi,

TEfficiency should be used for cases where the statistics are binomial. Your case is probably more complicated: when making a ratio like offline/truth, you might have fake (noisy) contributions in the online case that do not originate from truth.
The runtime/offline case is even more complicated and probably requires dedicated modelling of the runtime/offline counts.
Note that the case where the two contributions (numerator and denominator) can be considered uncorrelated, the so-called Poisson ratio, is covered in ROOT by the function TGraphAsymmErrors::Divide with the option pois.

Lorenzo
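To illustrate why a Poisson ratio is not bounded by 1, here is a rough sketch for a single bin. It treats numerator and denominator as independent Poisson counts and uses simple first-order error propagation. This is an assumption made for the example and is not the interval that ROOT computes with the pois option. The names `RatioEstimate` and `poissonRatio` are hypothetical.

```cpp
#include <cmath>
#include <limits>

// Hypothetical result type for this sketch: central value and symmetric error.
struct RatioEstimate {
    double value;
    double error;
};

// Ratio of two independent Poisson counts, num / den.
// The error uses first-order (Gaussian) propagation:
//   (sigma_r / r)^2 = 1/num + 1/den
// This is only a crude approximation for low counts and is NOT what
// TGraphAsymmErrors::Divide does internally.
RatioEstimate poissonRatio(double num, double den)
{
    if (den <= 0.0) {
        const double nan = std::numeric_limits<double>::quiet_NaN();
        return {nan, nan};  // ratio undefined for an empty denominator
    }
    const double r = num / den;
    // For num == 0 the propagated error collapses to 0, which understates
    // the real uncertainty; a proper interval is needed in that regime.
    const double err = (num > 0.0) ? r * std::sqrt(1.0 / num + 1.0 / den) : 0.0;
    return {r, err};
}
```

For example, `poissonRatio(120, 100)` gives a central value of 1.2, which a binomial efficiency could never produce. In ROOT itself the corresponding call would be `TGraphAsymmErrors::Divide(pass, total, "pois")` on the two histograms.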

yumengcao4888 · July 22, 2021, 2:21pm

Hi @moneta
I see. Thank you, Lorenzo! I’ll give it a try.

yumengcao4888 · July 22, 2021, 2:25pm

I see. Thanks, @eguiraud.
