I’m a little confused about how specificity and sensitivity are computed in TMVA::ROCCurve.
I was under the assumption that they would be determined by looping over a set of operating points and finding sEff and bRej at each point. However, these functions (Sensitivity, Specificity) appear to do something different.
When I implemented my own ROC curve by calculating sEff and bRej for my BDT at 1000 evenly spaced operating points between -1 and 1, I got a very different result.
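For reference, the threshold-scanning approach described above can be sketched in plain C++ roughly as follows. This is a self-contained illustration, not the actual macro: the function names (evaluate, scanROC) and the toy scores are made up, and it simply tags an event as signal when its classifier score is at or above the cut.

```cpp
#include <cstddef>
#include <vector>

// Signal efficiency (sensitivity) and background rejection (specificity)
// at one operating point.
struct OperatingPoint { double sEff, bRej; };

// Evaluate one cut: an event is tagged "signal" if its score >= cut.
OperatingPoint evaluate(const std::vector<double>& signalScores,
                        const std::vector<double>& backgroundScores,
                        double cut) {
    std::size_t sigPass = 0, bkgFail = 0;
    for (double s : signalScores)     if (s >= cut) ++sigPass;
    for (double b : backgroundScores) if (b <  cut) ++bkgFail;
    return { static_cast<double>(sigPass) / signalScores.size(),
             static_cast<double>(bkgFail) / backgroundScores.size() };
}

// Scan evenly spaced cuts between -1 and 1 (the BDT output range)
// and collect one (sEff, bRej) pair per operating point.
std::vector<OperatingPoint> scanROC(const std::vector<double>& sig,
                                    const std::vector<double>& bkg,
                                    int nPoints = 1000) {
    std::vector<OperatingPoint> curve;
    curve.reserve(nPoints);
    for (int i = 0; i < nPoints; ++i) {
        double cut = -1.0 + 2.0 * i / (nPoints - 1);
        curve.push_back(evaluate(sig, bkg, cut));
    }
    return curve;
}
```

With the loosest cut (-1) everything passes, so sEff is 1 and bRej is 0; tightening the cut trades signal efficiency for background rejection.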
Maybe @moneta can give some hints
The two functions are internal and compute the sensitivity (signal efficiency) and specificity (background rejection) at every signal and background output point.
The obtained values are then plotted to make the ROC curve as a TGraph with x = sensitivity values and y = specificity values. See TMVA::GetROCCurve.
You should get very similar results to those from uniform binning in the two variables. If not, and you did not find any error in your code, please post the macro producing the above plot and I can verify it.
I see the issue.
This is correct, and looking at the code of TMVA::GetROCCurve confirms that. However, the Doxygen documentation for that function says the opposite:
x = specificity
y = sensitivity
That is what I was going off of. Sure enough, my ROC-making function lines up perfectly with TMVA::ROCCurve when I flip the axes to match what they should be.
Thanks for the help! I’ve submitted a pull request on the ROOT GitHub to fix the documentation.
Thank you for submitting the PR and noting this mistake. I have now merged your changes.