Performance of binary projection of Multiclass classifier (BDTG)

Hi TMVA experts,

I am playing around with the TMVAMulticlass analysis mode (disclaimer: I have no experience with it), training a BDTG to separate four classes of events: e, mu, pi, K.

If I check the classifier response and ROC curve for a specific “binary projection”, say “e VS pi”, I am getting substantially worse performance than using the same BDTG trained w/ the normal binary Classification mode.
In both cases, the e and pi input samples contain the same number of events, and the training/testing split is done w/ the same proportions (80% train, 20% test).

Naively I would have expected to get the same results… why is this not so? See the attached image.

Thanks for your help.

Marco

Hi,

We are happy to help with any questions you might have!

I think a key realisation in reasoning about multiclass output is that the axis-aligned 1D projections simplify the original space too much. The output of the classifier (classifier-space) is, in your case, 3-dimensional: four class scores, minus one degree of freedom assuming the probabilities sum to 1 :wink:. In the binary classification case the output space is 1D.

When constructing the ROC curve one has to iterate over all possible (connected) partitions of the classifier-space and record the achievable efficiencies. For the 1D case this means you only need to consider point cuts, while in the 2D case you must iterate over all 1D curves cutting the space. For 3D one needs to consider 2D surfaces. A simple, but effective, approximation is to consider D simultaneous point cuts, one for each axis.
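
To make the "D simultaneous point cuts" approximation a bit more concrete, here is a minimal pure-Python sketch. It is not TMVA code: the function names `efficiency` and `multi_cut_roc`, the cut directions (score at or above the cut on the signal axis, at or below the cut on every other axis) and the threshold grid are all assumptions made for illustration. Each event is taken to be a tuple of D class scores that you have already read out of the trained classifier.

```python
from itertools import product


def efficiency(scores, cuts, signal_axis):
    """Fraction of events passing all cuts.

    Assumed convention: score >= cut on the signal axis,
    score <= cut on every other axis.
    """
    def passes(event):
        return all(
            (event[i] >= c) if i == signal_axis else (event[i] <= c)
            for i, c in enumerate(cuts)
        )

    return sum(passes(event) for event in scores) / len(scores)


def multi_cut_roc(sig, bkg, signal_axis, grid):
    """Scan one threshold per axis and keep the best working points.

    Returns (signal_efficiency, background_rejection) pairs sorted by
    increasing signal efficiency.
    """
    n_axes = len(sig[0])
    points = set()
    # Brute-force scan: len(grid) ** n_axes cut combinations.
    for cuts in product(grid, repeat=n_axes):
        sig_eff = efficiency(sig, cuts, signal_axis)
        bkg_rej = 1.0 - efficiency(bkg, cuts, signal_axis)
        points.add((sig_eff, bkg_rej))

    # Envelope: walk from high to low signal efficiency and keep a point
    # only if it rejects more background than everything kept so far.
    envelope = []
    best_rej = -1.0
    for eff, rej in sorted(points, key=lambda p: (-p[0], -p[1])):
        if rej > best_rej:
            envelope.append((eff, rej))
            best_rej = rej
    return envelope[::-1]
```

If the grid contains 0.0 and 1.0, the single-axis projection is just the special case where all cuts except the one on the signal axis are left fully open, so the envelope returned here can never be worse than that projection. The scan grows as `len(grid) ** D`, so for real samples you would likely want a coarser grid or a smarter optimiser.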

Currently TMVA does not support this out-of-the-box (but might in the future) so you would have to do this optimisation in your own code. I could provide some guidance here should you need it.

Cheers,
Kim