I am playing around with the TMVAMulticlass analysis mode (disclaimer: I have no experience with it), training a BDTG to separate four classes of events : e, mu, pi, K.
If I check the classifier response and ROC curve for a specific “binary projection”, say “e VS pi”, I am getting substantially worse performance than using the same BDTG trained w/ the normal binary Classification mode.
In both cases, the e and pi input samples contain the same number of events, and the training/testing splitting is done w/ same proportions (80% train, 20% test).
Naively, I would have expected to get the same results. Why is that not the case? See the attached image.
We are happy to help with any questions you might have!
I think a key realisation in reasoning about multiclass output is that the axis-aligned 1D projections simplify the original space too much. The output of the classifier (classifier-space) is, in your case, 3-dimensional: four class responses, minus one degree of freedom assuming the probabilities sum to 1. In the binary classification case the output space is 1D.
When constructing the ROC curve one has to iterate over all possible (connected) partitions of the classifier-space and record the achievable efficiencies. For the 1D case this means you only need to consider point cuts, while in the 2D case you must iterate over all 1D curves cutting the space. For 3D one needs to consider 2D surfaces. A simple, but effective, approximation is to consider D simultaneous point cuts, one for each axis.
Currently TMVA does not support this out-of-the-box (but might in the future) so you would have to do this optimisation in your own code. I could provide some guidance here should you need it.
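As a starting point, here is a minimal sketch of the "D simultaneous point cuts" approximation for the "e VS pi" case, written with plain NumPy rather than any TMVA interface. The function names `scan_cuts` and `roc_envelope` are hypothetical, and the Dirichlet-distributed toy responses only stand in for the real multiclass BDTG output, which you would need to read from your own test tree. It scans a grid of cuts on the e response (keep events above the cut) and the pi response (keep events below the cut), then keeps the best background rejection reachable at each signal efficiency.

```python
import itertools
import numpy as np


def scan_cuts(sig, bkg, directions, n_steps=21):
    """Scan one point cut per axis and record the resulting working points.

    sig, bkg   : arrays of shape (n_events, D) with responses in [0, 1]
    directions : +1 keeps events with response >= cut, -1 keeps response <= cut
    Returns an array of (signal efficiency, background rejection) pairs,
    one per combination of cuts on the grid.
    """
    sig = np.asarray(sig, dtype=float)
    bkg = np.asarray(bkg, dtype=float)
    directions = np.asarray(directions, dtype=float)
    grid = np.linspace(0.0, 1.0, n_steps)
    points = []
    for cuts in itertools.product(grid, repeat=sig.shape[1]):
        cuts = np.asarray(cuts)
        # An event passes only if it satisfies the cut on every axis.
        pass_sig = np.all(directions * (sig - cuts) >= 0.0, axis=1)
        pass_bkg = np.all(directions * (bkg - cuts) >= 0.0, axis=1)
        points.append((pass_sig.mean(), 1.0 - pass_bkg.mean()))
    return np.array(points)


def roc_envelope(points):
    """Keep only non-dominated working points, sorted by ascending efficiency."""
    # Sort by efficiency descending, ties broken by rejection descending.
    order = np.lexsort((-points[:, 1], -points[:, 0]))
    best_rejection = -1.0
    front = []
    for eff, rej in points[order]:
        if rej > best_rejection:
            front.append((eff, rej))
            best_rejection = rej
    return np.array(front[::-1])


# Toy stand-in for the multiclass output (ASSUMPTION, not TMVA data).
# Columns: e, mu, pi, K responses, each row summing to 1.
rng = np.random.default_rng(0)
e_events = rng.dirichlet([4, 1, 1, 1], size=2000)
pi_events = rng.dirichlet([1, 1, 4, 1], size=2000)

# "e VS pi": use the e and pi response columns only.
sig = e_events[:, [0, 2]]
bkg = pi_events[:, [0, 2]]

# Axis-aligned 1D projection: cut on the e response alone.
roc_1d = roc_envelope(scan_cuts(sig[:, :1], bkg[:, :1], [+1]))
# Two simultaneous cuts: e response >= cut AND pi response <= cut.
roc_2d = roc_envelope(scan_cuts(sig, bkg, [+1, -1]))
```

Because the grid includes the loosest cut on each axis, every working point of the 1D projection is also reachable in the 2D scan, so `roc_2d` can only match or improve on `roc_1d`. The cost grows as `n_steps ** D`, so for more axes a coarser grid or a smarter search would be needed.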