Understanding the normalization setting for and behind the BDT

kialbert · February 1, 2019, 12:19pm

Hi,

As far as I understand, the normalisation done internally for BDTs (SkipNormalization=False) should be very similar to the dataloader normalisation NormMode=EqualNumEvents when SigToBkgFraction=1.0. Both renormalise the training data so that the classes are of equal importance, which forces the classifier to make an “interesting” decision rather than just outputting the class with the largest global probability.
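To make the idea concrete, here is a minimal plain-Python sketch of this kind of class-balancing renormalisation. It is not the TMVA source: the function name `equalise_class_weights` and the convention that the total sum of weights equals the number of events are assumptions made for illustration only.

```python
def equalise_class_weights(weights, labels, sig_to_bkg_fraction=1.0):
    """Rescale event weights so that sum(signal) = fraction * sum(background).

    Illustrative only (hypothetical helper, not TMVA code). Labels are 1 for
    signal and 0 for background. The total sum of weights after rescaling is
    set to the number of events, which is an assumed convention.
    """
    n_events = len(weights)
    sum_sig = sum(w for w, y in zip(weights, labels) if y == 1)
    sum_bkg = sum(w for w, y in zip(weights, labels) if y == 0)
    if sum_sig <= 0 or sum_bkg <= 0:
        raise ValueError("both classes need a positive sum of weights")

    f = sig_to_bkg_fraction
    # One scale factor per class; with f = 1 both classes end up with n_events / 2.
    norm_sig = n_events * f / ((1.0 + f) * sum_sig)
    norm_bkg = n_events / ((1.0 + f) * sum_bkg)
    return [w * (norm_sig if y == 1 else norm_bkg) for w, y in zip(weights, labels)]


# Imbalanced toy sample: 2 signal events and 8 background events, unit weights.
weights = [1.0] * 10
labels = [1, 1] + [0] * 8
balanced = equalise_class_weights(weights, labels)
sig_total = sum(w for w, y in zip(balanced, labels) if y == 1)
bkg_total = sum(w for w, y in zip(balanced, labels) if y == 0)
print(sig_total, bkg_total)
```

Without such a rescaling, a classifier trained on this toy sample could reach 80% accuracy by always answering "background", which is the uninteresting decision mentioned above.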

A quick test with the TMVAClassification example shows that the worst result is obtained with SkipNormalization enabled and NormMode=None.
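If you want to repeat such a comparison, the combinations can be enumerated roughly as below. This is only a sketch that builds option strings: the helper `option_strings` is hypothetical, `BoostType=Grad` is assumed because you use gradient boosting, and the remaining BDT options are left out. The strings would still have to be passed to the dataloader's PrepareTrainingAndTestTree call and to the BDT booking in TMVAClassification by hand.

```python
from itertools import product

# Settings compared in the test: dataloader NormMode and the BDT SkipNormalization flag.
norm_modes = ["None", "EqualNumEvents"]
skip_flags = [True, False]


def option_strings(norm_mode, skip_normalization):
    """Return (dataloader options, BDT options) for one combination.

    Hypothetical helper for illustration; only the options discussed in this
    thread are included, everything else is left at its default.
    """
    loader_opts = f"NormMode={norm_mode}"
    # BoostType=Grad is an assumption (gradient boosting, as in the question).
    bdt_opts = f"BoostType=Grad:SkipNormalization={skip_normalization}"
    return loader_opts, bdt_opts


configs = [option_strings(m, s) for m, s in product(norm_modes, skip_flags)]
for loader_opts, bdt_opts in configs:
    print(loader_opts, "|", bdt_opts)
```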

To answer your question specifically: disabling the internal BDT normalisation is completely fine.

The internal normalisation is an optimisation that was, to my understanding, developed for AdaBoost. Since you use gradient boosting, you can choose whichever normalisation you want.

Cheers,
Kim