Dear TMVA experts,
I’m trying to run a BDT classification with 12 variables. In a subsample of events, 3 of the 12 variables are missing and are assigned a default value of -9999, regardless of whether the event is signal or background. Although the BDT seems to work quite well, this looks to me like a perfect case where the category method could improve the classification (or at least I would expect it to perform no worse).
I tried to implement the classification with the category method, but I’m a bit perplexed by the results. I attach the macro I used for the training, the output text file, the distributions of the BDT output classifier, and a comparison of the ROC curves. In particular, the two-peak shape of the classifier distribution is quite worrisome to me.
I’m wondering whether I’m doing something wrong and, if so, how to implement the procedure correctly.
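To make the intended split concrete, here is a minimal sketch in plain C++ (not the attached macro) of the two category definitions I have in mind: one for events where the three variables are filled, trained on all 12 inputs, and one for events where at least one of them carries the default value, trained on the remaining 9. The struct `CategoryDef`, the helpers `join` and `makeCategories`, and any variable names passed in are hypothetical. It assumes the missing value is stored as exactly -9999, and that the two cuts should be mutually exclusive and together cover every event, which is how I understand the category method expects them.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One category: the events it selects and the inputs its sub-classifier uses.
// (Hypothetical helper type, not part of TMVA.)
struct CategoryDef {
    std::string cut;        // selection expression for this category
    std::string variables;  // colon-separated list of input variables
};

// Join a list of strings with a separator, e.g. {"a","b"} and ":" give "a:b".
std::string join(const std::vector<std::string>& parts, const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i > 0) out += sep;
        out += parts[i];
    }
    return out;
}

// Build two complementary categories:
//   [0] all optional variables are filled   -> use every variable
//   [1] at least one optional variable is
//       set to the sentinel                 -> use only the always-filled ones
// Assumption: a missing value is stored as exactly the sentinel string given.
std::vector<CategoryDef> makeCategories(const std::vector<std::string>& allVars,
                                        const std::vector<std::string>& optionalVars,
                                        const std::string& sentinel = "-9999") {
    assert(!optionalVars.empty() && optionalVars.size() < allVars.size());

    std::vector<std::string> filled;   // one "(var!=sentinel)" term per optional variable
    for (const auto& v : optionalVars) filled.push_back("(" + v + "!=" + sentinel + ")");

    std::vector<std::string> reduced;  // variables that are never missing
    for (const auto& v : allVars) {
        if (std::find(optionalVars.begin(), optionalVars.end(), v) == optionalVars.end())
            reduced.push_back(v);
    }

    const std::string complete = join(filled, "&&");
    std::vector<CategoryDef> cats;
    cats.push_back(CategoryDef{complete, join(allVars, ":")});
    // The second cut is the logical negation of the first, so the two
    // categories cannot overlap and no event is left out.
    cats.push_back(CategoryDef{"!(" + complete + ")", join(reduced, ":")});
    return cats;
}
```

Each `cut` and `variables` pair would then be passed as the first two arguments of `TMVA::MethodCategory::AddMethod`, followed by `TMVA::Types::kBDT`, a title, and the BDT option string, as in the ROOT tutorial TMVAClassificationCategory.C.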
Please let me know if I should provide any other information. The trees I’m using are quite large, but if needed I can select a subsample to test the macro.
Thank you very much.
TMVAClassificationCategory.C (13.1 KB)
TMVACategory_output_0_1_12.txt (49.0 KB)