Error fraction plot in BDT method

Dear TMVA users,

after running a BDT analysis on some data, I found a plot showing the error fraction as a function of the number of trees. As the number of trees increases, the error fraction increases as well, up to a certain number of trees, and then reaches a plateau.

I wanted to know how this error fraction is defined. I can’t seem to find any good information in the TMVA User’s Guide (which is a recurring issue, I must say…) or anywhere else for that matter.

Thank you very much.

Cheers,
K.


ROOT Version: 6.12/04
Platform: Ubuntu 16.04
Compiler: C/C++ 5.18.00


The error fraction plot shows the fraction of misclassified events for that particular tree (not the ensemble). It is expected that this number converges to 0.5, as at that point the classifier cannot extract any more information from the data.

Note: For the purpose of the plot, events are compared with a cut at BDT output 0 and are considered misclassified if signs of predicted class and target class mismatch.
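
To make the definition above concrete, here is a minimal Python sketch of the sign-mismatch rule. It is not the TMVA code: the function name `error_fraction`, the optional per-event `weights` argument, and the choice to count an output of exactly 0 as background are all assumptions made for illustration.

```python
def error_fraction(outputs, targets, weights=None):
    """Fraction of events whose predicted sign differs from the target sign.

    outputs: per-event response of a single tree (cut at 0)
    targets: true class, +1 for signal and -1 for background
    weights: optional per-event weights (assumption: unit weights if omitted)
    """
    if weights is None:
        weights = [1.0] * len(outputs)
    total = sum(weights)
    # An event is misclassified when the signs of output and target mismatch.
    # Assumption: an output of exactly 0 is treated as background.
    wrong = sum(
        w for y, t, w in zip(outputs, targets, weights) if (y > 0) != (t > 0)
    )
    return wrong / total


# Hypothetical toy values: two of the four events land on the wrong side of 0.
outputs = [0.8, -0.3, 0.1, -0.6]
targets = [1, 1, -1, -1]
print(error_fraction(outputs, targets))  # 0.5
```

A value of 0.5 means the tree does no better than a coin flip on a two-class problem, which matches the plateau described above.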

We are well aware of the lack of documentation. If you have concrete suggestions of improvement and/or opinions on which areas of documentation should be improved first, please get in touch with us!

Cheers,
Kim

Dear Kim,

thank you very much for your reply; it is much clearer thanks to your explanation. It seems to be a great tool to see how many trees to use and therefore avoid overtraining, if I am not mistaken.

Regarding the documentation, I understand that it might not be an easy thing to do considering the high number of available methods. Luckily, people (like you) are always very helpful and it’s a wonderful community to discuss with. I could perhaps suggest putting more comments in the .cpp files for the functions that are specific to a method rather than common to machine learning in general. For example, I don’t think it is necessary to describe the BDT response plot, as it is fairly understandable when one knows a little bit about machine learning, but plots like error fraction vs. number of trees could use some more description, as they are very specific to the BDT method.

Thanks again for your help.

Cheers,
Kévin

Hi Kévin,

Thanks for your kind words. We are trying to continuously improve the documentation in all of the .C/.cpp/.h/.hxx files and also the User’s Guide :slight_smile:

Cheers,
Kim
