Gradient Boost BDT: Decision or regression trees?

JanEric · January 24, 2024, 9:09am

Hi,

I am currently reading the TMVA User Guide and have a question regarding the gradient boosting of BDTs.

In a gradient-boosted BDT, are the individual trees that make up the final BDT actual decision trees, or are they regression trees? I am particularly unsure because of this sentence on page 68:

This is done by calculating the current gradient of the loss function and then growing a regression tree whose leaf values are adjusted to match the mean value of the gradient in each region defined by the tree structure. Iterating this procedure yields the desired set of decision trees which minimises the loss function.
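To make the quoted procedure concrete, here is a minimal sketch of generic gradient boosting for a two-class problem. It is not TMVA's implementation. It assumes labels of ±1, a binomial log-likelihood loss ln(1 + exp(-2yF)), one-dimensional inputs, depth-one regression trees, and a tanh mapping of the summed output to a score; all function names and the toy data are hypothetical.

```python
import math

def neg_gradient(y, f):
    # Negative gradient of the assumed loss ln(1 + exp(-2*y*f)) w.r.t. f, y in {-1, +1}.
    return 2.0 * y / (1.0 + math.exp(2.0 * y * f))

def fit_stump(xs, targets):
    # Depth-one regression tree on 1-D input: choose the cut with the smallest
    # squared error; each leaf value is the mean target in its region.
    best = None
    order = sorted(set(xs))
    for lo, hi in zip(order, order[1:]):
        cut = 0.5 * (lo + hi)
        left = [t for x, t in zip(xs, targets) if x < cut]
        right = [t for x, t in zip(xs, targets) if x >= cut]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((t - left_mean) ** 2 for t in left)
               + sum((t - right_mean) ** 2 for t in right))
        if best is None or sse < best[0]:
            best = (sse, cut, left_mean, right_mean)
    _, cut, left_mean, right_mean = best
    return cut, left_mean, right_mean

def predict_stump(stump, x):
    cut, left_mean, right_mean = stump
    return left_mean if x < cut else right_mean

def boost(xs, ys, n_trees=20, shrinkage=0.5):
    # Each iteration fits a regression tree to the current negative gradient.
    f = [0.0] * len(xs)
    trees = []
    for _ in range(n_trees):
        targets = [neg_gradient(y, fi) for y, fi in zip(ys, f)]
        stump = fit_stump(xs, targets)
        trees.append(stump)
        f = [fi + shrinkage * predict_stump(stump, x) for fi, x in zip(f, xs)]
    return trees

def score(trees, x, shrinkage=0.5):
    # Assumed mapping of the summed tree outputs to the interval [-1, 1].
    return math.tanh(sum(shrinkage * predict_stump(t, x) for t in trees))

# Hypothetical toy data: signal (+1) and background (-1) overlap in x.
xs = [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.7, 2.0]
ys = [-1, -1, 1, -1, 1, 1, -1, 1]
trees = boost(xs, ys)
leaf_values = [v for _, left_mean, right_mean in trees for v in (left_mean, right_mean)]
print(leaf_values[:4])
print([round(score(trees, x), 3) for x in xs])
```

In this sketch every tree is a regression tree: its leaves hold real-valued means of the gradient rather than a hard ±1 vote, even though the training labels are signal or background.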

Looking at the code, I also could not find a clear indication of whether each individual tree only outputs ±1 (decision tree) or whether any value in between is also possible (regression tree).

Cheers
Jan-Eric

bellenot · January 24, 2024, 9:11am

Maybe @moneta can help

moneta · January 24, 2024, 1:53pm

Hello

In TMVA, gradient-boosted BDTs can be used both for classification problems (decision trees) and for regression.

Cheers

Lorenzo

JanEric · January 24, 2024, 3:19pm

Hi Lorenzo,

Thanks for the quick reply. We are training BDTs for classification, meaning that in the training input every event is labeled as either signal or background.

Obviously, the resulting final BDT score for any event can end up anywhere from -1 to 1. I was just wondering whether, in this case, all of the individual trees in the BDT are decision trees, or whether they are regression trees regardless when gradient boosting is used. I was a bit confused by the explicit mention of regression trees directly followed by the general term "decision trees".

Cheers
Jan-Eric