MVA::BDT : How to disable MaxDepth

Hi, I am trying to use BDT for random forest regression. I want to disable the MaxDepth option in training so that
the decision trees keep growing until MinNodeSize is reached at every node, i.e. until only leaf nodes remain. (Something equivalent to
max_depth=None in scikit-learn's RandomForestRegressor.)

Hi,

It is currently not possible to explicitly disable the BDT max depth. What you can do instead is provide a very large number as the upper bound; something like 2000000000 should be reasonable.

E.g.
TString config = "...nCuts=20:MinNodeSize=10:MaxDepth=2000000000";
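If the depth needs to be varied between trainings, the same fragment can be assembled programmatically. This is only a minimal sketch in plain C++: the helper name `bdtDepthOptions` is hypothetical, it reproduces just the three options shown above, and the part of the string elided as "..." is not reconstructed.

```cpp
#include <string>

// Hypothetical helper: builds only the option fragment quoted in the post.
// Everything that precedes it in the real booking string (the "..." part)
// is left to the caller.
std::string bdtDepthOptions(int nCuts, int minNodeSize, int maxDepth) {
    return "nCuts=" + std::to_string(nCuts) +
           ":MinNodeSize=" + std::to_string(minNodeSize) +
           ":MaxDepth=" + std::to_string(maxDepth);
}
```

Calling `bdtDepthOptions(20, 10, 2000000000)` gives the same fragment as in the line above; the result can then be appended to the rest of the option string.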

Hi,
Thanks for the reply.
I figured the max depth required for my case would be around 50, so I set it to that.
However, the .xml weight file that gets written is very large, on the order of GBs. I don’t know of a
workaround for that, but it's OK. Currently it does not pose any big problem for me.

Hi,

Glad it worked.

Having up to ~2^50 * nTrees nodes in your BDT forest will take up a large amount of space :)
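To put rough numbers on that, here is a small sketch of the arithmetic in plain C++ (not TMVA code). It assumes binary trees with the root at depth 0, and that every leaf must hold at least some minimum number of training events; the function names and the event counts used below are made up for illustration. The depth alone allows 2^(depth+1) - 1 nodes per tree, but the training data usually gives a much tighter limit.

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Upper bound from depth alone: a binary tree with the root at depth 0
// has at most 2^(depth+1) - 1 nodes. Saturates for depth >= 63.
std::uint64_t maxNodesForDepth(unsigned depth) {
    if (depth >= 63) return std::numeric_limits<std::uint64_t>::max();
    return (std::uint64_t{1} << (depth + 1)) - 1;
}

// Upper bound from the data: each leaf needs at least minEventsPerLeaf
// events, so there are at most nEvents / minEventsPerLeaf leaves, and a
// binary tree whose internal nodes all have two children has
// 2 * leaves - 1 nodes.
std::uint64_t maxNodesForData(std::uint64_t nEvents,
                              std::uint64_t minEventsPerLeaf) {
    const std::uint64_t minEvents = std::max<std::uint64_t>(minEventsPerLeaf, 1);
    const std::uint64_t leaves = nEvents / minEvents;
    if (leaves == 0) return 1;  // not enough events to split: root only
    return 2 * leaves - 1;
}

// The per-tree bound is whichever of the two limits is smaller.
std::uint64_t maxNodes(unsigned depth, std::uint64_t nEvents,
                       std::uint64_t minEventsPerLeaf) {
    return std::min(maxNodesForDepth(depth),
                    maxNodesForData(nEvents, minEventsPerLeaf));
}
```

For example, with a depth of 50, a hypothetical 1000000 training events and at least 10 events per leaf, `maxNodes(50, 1000000, 10)` is 199999 nodes per tree, far below the depth-only bound of 2^51 - 1. Multiply by nTrees for the whole forest.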

I can think of two methods to deal with this should it become a problem. One is to use the tree pruning capabilities of TMVA. I have not used them personally so I cannot comment on how effective they are.

The other would be to compress the XML. This would be for storage only, as TMVA currently cannot read classifiers from compressed files.

The random forest algorithm needs the trees to be grown fully, without pruning, so pruning may not be a good idea.
Compressing the files, or converting them to a binary format and then compressing, could be a good solution for my case.
Thanks.