Severe ROOT version dependence of BDT training running time

Dear experts,

I am attempting to train a BDT to distinguish signal from background in my analysis, and I noticed something peculiar when switching from ROOT version 6.10.09 to 6.12.07: with the same configuration and input tree, the running time of the training differs by a factor of roughly 5 (~25 minutes in 6.12 versus less than 5 minutes in 6.10), and this happens consistently. Not only that, but the training in 6.12.07 also suffers much more from overtraining than the one in 6.10.09. The BDT was trained with the following configuration:

factory->BookMethod( dataloader, TMVA::Types::kBDT, "BDTG_200Cuts_Depth4_baggedGrad_1000trees_shrinkage0p1", "!H:!V:NTrees=1000:MinNodeSize=5%:BoostType=Grad:Shrinkage=0.1:nCuts=200:MaxDepth=4:IgnoreNegWeightsInTraining:UseBaggedGrad=True:DoBoostMonitor=True" );
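For reference, the factory and dataloader around this call are set up roughly as below. This is a minimal sketch to make the setup reproducible; the file, tree, and variable names are placeholders, not the actual ones from my analysis.

#include "TFile.h"
#include "TTree.h"
#include "TMVA/Factory.h"
#include "TMVA/DataLoader.h"
#include "TMVA/Types.h"

void trainBDT() {
    // Placeholder input: one signal tree and one background tree
    TFile* inputFile = TFile::Open("trainingTrees.root");
    TTree* signalTree     = (TTree*) inputFile->Get("signalTree");
    TTree* backgroundTree = (TTree*) inputFile->Get("backgroundTree");

    TFile* outputFile = TFile::Open("TMVA_output.root", "RECREATE");

    TMVA::Factory* factory = new TMVA::Factory("TMVAClassification", outputFile,
        "!V:!Silent:Color:DrawProgressBar:AnalysisType=Classification");

    TMVA::DataLoader* dataloader = new TMVA::DataLoader("dataset");
    dataloader->AddVariable("var1", 'F');   // placeholder variables
    dataloader->AddVariable("var2", 'F');
    dataloader->AddSignalTree(signalTree, 1.0);
    dataloader->AddBackgroundTree(backgroundTree, 1.0);
    dataloader->PrepareTrainingAndTestTree("", "SplitMode=Random:NormMode=NumEvents:!V");

    factory->BookMethod(dataloader, TMVA::Types::kBDT,
        "BDTG_200Cuts_Depth4_baggedGrad_1000trees_shrinkage0p1",
        "!H:!V:NTrees=1000:MinNodeSize=5%:BoostType=Grad:Shrinkage=0.1:nCuts=200:"
        "MaxDepth=4:IgnoreNegWeightsInTraining:UseBaggedGrad=True:DoBoostMonitor=True");

    factory->TrainAllMethods();
    factory->TestAllMethods();
    factory->EvaluateAllMethods();

    outputFile->Close();
}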

Does anyone know what might cause this? Given the very different results, should I trust the 6.10 or the 6.12 training more? Thanks in advance.

regards,

Willem

Hi,

I am not aware of any changes that should impact the running time.

For the overtraining sensitivity, there was a change in how the gradient boosting is calculated. This will lead to differences when you have either a multi-class setting or very unbalanced classes.

Are you using the factory option "!DrawProgressBar"? If not, try adding it.
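For example, assuming a standard factory construction, the option string would become:

TMVA::Factory* factory = new TMVA::Factory("TMVAClassification", outputFile,
    "!V:!Silent:Color:!DrawProgressBar:AnalysisType=Classification");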

It would be good for us to understand the situation better, would it be possible for you to share your training script, with data, or a similar setup that reproduces your problem?

Cheers,
Kim

Hi Kim,

Thanks for the quick reply. What exactly do you mean by "very unbalanced classes"? To give an example of the difference in overtraining, I attached the output (training and test samples overlaid) for the two ROOT versions I used. I also included the ROOT files that TMVA outputs and the code I used for the training. All of it can be found at:

https://wverbeke.web.cern.ch/wverbeke/TMVATest/

I am not sure I can share the trees I used for the training, since these are based on central CMS simulations. You can see that in the 6.12 plot the BDT output is essentially flat in the test sample, so it is completely overtrained, while in 6.10 there is reasonable agreement between training and test (with a little overtraining). From these plots it seems that the 6.10 training performs much better than the 6.12 training with the same settings. The option "DrawProgressBar" was set to true, but the timing it displayed seemed to be correct, since the 6.12 training did take noticeably longer. I hope this info helps.

regards,

Willem

I also have one small additional question. Do you know why the output range of the BDT is [-0.5, 1] in version 6.10 in this case, but [-1, 1] in 6.12? I would like the output to always be in the range [-1, 1]; is there an easy way inside TMVA to set the output range?

Hi,

Sorry for the delayed reply. I will hopefully have time to look into this problem in a bit more detail this weekend.

I don’t need the exact trees, only a training script that will reproduce the observed problem.

By unbalanced classes I mean, for example, the signal class using 500 raw events and the background class using 100000.
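As an illustration only (the numbers are just an example of such an imbalance, not a recommendation), this is what such a setup could look like in the dataloader options:

// Strongly unbalanced training sample: 500 raw signal events vs 100000 background events
dataloader->PrepareTrainingAndTestTree("",
    "nTrain_Signal=500:nTrain_Background=100000:SplitMode=Random:NormMode=NumEvents:!V");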

Did you try running the code with "!DrawProgressBar"? A bug was discovered which could make the running time unnecessarily long when the progress bar is enabled.

The output of the BDTs in TMVA should always have [-1, 1] as its range. In 6.10 the BDT could have had problems converging on the class labels for background, thus limiting the output.
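As far as I know there is no TMVA option to force the range, but if you need a fixed [-1, 1] output regardless of the version, one workaround is to rescale the score after evaluation. A rough sketch (the range values are just the ones you observed):

// Linearly map a score from an observed range [lo, hi] to [-1, 1]
double rescaleScore(double score, double lo, double hi) {
    return 2.0 * (score - lo) / (hi - lo) - 1.0;
}

// e.g. for the 6.10 training, where the output ends up in roughly [-0.5, 1]:
// double mapped = rescaleScore(bdtOutput, -0.5, 1.0);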

Cheers,
Kim

Hi,

In the end I did not have time to look into this during the weekend. I will try to find some time during the week.

Cheers,
Kim

Hi,

Sorry for the extreme delay on this.

It seems the input files are missing. Could you add these?

Cheers,
Kim