I’ve a fairly simple question. I’m currently training a BDT with the following settings:
I noticed in the TMVA manual that GradBaggingFraction is deprecated, and I should instead switch to BaggedSampleFraction. In doing so, however, there’s a noticeable increase in the amount of overtraining I see for my signal.
For this to happen, I imagine there must be some fairly important differences between these methods. What are they exactly? Sorry if this has already been asked; I’ve not managed to find much information on the topic.
That’s weird; they should be exactly the same. (Really, the two options set the same internal variable.) Might it be that something else changed between your runs?
If it is indeed persistent, a small script that reproduces the problem would be helpful.
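For concreteness, the two spellings would show up in a BDT booking string roughly like this. This is only a sketch assuming a standard TMVA Factory/DataLoader setup; the option names are from the TMVA manual, but the values (NTrees, Shrinkage, 0.5) are placeholders, not the poster’s actual settings:

```cpp
// Sketch: both spellings of the bagging-fraction option for a
// gradient-boosted BDT. Values are illustrative placeholders.

// Current spelling:
factory->BookMethod(dataloader, TMVA::Types::kBDT, "BDT_new",
    "NTrees=800:BoostType=Grad:Shrinkage=0.10:"
    "UseBaggedBoost:BaggedSampleFraction=0.5");

// Deprecated spelling -- kept for backwards compatibility and
// expected to set the same internal fraction:
factory->BookMethod(dataloader, TMVA::Types::kBDT, "BDT_old",
    "NTrees=800:BoostType=Grad:Shrinkage=0.10:"
    "UseBaggedGrad:GradBaggingFraction=0.5");
```

If the two bookings above are trained on the same data with the same seed and give different results, that would point to a genuine difference; otherwise the change likely came from another option.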
So in trying to construct a simple script that can demonstrate the issue, I’ve managed to prove to myself that the methods do in fact give the same output. So all is well on the TMVA end - I must have accidentally changed another setting.
Sorry for wasting your time. I suppose while I’m here I might as well ask: if the methods give the same results, why was BaggedSampleFraction written to replace GradBaggingFraction?
Glad that you worked your problem out.
One of the options (I think it was GradBaggingFraction) was implemented first; later there was a spring cleaning to make the naming clearer. The old option was kept for backwards compatibility.