Hi experts,
I am using TMVA for classification training, with signal and background events read from separate ROOT files. After a preselection cut I end up with more signal than background events (see the training output below).
: Number of training and testing events
: ---------------------------------------------------------------------------
: Signal -- training events : 1295451
: Signal -- testing events : 1295451
: Signal -- training and testing events: 2590902
: Background -- training events : 175046
: Background -- testing events : 175046
: Background -- training and testing events: 350092
: Dataset[dataset] : Background -- due to the preselection a scaling factor has been applied to the numbers of requested events: 0.135124
When it comes to the training stage, I expected 350092 events for both signal and background, but the number of events used turned out to be almost twice what I expected:
#events: (reweighted) sig: 735248 bkg: 735248
#events: (unweighted) sig: 1295451 bkg: 175046
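For what it's worth, here is a quick arithmetic check I did on the numbers above. It is only a sketch under my own assumption (not something I have confirmed in the TMVA documentation): that each class is rescaled to the same sum of weights while the total stays equal to the number of unweighted training events. The names `target`, `w_sig` and `w_bkg` are mine, not TMVA's.

```python
# Unweighted training counts copied from the TMVA log above.
n_sig = 1295451
n_bkg = 175046

# Assumption (mine): both classes are scaled to the same sum of weights,
# and the total sum of weights equals the total number of training events.
target = (n_sig + n_bkg) / 2

# Hypothetical per-event weights that would produce that sum for each class.
w_sig = target / n_sig
w_bkg = target / n_bkg

print(f"target sum of weights per class: {target}")
print(f"per-event weight: sig {w_sig:.4f}, bkg {w_bkg:.4f}")

# Ratio of unweighted background to signal events, for comparison with the
# scaling factor printed in the log.
print(f"bkg/sig ratio: {n_bkg / n_sig:.6f}")
```

If this reading is right, the reweighted 735248 would be a sum of weights rather than a count of events, and the bkg/sig ratio would match the 0.135124 scaling factor in the log, but I would like someone to confirm it.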
My question is: how does the TMVA Factory reweight these events?
Am I getting duplicated events for my background?
Also, I would like to ask what a weight is and what purpose it serves in TMVA’s training.
Thanks.
ROOT Version: 6.24 (PyROOT via conda)
Platform: CentOS 7
Compiler: gcc 9