Negative weight when using BDTG

Dear Experts,
I have run into a problem with TMVA. My training samples contain some events with negative weights. After applying BDTG, the “overtraining check” failed:
— Found directory for method: BDT::BDTG containing MVA_BDTG_S/_B
— Mean and RMS (S): 0.21677, 0.335244
— Mean and RMS (B): -0.277857, 0.381702
— Found comparison histograms for overtraining check
— Perform Kolmogorov-Smirnov tests
Error in TH1D::ComputeIntegral: Bin content is negative - return a NaN value

So what is the reason Mean(B) is less than 0?

Best,
Leon

Welcome to the ROOT forum! And I think @moneta will be happy to help you!

Glad to hear that :smile: I am quite new to TMVA. The environment is tmva4.20 with root5.34.

Hi and welcome to the ROOT (online) community!

The reason for the error is most probably that you are using negative weights.

You can either:

  1. reduce the number of bins (i.e. use wider bins) in the generated histograms by inserting (adapted from Kolmogorov-Smirnov test values) (TMVA::gConfig().GetVariablePlotting()).fNbinsMVAoutput = 25; at the top of your training script (default is fNbinsMVAoutput = 40),

  2. or use a larger training sample!

Details

Negative event weights are sometimes included as correction terms in simulations; however, the events themselves are unphysical. A histogram modelling a probability distribution cannot contain bins with a negative sum; events with negative weight only make sense on average. This is what TMVA is complaining about! (The implication is that, with the current training data, the region of the histogram with a negative sum is poorly modelled.)
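
To make the binning argument concrete, here is a small standalone sketch in plain C++ (no ROOT needed). It is not how TMVA fills its histograms internally; the names Event, toyEvents, fillHistogram and countNegativeBins are made up for this illustration, and the scores and weights are invented toy values. It only shows that the same weighted events can give negative bin sums with 40 bins but not with 25.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy event: a classifier score in [-1, 1) and an event weight.
struct Event {
    double score;
    double weight;
};

// Invented toy sample: two negative-weight events sit close to
// positive-weight neighbours, as correction terms typically do.
std::vector<Event> toyEvents() {
    return {
        {-0.62,  1.0}, {-0.57, -0.4}, {-0.53,  1.0},
        { 0.21,  1.0}, { 0.26, -0.3}, { 0.31,  1.0},
    };
}

// Fill an equal-width histogram over [-1, 1) with the sum of weights per bin.
std::vector<double> fillHistogram(const std::vector<Event>& events,
                                  std::size_t nBins) {
    assert(nBins > 0);
    const double lo = -1.0;
    const double hi = 1.0;
    std::vector<double> bins(nBins, 0.0);
    for (const Event& e : events) {
        if (e.score < lo || e.score >= hi) {
            continue;  // ignore out-of-range scores in this sketch
        }
        std::size_t idx =
            static_cast<std::size_t>((e.score - lo) / (hi - lo) * nBins);
        if (idx >= nBins) {
            idx = nBins - 1;  // guard against rounding at the upper edge
        }
        bins[idx] += e.weight;
    }
    return bins;
}

// Count bins whose summed weight is negative (the case the KS test rejects).
std::size_t countNegativeBins(const std::vector<double>& bins) {
    std::size_t n = 0;
    for (double content : bins) {
        if (content < 0.0) {
            ++n;
        }
    }
    return n;
}
```

With these toy numbers, countNegativeBins(fillHistogram(toyEvents(), 40)) finds two negative bins, because each negative-weight event lands in a bin of its own. With 25 bins each of them shares a bin with a positive-weight neighbour and no bin sum is negative. A larger sample helps for the same reason: more positive-weight events end up in every bin.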