Dear Experts,
I have run into a problem with TMVA. My training samples contain some events with negative weights. After applying BDTG, the “overtraining check” failed:
— Found directory for method: BDT::BDTG containing MVA_BDTG_S/_B
— Mean and RMS (S): 0.21677, 0.335244
— Mean and RMS (B): -0.277857, 0.381702
— Found comparison histograms for overtraining check
— Perform Kolmogorov-Smirnov tests
Error in TH1D::ComputeIntegral: Bin content is negative - return a NaN value
The reason for the error is most probably that you are using negative weights.
You can either:
reduce the number of bins in the generated histograms by inserting (TMVA::gConfig().GetVariablePlotting()).fNbinsMVAoutput = 25; at the top of your training script (the default is fNbinsMVAoutput = 40; adapt the value to the histograms used for the Kolmogorov-Smirnov test),
or use a larger training sample!
Details
Negative event weights are sometimes included as correction terms in simulations, but the individual events are unphysical. A histogram modelling a probability distribution cannot contain bins with a negative sum; events with negative weight only make sense on average. This is what TMVA is complaining about! (The implication is that, with the current training data, the region of the histogram that has a negative sum is poorly modelled.)
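To illustrate why fewer bins (or more events) can help, here is a minimal standalone C++ sketch. It does not use ROOT or TMVA; the names WeightedEvent, countNegativeBins and sampleEvents, as well as the event values, are made up for illustration. It sums the weights per bin over the range [-1, 1] and counts the bins whose sum is negative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one training event: classifier output and weight.
struct WeightedEvent {
    double mvaOutput;
    double weight;
};

// Sum the weights per bin over [lo, hi] and count bins whose sum is negative.
std::size_t countNegativeBins(const std::vector<WeightedEvent>& events,
                              std::size_t nBins, double lo, double hi) {
    std::vector<double> sums(nBins, 0.0);
    for (const auto& ev : events) {
        if (ev.mvaOutput < lo || ev.mvaOutput > hi) continue;  // skip out-of-range events
        std::size_t bin =
            static_cast<std::size_t>((ev.mvaOutput - lo) / (hi - lo) * nBins);
        if (bin == nBins) bin = nBins - 1;  // put the upper edge into the last bin
        sums[bin] += ev.weight;
    }
    std::size_t negative = 0;
    for (double s : sums) {
        if (s < 0.0) ++negative;
    }
    return negative;
}

// Made-up toy sample: each negative-weight event has a positive-weight neighbour.
std::vector<WeightedEvent> sampleEvents() {
    return {
        {-0.41, 1.0},
        {0.17, 1.0},
        {0.19, -0.3},
        {0.93, 1.0},
        {0.97, -0.5},
    };
}
```

With this toy sample, countNegativeBins(sampleEvents(), 40, -1.0, 1.0) finds one negative bin, because the event at 0.97 sits alone in the last bin, while the same call with 25 bins finds none, because the wider bin also contains the positive-weight event at 0.93. Coarser binning is not guaranteed to remove every negative bin in a real sample; it only makes it more likely that negative weights are averaged with enough positive ones.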