I am dealing with an effective field theory scenario where my signal (the interference term) has almost equal proportions of negative- and positive-weight events. To train on these samples, I was following this suggestion: TMVA Toolkit for Multi Variate Analysis / Re: [TMVA-users] BDT training with negative weights., i.e., using the BDTG method with the “Pray” option. However, I am getting the following error and am unable to train on the samples.
: signal and background histograms have different or invalid dimensions:
I am attaching my code and the problematic root file required for the training.
I would greatly appreciate it if you could guide me on how to train on such samples properly.
Hi,
Thank you for sharing the files; I will try to reproduce your problem. However, I think that if the negative weights dominate over the positive ones, this can be problematic, because some statistics, such as the variances of the input variables, can end up being negative. This could be the cause of the errors you observe.
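To illustrate the point about negative variances, here is a minimal standalone sketch in plain C++ (no ROOT needed). The function name weightedVariance and the toy numbers are my own, chosen only for illustration; this is not how TMVA computes its statistics internally, just the textbook weighted variance.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Textbook weighted variance: sum(w * (x - mean)^2) / sum(w),
// with mean = sum(w * x) / sum(w).
// Nothing in this formula keeps the result >= 0 once some weights are negative.
// Assumes sum(w) != 0; with nearly cancelling weights the result is also unstable.
double weightedVariance(const std::vector<double>& x,
                        const std::vector<double>& w) {
    assert(x.size() == w.size());
    double sumW = 0.0;
    double sumWX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sumW += w[i];
        sumWX += w[i] * x[i];
    }
    const double mean = sumWX / sumW;

    double sumWD2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - mean;
        sumWD2 += w[i] * d * d;
    }
    return sumWD2 / sumW;
}
```

With x = {0, 1, 2}, the weights {1, 1, 1} give a variance of 2/3, while the mixed-sign weights {-1, 3, -1} give -2: the mean is still 1, but the two outer events enter the sum with negative weight. Any transformation that takes a square root of such a quantity, or inverts a covariance matrix built from it, is then in trouble.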
Many thanks. I understand the practical difficulty. However, for NLO samples as well as interference samples, such issues can become important physics-wise.
I would be very happy to know if there’s a workaround for such scenarios, even if it’s not BDTG.
Hi,
Looking at your macro, I see that the error you are getting is caused by negative variances.
These are used by some of the input transformations, such as decorrelation.
Just use minimal transformations; for example, Transformations=I,G,P seems to work for me on your input data.
Thanks a lot. Indeed, changing to Transformations=I,G,P runs without any run-time errors. However, the results do not make sense to me: the input variables are displayed in a form that I don’t expect, and the same is true for the BDT variable.
I am still working with the “Pray” option.
I would really appreciate it if you could take a look at the plots.
Hi,
Sorry for my late reply. I could run your code on your file. The results seem ok to me, but it is difficult to judge since I don’t know your problem.
In any case, here is the output ROOT file I get: TMVA.root (2.2 MB)
Thanks a lot for replying and for checking the code. I just have a few questions and clarifications to make.
Did you end up using the following transformations (the code is also attached)?
TMVA::Factory *factory = new TMVA::Factory( "TMVAClassification", outputFile,
"!V:!Silent:Color:DrawProgressBar:Transformations=I,G,P" );
With these transformations, I am having a hard time understanding the input variables, the BDTG score and the significance graph. Are you also getting similar plots? I am unable to make any sense of them. The BDT score distribution I am used to seeing looks quite different from what is shown in this graph.
Apologies for bothering you again, but I would really appreciate it if you could help me understand these graphs, or tell me whether you used some other tweaks in your code. Maybe I am missing something.
Hi,
I have been using only “I” as the transformation. The others might have problems because of the negative weights.
You might need some dedicated transformations for your case that correctly take negative weights into account.
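As a rough idea of what such a dedicated transformation could look like, here is a standalone sketch in plain C++ that standardises one input variable using moments computed with the absolute values of the weights, so the variance cannot go negative. This is only an assumption on my side about one possible approach: standardizeWithAbsWeights is a hypothetical helper, not a TMVA class or option, and whether using |w| is physically acceptable depends on your analysis.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of TMVA): standardise x to zero mean and
// unit spread, computing the moments with |w| instead of w.
// The variance is then non-negative by construction.
// Only the preprocessing uses |w|; the signed weights are left untouched
// for whatever is done with them afterwards.
std::vector<double> standardizeWithAbsWeights(const std::vector<double>& x,
                                              const std::vector<double>& w) {
    assert(x.size() == w.size());
    double sumA = 0.0;
    double sumAX = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double a = std::fabs(w[i]);
        sumA += a;
        sumAX += a * x[i];
    }
    assert(sumA > 0.0);
    const double mean = sumAX / sumA;

    double sumAD2 = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double d = x[i] - mean;
        sumAD2 += std::fabs(w[i]) * d * d;
    }
    const double sigma = std::sqrt(sumAD2 / sumA);

    // If all values coincide, sigma is 0 and the output stays at 0.
    std::vector<double> z(x.size(), 0.0);
    if (sigma > 0.0) {
        for (std::size_t i = 0; i < x.size(); ++i) {
            z[i] = (x[i] - mean) / sigma;
        }
    }
    return z;
}
```

The transformed values could then be written to a new tree and given to TMVA with Transformations=I, so that no weight-dependent transformation is applied inside TMVA. Note that using |w| changes the meaning of the moments: they describe the spread of the generated events, not of the weighted physical distribution.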