With multiple signal and background types, handling one signal or background type divided over multiple trees using weights

jwruss · January 14, 2019, 6:14pm

Hi Kim,

Thank you for your responses. I have reformatted the second question a little bit. If I don’t explicitly set dataLoader->AddBackgroundTree(treeB2, 1./6326631), like I do with treeB1a through treeB1d above, what happens if I don’t explicitly set weights?

MVAAllUpdate.C (14.2 KB)

This question relates back to the above file from this post. With the way I have set up my TMVA::DataLoader() objects here in the file, is my goal of treating the collection of thermal background trees (thermal0SumTree through thermal3SumTree) as a single background achieved with setting

int numThermal = thermal0SumTree -> GetEntries() + thermal1SumTree -> GetEntries() + thermal2SumTree -> GetEntries() + thermal3SumTree -> GetEntries();
double thermalWeight = 1. / numThermal;

dataLoader -> AddBackgroundTree(thermal0SumTree, thermalWeight);
dataLoader -> AddBackgroundTree(thermal1SumTree, thermalWeight);
dataLoader -> AddBackgroundTree(thermal2SumTree, thermalWeight);
dataLoader -> AddBackgroundTree(thermal3SumTree, thermalWeight);

Or if I do this, would I also have to also manually set weights to the other background trees as well? The other backgrounds and signals are successfully contained in single trees, unlike this one thermal background which is split over 4 trees. I worry that if I just add the 4 thermal trees as background trees but without weights, their differing number of events could bias collecting training and testing samples towards one of the thermal trees over the others, when I would like sampling to be equally likely between these trees.