TMVA: adding several backgrounds

Hi!

I am trying to train some BDTs with TMVA. I have one signal and several backgrounds, each with a different number of entries.
Let’s say I have 1 million signal events, and I want a signal-to-background ratio of 1:1.
My background trees, however, contain 7, 2.5 and 0.5 million events.

So what I want is to scale those down to 0.7, 0.25 and 0.05 million events. I am not completely sure whether handing over the complete trees with a weight, i.e. doing something like

dataloader->AddBackgroundTree(bgtree1, 0.1);
dataloader->AddBackgroundTree(bgtree2, 0.1);
dataloader->AddBackgroundTree(bgtree3, 0.1);

achieves what I want to get here.

Or should I instead use weights that add up to 1, like
dataloader->AddBackgroundTree(bgtree1, 0.7);
dataloader->AddBackgroundTree(bgtree2, 0.25);
dataloader->AddBackgroundTree(bgtree3, 0.05);
so that TMVA gets the correct ratio of each background?

I would be grateful for some clarification: do I need to adjust my trees manually outside of TMVA to get the right amount of each background component, or does one of the above do what I think?

Thanks a lot!

Alex


I guess @moneta can give you some help.

Hi,

You can either do the scaling manually as you propose or let TMVA take care of it for you. With the manual approach, the tree weight is applied to every event in the tree individually. Thus, if you want to scale 7 million events down to 0.7 million effective events, you’d use your first alternative:

dataloader->AddBackgroundTree(bgtree1, 0.1);
dataloader->AddBackgroundTree(bgtree2, 0.1);
dataloader->AddBackgroundTree(bgtree3, 0.1);
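
To make the arithmetic behind the two weight choices explicit, here is a minimal standalone C++ sketch (no ROOT needed). It only multiplies entries by tree weights; the helper names effectiveEvents and totalEffective are made up for this illustration, and it assumes every event carries unit event-level weight, so only the tree weight matters.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Effective (weighted) number of events per tree: entries * tree weight.
// Assumption: no additional per-event weights are set.
std::vector<double> effectiveEvents(const std::vector<double>& entries,
                                    const std::vector<double>& treeWeights) {
    assert(entries.size() == treeWeights.size());
    std::vector<double> effective(entries.size());
    for (std::size_t i = 0; i < entries.size(); ++i) {
        effective[i] = entries[i] * treeWeights[i];
    }
    return effective;
}

// Sum of the effective event counts over all trees.
double totalEffective(const std::vector<double>& effective) {
    double total = 0.0;
    for (double e : effective) {
        total += e;
    }
    return total;
}
```

For entries of 7, 2.5 and 0.5 million, a common weight of 0.1 gives 0.7, 0.25 and 0.05 million effective events, 1 million in total, which matches the 1 million signal events. The weights 0.7, 0.25 and 0.05 would instead give 4.9, 0.625 and 0.025 million, about 5.55 million in total, so they neither produce a 1:1 ratio nor keep the original background composition.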

It is also possible to let TMVA take care of this normalisation for you. In the dataloader’s PrepareTrainingAndTestTree call you can specify the option NormMode=NumEvents or NormMode=EqualNumEvents. According to the option description in the User’s Guide, the former renormalises each class independently to an average weight of 1 per event, while the latter does so for the signal and rescales the background so that its sum of weights equals that of the signal, i.e. both classes end up with the same number of effective events.
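
As a rough illustration of what equalising the effective event counts means numerically, the following standalone C++ sketch computes the single factor by which all background weights would have to be multiplied so that their sum matches the signal. This is not TMVA code: backgroundScaleFactor is a hypothetical helper, and the sketch assumes that all three trees end up in the same background class and that the renormalisation is one global factor per class.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: factor that rescales the whole background class so that
// its sum of weights equals the signal sum of weights.
// Assumption: one common factor for the class, so the relative contribution
// of each background tree is left unchanged.
double backgroundScaleFactor(double signalSumOfWeights,
                             const std::vector<double>& backgroundSumsOfWeights) {
    double backgroundTotal = 0.0;
    for (double w : backgroundSumsOfWeights) {
        backgroundTotal += w;
    }
    assert(backgroundTotal > 0.0);
    return signalSumOfWeights / backgroundTotal;
}
```

With 1 million signal events and backgrounds of 7, 2.5 and 0.5 million events at unit weight, the factor comes out as 0.1, the same value as the manual tree weight used above.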

For further information you can see the TMVA User’s guide chapter “Preparing the training and test data”.

You should also get a textual output after dataloader->PrepareTrainingAndTestTree(...) where you can verify the number of effective events in each class.

Cheers,
Kim

Thanks a lot Kim for the clarifications!

Best,
Alex
