Hello,
I have a follow-up question stemming from the discussion in this thread.
Suppose one of the classification samples has n0 events, with each event having an importance weight (w_i) between 0 and 1. Also suppose I have at least 2 other classification samples with n1 and n2 events, but with no importance weights assigned to them. For all intents and purposes, the n1 and n2 events in these samples can then be thought of as all having importance weights of 1.
The question is this: if I want to take the importance weights into account in addition to the sample sizes of these three samples, then when I implement the lines
dataloader.AddSignalTree(& chain0);
dataloader.AddBackgroundTree(& chain1, n0 / n1);
dataloader.AddBackgroundTree(& chain2, n0 / n2);
should I replace n0, n1, n2 in the weights n0 / n1 and n0 / n2 with the sum of the weights in each corresponding sample? Or should I replace n0, n1, n2 with the corresponding Kish’s effective sample sizes? Or should I keep n0, n1, n2 as they initially are?
Initially, after a colleague suggested it, I thought I should just sum the weights, as in the first option. Now I’m wondering whether I should use the second option instead, because the same colleague later pointed out that the sum of weights for a given class isn’t necessarily equivalent to the effective number of events in that class. In either option, n1 and n2 in the weights n0 / n1 and n0 / n2 would not change, since all of their event weights are 1, but n0 would.
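To make the options concrete, here is a minimal sketch of the quantities I am comparing. The helper names (sumOfWeights, kishEffectiveSize, treeWeight) are hypothetical and are not part of TMVA. I am assuming the w_i of the first sample have already been read into a std::vector<double>, and that Kish's effective sample size means (sum of w_i)^2 / (sum of w_i^2).

```cpp
#include <cassert>
#include <vector>

// Option 1: replace n0 with the plain sum of the importance weights w_i.
double sumOfWeights(const std::vector<double>& w) {
    double s = 0.0;
    for (double wi : w) s += wi;
    return s;
}

// Option 2: replace n0 with Kish's effective sample size,
// n_eff = (sum w_i)^2 / (sum w_i^2). Returns 0 for an empty or all-zero sample.
double kishEffectiveSize(const std::vector<double>& w) {
    double s = 0.0, s2 = 0.0;
    for (double wi : w) {
        s += wi;
        s2 += wi * wi;
    }
    return s2 > 0.0 ? (s * s) / s2 : 0.0;
}

// Hypothetical helper: the ratio that would be passed as the second argument
// of AddBackgroundTree, i.e. n0 / n1 with n0 replaced by whichever
// "size" of the weighted sample is chosen.
double treeWeight(double signalSize, double backgroundSize) {
    assert(backgroundSize > 0.0);
    return signalSize / backgroundSize;
}
```

For example, the call would become something like dataloader.AddBackgroundTree(& chain1, treeWeight(kishEffectiveSize(w), n1)); under option 2. Both helpers return n for a sample whose weights are all 1, which is why n1 and n2 are unaffected. They differ for n0: if every w_i were 0.5, the sum of weights would be n0 / 2, while Kish's effective sample size would still be n0.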