SumW2Error(true) with simultaneous fit

vecchi · May 14, 2019, 9:21am

Dear experts,
I need to fit 4 data samples simultaneously. Each sample is weighted with signal weights determined in previous, independent fits.
These weights should be considered in the determination of the final uncertainties according to the factors
F = sum_of_weights/sum_of_weights_squared,
(Notice that the background level is different in the 4 samples, so F differs between them; it is in the range 0.6-0.9.)

I made two tests:

1) simPdf->fitTo(combined_data,SumW2Error(true),Minimizer("Minuit2","Migrad")…)
In the log I see the printouts below, which I understand to mean that a rescaling is applied:
[#1] INFO:Fitting – RooAbsPdf::fitTo(simPdf) Calculating sum-of-weights-squared correction matrix for covariance matrix
[#1] INFO:Minization – RooNLLVar::evaluatePartition({Run1;K3pi}) first = 0 last = 5125 Likelihood offset now set to 3386
[#1] INFO:Minization – RooNLLVar::evaluatePartition({Run2;K3pi}) first = 0 last = 10869 Likelihood offset now set to 9398
[#1] INFO:Minization – RooNLLVar::evaluatePartition({Run1;Kpi}) first = 0 last = 3696 Likelihood offset now set to 4372
[#1] INFO:Minization – RooNLLVar::evaluatePartition({Run2;Kpi}) first = 0 last = 11672 Likelihood offset now set to 1.583e+04
2) simPdf->fitTo(combined_data_withRescaledWeights,SumW2Error(false),Minimizer("Minuit2","Migrad")…)
In this case I “manually” scaled the sweights so that sum_of_weights/sum_of_weights_squared = 1 in each sample separately. This way the error recalculation is not needed.

The results of the two fits are not the same: they differ slightly in both the fitted values and the uncertainties, which are larger in 1) than in 2).

Questions:
How is the error rescaling computed in 1)?
I would think that method 2) is the correct one, as each sample properly accounts for its own background level. Do you agree?

Thanks
Stefania

StephanH · May 14, 2019, 3:59pm

Hi Stefania,

See ROOT: RooAbsPdf Class Reference
SumW2Error() is discussed in the table there.

When scaling the weights, I would indeed assume that the fit converges to the same minimum, except for numerical uncertainties when calculating the optimal step size during the minimisation. Are the differences in central values large?

Could it be that the difference in uncertainties arises from the fact that for the covariance matrix, weights are scaled globally for all channels in RooFit, whereas you scale them in each category separately? In this case, you would probably still require a correction to the covariance matrix.

I invite @moneta to help out with this.

moneta · May 14, 2019, 8:06pm

Hi,

If the events have a weight different from one in the likelihood, a correction needs to be applied. I am not sure that procedure 2 is correct. Procedure 1 should give a correct result, taking into account the approximation made by using weights.

Lorenzo

vecchi · May 15, 2019, 9:08am

Dear Lorenzo and Stephan,
thanks for the replies. I read the documentation, but it is still not clear to me how this correction to the errors and covariance matrix is computed in the specific case of a simultaneous fit.

I’ll try to explain my point a bit better.

According to the literature, the correction is C = sum_i (w_i) / sum_i (w_i * w_i), where the sum runs over all the events of the sample and w_i is the signal weight of event i.
In the case of a single sample, I have no problem understanding what RooFit does to recompute the errors when I select SumW2Error(true). I get the same results if I fit the sample with SumW2Error(false) using modified weights w’_i = w_i * C.
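
As a quick check of the single-sample statement, here is a minimal standalone C++ sketch (no ROOT needed). The names WeightSums, correctionFactor and rescaled are made up for illustration and are not RooFit API. It only checks that weights multiplied by C give sum_of_weights/sum_of_weights_squared = 1; it says nothing about what RooFit does internally.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Sum of weights and sum of squared weights for one sample.
struct WeightSums {
    double sumW  = 0.0;
    double sumW2 = 0.0;
};

WeightSums weightSums(const std::vector<double>& w) {
    WeightSums s;
    for (double wi : w) {
        s.sumW  += wi;
        s.sumW2 += wi * wi;
    }
    return s;
}

// C = sum_i w_i / sum_i w_i^2, the factor quoted in the post.
double correctionFactor(const std::vector<double>& w) {
    const WeightSums s = weightSums(w);
    assert(s.sumW2 > 0.0);  // needs at least one non-zero weight
    return s.sumW / s.sumW2;
}

// Modified weights w'_i = C * w_i.
std::vector<double> rescaled(const std::vector<double>& w) {
    const double c = correctionFactor(w);
    std::vector<double> out;
    out.reserve(w.size());
    for (double wi : w) out.push_back(c * wi);
    return out;
}
```

For the toy weights {1.0, 0.5, -0.25, 0.75} one has sumW = 2 and sumW2 = 1.875, so C = 2/1.875, and correctionFactor(rescaled(w)) comes out as 1 up to rounding.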

In the case where I fit several samples (j):

If I perform independent fits and use SumW2Error(true), each one applies its own correction C^j. With this method I have to average the fit parameters to get the final result. Alternatively, I could perform a simultaneous fit to all the samples with modified weights w’^j_i = w^j_i * C^j in each sample and SumW2Error(false). With this trick all samples would have a renormalised factor C’^j = 1. [== procedure 2 in my message]
If I perform a simultaneous fit to all the samples with SumW2Error(true) and weights w^j_i, I suspect that the correction will be C’ = sum_j [sum_i (w^j_i)] / sum_j [sum_i (w^j_i * w^j_i)], which is clearly different from the previous case (when the C^j differ). [== procedure 1 in my message]
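
To make the difference between the two cases concrete, here is a small standalone C++ sketch (plain standard library, hypothetical helper names, not RooFit code). pooledFactor implements the C’ formula as suspected in this post; that formula is an assumption about the global correction and has not been checked against the RooFit source.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Sample = std::vector<double>;  // event weights of one sample j

struct Sums {
    double w  = 0.0;  // sum_i w_i
    double w2 = 0.0;  // sum_i w_i^2
};

Sums sumsOf(const Sample& s) {
    Sums t;
    for (double wi : s) {
        t.w  += wi;
        t.w2 += wi * wi;
    }
    return t;
}

double factor(const Sums& t) {
    assert(t.w2 > 0.0);  // needs at least one non-zero weight
    return t.w / t.w2;
}

// Per-sample factors C^j, each computed from its own sample only.
std::vector<double> perSampleFactors(const std::vector<Sample>& samples) {
    std::vector<double> c;
    for (const Sample& s : samples) c.push_back(factor(sumsOf(s)));
    return c;
}

// Pooled factor C' = sum_j sum_i w / sum_j sum_i w^2 (assumed formula).
double pooledFactor(const std::vector<Sample>& samples) {
    Sums total;
    for (const Sample& s : samples) {
        const Sums t = sumsOf(s);
        total.w  += t.w;
        total.w2 += t.w2;
    }
    return factor(total);
}
```

With two toy samples {1, 1} and {0.5, 0.5, 0.5, 0.5}, the per-sample factors are C^1 = 1 and C^2 = 2, while the pooled factor is 4/3, so the two prescriptions are indeed not equivalent when the C^j differ.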

I hope this clarifies a bit my problem and my question.
I am still convinced that procedure 1 may lead to wrong results in case of simultaneous fits with C^j different in each sample, but you have the final word.

Thanks
Stefania

vecchi · May 28, 2019, 10:07am

Any comment on my last reply?
thanks
Stefania

StephanH · May 29, 2019, 5:32pm

Hi Stefania,

I thought a bit more about this, and I think that only a global rescaling of the error matrix will work. Since a simultaneous fit multiplies the likelihoods of the different categories, events should be weighted with whatever the appropriate weight is, but the weights should not be scaled within each category.
If all events in category 1 need a weight of A, and events in category 2 a weight of B, you need to apply these weights to correctly convey the statistical power of each category. Note that the statistical power of the fit scales with Nevt(A) + Nevt(B) (times appropriate weights), so in terms of statistical uncertainties it looks as if you fit a single category with A+B events (times appropriate weight).

If you do single fits in each category, you will get different (maybe contradictory) fit results in each category, as you pointed out. These are really difficult to combine because there might be correlations, so a simple statistical combination using 1/(post-fit errors)^2 is probably not possible.
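
For reference, the simple combination meant here is presumably the usual inverse-variance weighted average, which is only valid if the per-category results are uncorrelated. The symbols $\hat\theta_j$ and $\sigma_j$ (fit value and post-fit error in category $j$) are introduced just for this illustration:

```latex
\hat\theta_{\mathrm{comb}}
  = \frac{\sum_j \hat\theta_j / \sigma_j^{2}}{\sum_j 1 / \sigma_j^{2}},
\qquad
\sigma_{\mathrm{comb}}^{2}
  = \frac{1}{\sum_j 1 / \sigma_j^{2}}
```

As soon as the categories share parameters or systematic effects, the off-diagonal covariance terms that this formula ignores become relevant.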

So far, event weights were not really relevant in what I wrote. I tried to make clear that the total number of events is relevant for the final errors. Let’s now look at event weights:

If you do a simultaneous fit and rescale globally (method 1), you correctly take into account the statistical power of each category (because each has its own weights), and you also get correct total uncertainties because the covariance matrix is corrected for having sumW (summed over all categories) events.
Note that with this method there is no rescaling per category. The statistical power of each category is just its sum of weights relative to the global sum of weights over all categories. The rescaling happens globally, after the fit.
If you do a simultaneous fit but scale each category with a random factor, you might completely change the number of events that the fit “sees”. This is because you scale the statistical power of each category up or down based on the condition that sumW / sumW^2 == 1. Since this depends on sumW2, the scale factor might be arbitrary, and you will randomly change the relative power of each category, and therefore also the fit result.
Imagine that you have two categories A and B:
A and B both have sumW == 10, so both have equal power in the fit. You rescale A by 0.9 and B by 0.09 (because of what sumW2 happens to be), and suddenly your fit only pays attention to A, even though your weights say that the two should be equally important.
Further, it now looks like you fitted 9.9 events in total instead of 20, because in addition to the event weights you also have the somewhat random scale factors. The correction of the covariance matrix will not scale the errors as if 20 events had been fitted; it will make them bigger, as if there had been only 9.9 events.
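
The arithmetic of this example can be written down as a small standalone C++ sketch. The Category struct and the two helper functions are hypothetical names for illustration only; nothing here is RooFit code.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct Category {
    double sumW;   // sum of event weights before the per-category rescaling
    double scale;  // factor applied to every weight in this category
};

// Total sum of weights that the fit "sees" after per-category rescaling.
double scaledTotal(const std::vector<Category>& cats) {
    double total = 0.0;
    for (const Category& c : cats) total += c.sumW * c.scale;
    return total;
}

// Fraction of the scaled total contributed by category `index`.
double scaledShare(const std::vector<Category>& cats, std::size_t index) {
    assert(index < cats.size());
    return cats[index].sumW * cats[index].scale / scaledTotal(cats);
}
```

With A = {10, 0.9} and B = {10, 0.09}, scaledTotal gives 9 + 0.9 = 9.9 instead of 20, and A carries 9/9.9, i.e. about 91% of the scaled weight, although both categories started with the same sumW.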

Is this what you intend by scaling? Maybe I misunderstand what these signal weights are that you mentioned in your first post. Why do you need them and what do they do?
