RooFit: SumW2Error

carlsonbt · August 8, 2013, 8:56pm

Hi
I am trying to do an unbinned fit of weighted events. I’ve read several forum threads suggesting that there are issues here, so I want to double-check that the fit is doing what I expect. I am using ROOT 5.34.

I first create a RooDataSet with weights between 0.9 and 5, then I perform a fit.

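[code]// Per-event weight as a function of m; `weight` is the weight formula, defined elsewhere (values from 0.9 to 5)
RooFormulaVar wFunc("w","event weight",weight,m);
// Unbinned, unweighted data set imported from the tree
RooDataSet data("data","data",RooArgSet(m,pt,y),Import(*fTree));
// Add the weight as a new column of the data set
RooRealVar *w = (RooRealVar*)data.addColumn(wFunc);
// Weighted data set that interprets column w as the event weight
// (construction assumed here; this is the usual RooFit idiom)
RooDataSet wdata(data.GetName(),data.GetTitle(),&data,*data.get(),0,w->GetName());
RooFitResult *FR(0);
FR=sum.fitTo(wdata,Extended(),SumW2Error(kFALSE),PrintEvalErrors(-1),PrintLevel(-1),Verbose(kFALSE),Save());
// or
FR=sum.fitTo(wdata,Extended(),SumW2Error(kTRUE),PrintEvalErrors(-1),PrintLevel(-1),Verbose(kFALSE),Save());[/code]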

With SumW2Error set to either true or false, the relative uncertainty [sigma_N/N] on my parameter of interest (the extracted signal yield) goes from 6% down to 2.5% as the weight goes from 0.9 to 5. I’m surprised by this, because I would expect the event weights not to change the relative uncertainty.

Say, for the sake of argument, N=270 events. The relative uncertainty should be sqrt(N)/N ≈ 6%. Now suppose we weight every event by a factor of 5: N’=5*N=1350, and the relative uncertainty becomes 2.7%. That suggests to me that the code is actually using the weights when computing the uncertainty, which is precisely what I do not want it to do…
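The arithmetic above can be checked with a few lines of plain C++ (no ROOT needed). This is only a sketch of the counting argument, assuming a constant weight of 5; the function names `poissonRelError` and `printRelErrors` are made up for illustration and are not part of RooFit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdio>

// Relative Poisson uncertainty sqrt(N)/N = 1/sqrt(N) for a count N > 0.
double poissonRelError(double n) {
    assert(n > 0.0);
    return std::sqrt(n) / n;
}

// Print the two cases from the post: the raw count, and the weighted sum
// treated as if it were a raw count (the behaviour that is NOT wanted).
void printRelErrors() {
    const double n = 270.0;  // unweighted number of events
    const double w = 5.0;    // constant event weight (assumed)
    std::printf("N  = %.0f  rel. error = %.2f%%\n", n, 100.0 * poissonRelError(n));
    std::printf("N' = %.0f  rel. error = %.2f%%\n", w * n, 100.0 * poissonRelError(w * n));
}
```

Calling `printRelErrors()` (for example from `main` or as a ROOT macro) gives roughly 6.1% for N=270 and 2.7% for N’=1350, which matches the range seen in the fits.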

Is there something I am doing wrong in setting up the weights?

Ben

moneta · August 15, 2013, 12:59pm

Hi,

Yes, the event weights should not change the relative uncertainty. Can you please send us a running example reproducing the problem?

Lorenzo

carlsonbt · August 16, 2013, 10:22pm

Hi
Attached is a sample toy that shows the issue I have encountered.

It is done for an unbinned sample, but the same issue seems to occur for the binned case.

Please let me know if you come up with a fix. In the meantime, I have found that for data I can do a fit to unweighted events, then a fit to weighted events. I then take the relative error from the unweighted fit and compare it with the relative error from the weighted fit times the average weight used, and the agreement is within 0.5%. Probably good enough for now… but doing fits on weighted events is important in principle.

Ben
roofit_test.c (2.09 KB)

moneta · August 19, 2013, 8:59am

Hi,

Thank you for the example code. I could run it, and I will look into this.

Lorenzo

moneta · August 19, 2013, 9:08am

Hi,
If I set SumW2Error(kTRUE), which is what you should do with weighted data, I get:

weight=1
yield: 1000 +/- 32 rel. error: 3.16%

weight = 4
yield: 4000 +/- 100 rel. error: 2.50%

I think the values make more sense now, so I don’t really see the problem.

Lorenzo

carlsonbt · August 19, 2013, 3:42pm

Hi
I’m not entirely convinced of this. It seems that SumW2Error(kTRUE) gives slightly smaller uncertainties, and the effect flattens out as you increase the weights. I am attaching 2 plots. I have set N_events=100 instead of 1000 to exaggerate the effect. The top plot is what you get for SumW2Error(kFALSE). The bottom plot is what you get using SumW2Error(kTRUE). The top plot actually makes more sense to me, since the fractional uncertainty scales as 1/sqrt(w) (the red line is a fit, which is very good). The bottom plot I don’t fully understand. The weights should have no effect on the uncertainty, or at least the uncertainty should not go down. If you take the ratio of the fractional uncertainty at a weight of 50 to the one with no weights, you get 0.7. This is independent of the number of events.
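For reference, the two scalings discussed here can be written out for the simplest case: a pure counting picture with $N$ events that all carry the same weight $w$. This is an assumed simplification of the extended fit, not a description of what RooFit computes internally, and the labels "naive" and "sumw2" are chosen here only for illustration.

```latex
\begin{align}
N_w &= \sum_{i=1}^{N} w_i = wN \\
\left.\frac{\sigma_{N_w}}{N_w}\right|_{\text{naive}}
  &= \frac{\sqrt{N_w}}{N_w} = \frac{1}{\sqrt{w}}\cdot\frac{1}{\sqrt{N}} \\
\left.\frac{\sigma_{N_w}}{N_w}\right|_{\text{sumw2}}
  &= \frac{\sqrt{\sum_i w_i^2}}{\sum_i w_i} = \frac{w\sqrt{N}}{wN} = \frac{1}{\sqrt{N}}
\end{align}
```

The first form has the 1/sqrt(w) trend of the top plot. The second does not depend on $w$ at all, which is the behaviour one would want from a weighted fit.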

If I am going to weight data and do a fit to it, and use the uncertainties from the fit, the uncertainties should not go down. Even an effect like 0.7 of the fractional uncertainty could make a difference. Say I’m doing a fit and the fit uncertainty should be 25%. That means if I use weighted events, my uncertainty could be as low as 17%!

Ben
roofit_test.c (3.29 KB)

jdevries · March 20, 2014, 1:20pm

I have exactly the same issue. Is there any progress on the usage of SumW2Error?

moneta · March 24, 2014, 5:28pm

I agree that there seems to be a problem when using an extended likelihood fit. If you do a standard, non-extended fit, the error comes out correct for a weighted data set. I will investigate it.

Lorenzo

moneta · March 26, 2014, 8:26pm

Hi,

I have fixed this issue in both the 5.34 patches and the trunk. It now gives the correct result: an error which is independent of the applied weight.
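As a rough cross-check of what "independent of the applied weight" means, here is a minimal sketch in plain C++ using a simple counting picture, where the yield is the sum of weights and its variance is the sum of squared weights. This is an assumption made for illustration and is not the RooFit implementation; `sumW2RelError` and `scaled` are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Relative uncertainty of a weighted yield in a counting picture (assumed):
// yield = sum(w_i), variance = sum(w_i^2).
double sumW2RelError(const std::vector<double>& weights) {
    assert(!weights.empty());
    double sumW = 0.0;
    double sumW2 = 0.0;
    for (double w : weights) {
        sumW += w;
        sumW2 += w * w;
    }
    assert(sumW > 0.0);
    return std::sqrt(sumW2) / sumW;
}

// Return a copy of the weights with every entry multiplied by `scale`.
std::vector<double> scaled(std::vector<double> weights, double scale) {
    for (double& w : weights) {
        w *= scale;
    }
    return weights;
}
```

For 1000 events of weight 1 this gives 1/sqrt(1000), about 3.16%, and multiplying every weight by 4 leaves the relative error unchanged. In this picture a yield of 4000 would carry an error of about 126 rather than 100.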

Thank you for reporting this issue and providing the example.

Lorenzo