Unbinned Fit using weighted events in RooFit

Hello,

I’m trying to do an unbinned ML fit using RooFit. The plot labelled as unweighted shows the fit for unweighted events and seems to be quite okay to me.
Plots weighted1 and weighted2 show the fits after reweighting the events (weights in the range [0.00807, 2.5482652] for weighted1 and [7.7498647e-09, 5.9192273e-05] for weighted2).

I used the option ‘SumW2Error(kTRUE)’ when fitting the weighted events. As the plots show, the uncertainties on the fit parameters become very large when the weights are very small (weighted2).

I was expecting the uncertainties to scale with the actual number of initial unweighted events when the ‘SumW2Error(kTRUE)’ option is used. I have seen that this issue was reported in earlier threads, but I couldn’t find a solution or conclusion.

For my analysis I really need to do the fit after the proper reweighting used in weighted2, and I am not sure how reliable the result is with such large uncertainties on the fitted parameters.

Any comment or suggestion is very welcome.
Thanks in advance.

Welcome to the ROOT forum,
I think @moneta can help you.



Hi @sweta!

I have written a little script with fits of a RooCrystalBall to weighted toy datasets, trying to reproduce your problem:
weighted_dataset_fits.py (2.2 KB)

However, I don’t observe these crazy large errors for the dataset with the small weights. Are you sure that SumW2Error is enabled? Which ROOT version are you using?

Maybe the approximations of the SumW2Error are not appropriate for your dataset and model. Have you tried out the AsymptoticError correction that is documented in RooAbsPdf::fitTo()?
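To illustrate why the overall scale of the weights should drop out once the correction is applied, here is a minimal sketch in plain Python (no ROOT). It assumes the simplest possible model, fitting only the mean of a Gaussian with known width, and it assumes that the SumW2Error correction combines the covariance matrix V of the weighted fit with the matrix C built from the squared weights as V C^-1 V. The function name and the toy weights are made up for illustration and are not taken from your script.

```python
import random


def mean_variances(weights, sigma):
    """Variance estimates for the fitted mean of a Gaussian with known width.

    For this model the second derivative of the weighted NLL with respect
    to the mean is sum(w) / sigma^2, independent of the data values.
    """
    h_w = sum(weights) / sigma ** 2                  # Hessian with weights w
    h_w2 = sum(w * w for w in weights) / sigma ** 2  # Hessian with weights w^2
    naive = 1.0 / h_w                                # uncorrected variance V
    corrected = h_w2 / h_w ** 2                      # V * C^-1 * V in one dimension
    return naive, corrected


random.seed(1)
sigma = 2.0
weights = [random.uniform(0.5, 1.5) for _ in range(1000)]
tiny_weights = [w * 1e-6 for w in weights]           # same shape, tiny overall scale

naive_a, corr_a = mean_variances(weights, sigma)
naive_b, corr_b = mean_variances(tiny_weights, sigma)

# Effective number of events; unchanged if all weights are scaled together.
n_eff = sum(weights) ** 2 / sum(w * w for w in weights)

print("naive variance:    ", naive_a, naive_b)
print("corrected variance:", corr_a, corr_b)
print("sigma^2 / n_eff:   ", sigma ** 2 / n_eff)
```

In this toy case the uncorrected variance grows by the inverse of the scale factor, while the corrected one stays at sigma^2 / n_eff. A real fit with several correlated parameters is more involved, so take this only as a sanity check of the expected scaling.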

In any case, it would also be great if you could provide the code to reproduce the results in plots that you are showing in the initial post. We can better help to find the problem then.

I hope this helps a bit!
Jonas

Hi @jonas

Thanks a lot for the reply. The ROOT version I used for the plots in my first post was 6.06, with the ‘RooDoubleCrystalBall’ pdf defined in the Higgs Combine tool package, which I need for producing the final statistical results. Yes, SumW2Error was enabled.

After seeing that you use ‘RooCrystalBall’ for the double-sided Crystal Ball model, I tried running my script in ROOT 6.26 (as far as I could find out, this only works for ROOT versions > 6.24) and yes, the large uncertainties vanish (attached plot weighted2_)!

  1. So I suppose this is indeed a ROOT version issue?
    Compared with the earlier result (weighted2), the final parameter values are exactly the same but the uncertainties shrink drastically. Can you give an idea why?

  2. Also, I really need to work with ROOT 6.06 or 6.12 (to be able to use the Combine tool package), so is there any trick I could use? I was thinking of one, though I am not sure how legitimate it is: if I multiply each event’s weight by the same constant factor so that the minimum weight is at least 0.01, this should not change the shape of the distribution and should give reasonable uncertainties, right? At the end I would divide the fitted normalisation by that constant factor to get the true normalisation.
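The rescaling idea in point 2 can be sketched in plain Python as follows. This is only a toy check, with made-up values and a hypothetical helper name. It shows that the weighted shape (mean and width) is unchanged by a common factor and that the normalisation is recovered by dividing the factor back out. It does not test whether an old ROOT version then returns sensible parameter uncertainties.

```python
import math
import random


def weighted_summary(values, weights):
    """Return (sum of weights, weighted mean, weighted standard deviation)."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(values, weights)) / total
    var = sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / total
    return total, mean, math.sqrt(var)


random.seed(7)
# Toy observable and toy weights in roughly the range quoted for weighted2.
values = [random.gauss(125.0, 2.0) for _ in range(500)]
weights = [random.uniform(7.7e-9, 5.9e-5) for _ in values]

# One common factor that brings the smallest weight up to about 0.01.
scale = 0.01 / min(weights)
scaled_weights = [w * scale for w in weights]

total, mean, width = weighted_summary(values, weights)
total_s, mean_s, width_s = weighted_summary(values, scaled_weights)

# Undo the factor to get back the original normalisation.
recovered_total = total_s / scale

print("mean: ", mean, mean_s)
print("width:", width, width_s)
print("norm: ", total, recovered_total)
```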

I am also attaching the code I use (a Jupyter notebook downloaded as a Python file) for your reference.
Thanks again :slight_smile:

weighted2_
Fit_with_weights.py (1.7 KB)

Hi!

Yes, probably it’s related to a bug that got fixed in the meantime.

I don’t think it’s worth it to go back and try to make it work with a ROOT version that is more than 5 years old. Why don’t you just use combine with 6.24? The current 112x development branch of combine is ROOT 6.24 compatible:

As far as I know, CMS wants to make that the recommended release for analysis very soon anyway. And you have a very good reason to use it, if the old releases have a problem with the weight corrections :slight_smile:

Hi @jonas,
Yes, I completely agree with you. Thanks for the info about the 112x branch of combine. I could successfully compile the HiggsAnalysis-CombinedLimit package, but the CombineHarvester package is showing some compatibility issues. The repo mentions that it is not tested or validated yet, so I understand that. Hopefully it will work soon. For now the HiggsAnalysis-CombinedLimit package will be sufficient for me.

Thanks again for all the help.
Sweta