Not observing a difference between weighted data error handling methods in RooFit

bkowal · January 30, 2025, 9:51pm

ROOT Version: 6.30/02
Platform: Ubuntu 20.04.6 LTS
Compiler: gcc 12.3.0

Hello, I have been trying to understand how weighted data are handled in extended likelihood fits in RooFit.

I’ve read a paper describing three ways to compute the fit parameter errors for weighted data: using the weights as they are, applying the SumW2 correction, or using the asymptotically correct method. As I understand it, the choice shouldn’t change the fitted parameter values themselves (all three should just minimize the same scaled Poisson distribution), only the errors on those parameters.
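
For orientation, the three options are often written in the form below. This is only a sketch in notation that is not from the post ($w_i$ for the event weights, $f$ for the fitted PDF, $\theta$ for the parameters, $H$, $H^{(2)}$ and $D$ for the matrices), with the extended term left out for brevity. It follows the usual "sandwich" form of the corrections and is not a statement of what a particular ROOT version computes internally.

```latex
% Weighted negative log-likelihood minimized in all three cases,
% which is why the best-fit values \hat\theta coincide
-\ln L_w(\theta) = -\sum_i w_i \ln f(x_i;\theta)

% Hessians built with the weights and with the squared weights
H_{jk}       = -\sum_i w_i   \,\partial_j \partial_k \ln f(x_i;\theta)\Big|_{\hat\theta}
H^{(2)}_{jk} = -\sum_i w_i^2 \,\partial_j \partial_k \ln f(x_i;\theta)\Big|_{\hat\theta}

% Outer product of first derivatives with squared weights
D_{jk} = \sum_i w_i^2 \,\partial_j \ln f(x_i;\theta)\,\partial_k \ln f(x_i;\theta)\Big|_{\hat\theta}

% (a) weights used as they are, no correction
V_{\text{plain}} = H^{-1}
% (b) SumW2 correction
V_{\text{SumW2}} = H^{-1}\, H^{(2)}\, H^{-1}
% (c) asymptotically correct covariance
V_{\text{asym}}  = H^{-1}\, D\, H^{-1}
```

When all weights equal 1, $H^{(2)} = H$, so (a) and (b) coincide exactly, and (c) agrees with them in the large-sample limit if the model describes the data.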

I tried a simple test in which I generate a RooDataSet of 1000 Gaussian-distributed events and apply a weight to each event. I then bin the dataset, convert it into a RooDataHist, and fit a Crystal Ball shape to it, running the fit separately with each of the three weighted-error options. When I did this, all three fits reported exactly the same output on the terminal (i.e. the same fit parameters and the same uncertainties on those parameters).

The code I’ve used is below:

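  // Headers and namespace needed to run this snippet as a standalone macro
  #include "RooRealVar.h"
  #include "RooDataSet.h"
  #include "RooDataHist.h"
  #include "RooUniform.h"
  #include "RooCBShape.h"
  #include "RooExtendPdf.h"
  #include "RooFitResult.h"
  #include "RooPlot.h"
  #include "TRandom.h"
  using namespace RooFit;

  RooRealVar x("x", "Observable", -3, 3);
  RooRealVar weight("weight", "Event Weight", 0, 100);  // Define weight variable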

  RooDataSet data("data", "Weighted dataset", RooArgSet(x, weight), WeightVar(weight));

  // Fill dataset with events
  for (int i = 0; i < 1000; i++) {
    x.setVal(gRandom->Gaus(0, 1));
    double weightval = gRandom->Uniform(.5, 2.);  // Random weight between 0.5 and 2.0
    weight.setVal(weightval);
    data.add(x, weightval);  // Add weighted entry
  }
  x.setBins(50);
  RooDataHist dataHist("dataHist", "dataHist", RooArgSet(x), data);

  RooRealVar N("N", "N", 1000, 100, 5000);

  RooUniform b("b", "b", x);

  RooRealVar mu("mu", "mu", 0.01, -2, 2);
  RooRealVar sig("sig", "sig", 1, .1, 3);
  RooRealVar n("n", "n", 1, .01, 10);
  RooRealVar a("a", "a", 1, .01, 10);

  RooCBShape g("g", "g", x, mu, sig, a, n);

  RooExtendPdf extendedsig("extendedsig", "extendedsig", g, N);

  // Option 1: asymptotically correct errors
  RooFitResult *result = extendedsig.fitTo(dataHist, Strategy(2), Verbose(false), PrintLevel(0), Hesse(true), Save(true), AsymptoticError(true), SumW2Error(false));
  // Option 2: SumW2-corrected errors
  //  RooFitResult *result = extendedsig.fitTo(dataHist, Strategy(2), Verbose(false), PrintLevel(0), Hesse(true), Save(true), SumW2Error(true), AsymptoticError(false));
  // Option 3: no correction, weights used as they are
  //  RooFitResult *result = extendedsig.fitTo(dataHist, Strategy(2), Verbose(false), PrintLevel(0), Hesse(true), Save(true), SumW2Error(false), AsymptoticError(false));
  RooPlot *frame = x.frame();
  dataHist.plotOn(frame, DataError(RooAbsData::SumW2));
  extendedsig.plotOn(frame);
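
To get a feeling for how large a difference between the error options one might expect on the yield N, it can help to look at the weights alone. The sketch below is plain C++ without ROOT; `WeightSummary`, `summarizeWeights` and `drawUniformWeights` are hypothetical names introduced for illustration. It relies on two standard results for an extended weighted fit: the fitted yield is the sum of weights, and the variance of that sum is estimated by the sum of squared weights, whereas an uncorrected weighted likelihood reports a variance equal to the sum of weights.

```cpp
#include <cmath>
#include <random>
#include <vector>

// Summary statistics of a set of event weights.
struct WeightSummary {
  double sumW;   // sum of weights: the yield an extended weighted fit estimates
  double sumW2;  // sum of squared weights: estimated variance of that yield
  double nEff;   // effective number of unweighted events, (sum w)^2 / sum w^2
};

// Accumulate sum(w), sum(w^2) and the effective sample size.
WeightSummary summarizeWeights(const std::vector<double>& weights) {
  double sumW = 0.0;
  double sumW2 = 0.0;
  for (double w : weights) {
    sumW += w;
    sumW2 += w * w;
  }
  const double nEff = sumW2 > 0.0 ? sumW * sumW / sumW2 : 0.0;
  return {sumW, sumW2, nEff};
}

// Hypothetical helper: draw n weights uniformly in [lo, hi),
// mimicking the gRandom->Uniform(.5, 2.) call in the macro above.
std::vector<double> drawUniformWeights(int n, double lo, double hi, unsigned seed) {
  std::mt19937 gen(seed);
  std::uniform_real_distribution<double> dist(lo, hi);
  std::vector<double> weights(n);
  for (double& w : weights) {
    w = dist(gen);
  }
  return weights;
}
```

For 1000 weights drawn uniformly between 0.5 and 2.0, the expectation values are about 1250 for the sum of weights and about 1750 for the sum of squared weights. An uncorrected error on N would then be around sqrt(1250) ≈ 35, and a corrected one around sqrt(1750) ≈ 42. The shape parameters are a separate question that this toy does not address.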

I originally did this with a Gaussian fit, but it was suggested that I use a Crystal Ball shape to make any differences between the methods more apparent. I also repeated these fits with the x.setBins(50) line commented out, fitting to data rather than dataHist, to compare binned and unbinned datasets.

When I set Verbose(true) and PrintLevel(2) and looked at the results, I noticed that SumW2Error(true) gave different parameter errors from both AsymptoticError(true) and SumW2Error(false), while AsymptoticError(true) and SumW2Error(false) gave identical parameter errors. This happened for both the binned and the unbinned fits.
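
As a cross-check of what kind of differences to expect in principle, the three error estimates can be written out by hand for a much simpler problem than the fit above: the weighted maximum-likelihood estimate of the mean of a Gaussian with known width. The sketch below is plain C++ without ROOT and does not reproduce RooFit's implementation; `MeanFitErrors` and `weightedMeanErrors` are hypothetical names, and the known-width Gaussian is an assumption chosen so that every quantity has a closed form.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Best-fit mean and its uncertainty under three error prescriptions.
struct MeanFitErrors {
  double muHat;           // weighted maximum-likelihood estimate of the mean
  double errUncorrected;  // sqrt(1/H): weights used as they are
  double errSumW2;        // sqrt(H2/H^2): SumW2-style correction
  double errAsymptotic;   // sqrt(D/H^2): sandwich (asymptotic) estimate
};

// Toy model: x_i ~ Gaussian(mu, sigma) with sigma known, event weights w_i.
MeanFitErrors weightedMeanErrors(const std::vector<double>& x,
                                 const std::vector<double>& w,
                                 double sigma) {
  if (x.size() != w.size() || x.empty()) {
    throw std::invalid_argument("x and w must be non-empty and of equal size");
  }
  double sumW = 0.0, sumW2 = 0.0, sumWX = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sumW += w[i];
    sumW2 += w[i] * w[i];
    sumWX += w[i] * x[i];
  }
  const double muHat = sumWX / sumW;
  const double s2 = sigma * sigma;

  const double H = sumW / s2;    // second derivative of the weighted NLL
  const double H2 = sumW2 / s2;  // same, with squared weights

  // Sum of squared weights times squared first derivatives of ln f
  double D = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double g = (x[i] - muHat) / s2;
    D += w[i] * w[i] * g * g;
  }

  return {muHat, std::sqrt(1.0 / H), std::sqrt(H2 / (H * H)),
          std::sqrt(D / (H * H))};
}
```

In this toy, multiplying every weight by 2 leaves the fitted mean and both corrected errors unchanged, but shrinks the uncorrected error by a factor sqrt(2). A pure rescaling of the weights is therefore the simplest case in which the uncorrected result should differ from the other two.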

My two questions are:

1. Why were the different error calculations not apparent in the fit results printed on the terminal at the lower verbosity?
2. Why are there no differences between SumW2Error(false) and AsymptoticError(true)?

I’d appreciate any help with this!

Respectfully,
Becky

silverweed · January 31, 2025, 9:10am

Hello,
@jonas should be able to answer your questions.

bkowal · February 10, 2025, 3:26pm

Thank you! Is there any update on this topic? Is there any intuition I could apply to judge whether I should expect AsymptoticError(true) and SumW2Error(false) to give the same errors? Thanks again!

Respectfully,
Becky

system · February 24, 2025, 3:26pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.