Fit to a weighted, unbinned data set

I want to do a fit to a weighted data set, and I’d like clarification on something that doesn’t quite make sense to me.

It seems I have to both declare a weight variable (and assign its values) and pass the value of the weight to add():

import ROOT
x = ROOT.RooRealVar('x', 'x', 0, 10)
w = ROOT.RooRealVar('w', 'w', 0, 1)
obs = ROOT.RooArgSet(x, w)
data = ROOT.RooDataSet('data', 'data', obs, ROOT.RooFit.WeightVar(w))
for i in range(1, 101):
    x.setVal(float(i) / 10)
    w.setVal(float(i) / 100)
    data.add(obs, w.getVal())

In this case, data.Print() gives

RooDataSet::data[x,weight:w] = 100 entries (50.5 weighted)
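
For reference, the weighted count of 50.5 reported above is what one would expect if it is simply the sum of the per-event weights i/100 for i = 1..100. A minimal check in plain Python (no ROOT needed; the variable names are only illustrative):

```python
# Per-event weights used in the loop above: w_i = i / 100 for i = 1..100
weights = [i / 100 for i in range(1, 101)]

n_entries = len(weights)       # unweighted number of entries
sum_of_weights = sum(weights)  # what "(... weighted)" should correspond to

print(n_entries, round(sum_of_weights, 6))
```

With unit weights the same sum would be 100, which matches the "100 weighted" case described further down.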

It seems like I should only have to do one or the other of these. Why do I have to tell RooFit the value of the weight twice (both in the value of w and again when I add it)?

Elaboration:

In rf403_weightedevts.C, it looks like it is enough to add a variable to a RooDataSet and declare that it is a weight, e.g.:

obs = ROOT.RooArgSet(x, w)
data = ROOT.RooDataSet('data', 'data', obs, ROOT.RooFit.WeightVar(w))
for i in range(1, 101):
    x.setVal(float(i) / 10)
    w.setVal(float(i) / 100)
    data.add(obs)

but after I do this, each entry seems to have a weight of 1. data.Print() gives

RooDataSet::data[x,weight:w] = 100 entries (100 weighted)

If I specify a weight value instead of a variable, however, no weight seems to be assigned at all:

obs = ROOT.RooArgSet(x)
data = ROOT.RooDataSet('data', 'data', obs)
for i in range(1, 101):
    x.setVal(float(i) / 10)
    data.add(obs, float(i) / 100)

and data.Print() gives

RooDataSet::data[x] = 100 entries

So it looks like I have to tell RooFit the value twice: once in the value of w and again in the call to add.

Hello @mwilkins,

It depends on where the data come from.

1. Data being written into memory

If you fill the dataset manually, you have to store the value of the weight for each event yourself. That’s why data.add(..., <weight>) is necessary.
Further, you have to tell RooFit that you want a weighted dataset, hence RooFit.WeightVar(w).

The line
w.setVal(...)
is ignored, because data.add(..., weight) overwrites whatever value was assigned to w before.
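
A minimal sketch of this behaviour, assuming a working PyROOT installation: w is deliberately set to a placeholder value before each add, and the sum of weights should still reflect only the value passed to add. The function name fill_weighted and the placeholder value are illustrative; the import guard is there only so that the snippet does not fail where ROOT is not installed.

```python
try:
    import ROOT  # PyROOT with RooFit; assumed to be available
except ImportError:
    ROOT = None  # allows the snippet to load without ROOT


def fill_weighted(placeholder_w=0.5):
    """Return (entries, sum of weights), or None if ROOT is unavailable."""
    if ROOT is None:
        return None
    x = ROOT.RooRealVar('x', 'x', 0, 10)
    w = ROOT.RooRealVar('w', 'w', 0, 1)
    obs = ROOT.RooArgSet(x, w)
    # Flag w as the weight variable of the dataset
    data = ROOT.RooDataSet('data', 'data', obs, ROOT.RooFit.WeightVar(w))
    for i in range(1, 101):
        x.setVal(i / 10)
        w.setVal(placeholder_w)   # deliberately NOT the real weight
        data.add(obs, i / 100)    # the weight passed here is the one stored
    return data.numEntries(), data.sumEntries()


result = fill_weighted()
print(result)
```

If the placeholder were used, the sum of weights would be 100 times the placeholder; a sum equal to that of i/100 over the loop indicates that only the argument to add matters.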

2. Data already in memory
In rf403, a dataset is created, and the values of each variable (also the weight) are written into memory as normal variables using <variable>.setVal(...). At this stage, there’s no notion of a weighted fit.
Then, a new dataset wdata is created, where w is reinterpreted as weight.
That’s why no add(<variables>, weight) is necessary.
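
The two-step pattern can be sketched in PyROOT as follows. This is an illustration rather than the tutorial's code: it assumes a recent ROOT release in which a dataset can be built from an existing one with the RooFit.Import and RooFit.WeightVar command arguments (the tutorial itself may use a different constructor overload, depending on the ROOT version). The names plain, wdata and reinterpret_as_weight are hypothetical.

```python
try:
    import ROOT  # PyROOT with RooFit; assumed to be available
except ImportError:
    ROOT = None  # allows the snippet to load without ROOT


def reinterpret_as_weight():
    """Return (entries, sum of weights), or None if ROOT is unavailable."""
    if ROOT is None:
        return None
    x = ROOT.RooRealVar('x', 'x', 0, 10)
    w = ROOT.RooRealVar('w', 'w', 0, 1)
    obs = ROOT.RooArgSet(x, w)

    # Step 1: fill an ordinary dataset; w is just another column here
    plain = ROOT.RooDataSet('plain', 'plain', obs)
    for i in range(1, 101):
        x.setVal(i / 10)
        w.setVal(i / 100)
        plain.add(obs)            # no weight argument

    # Step 2: build a new dataset in which column w is read as the weight
    wdata = ROOT.RooDataSet('wdata', 'wdata', plain.get(),
                            ROOT.RooFit.Import(plain),
                            ROOT.RooFit.WeightVar('w'))
    return wdata.numEntries(), wdata.sumEntries()


result = reinterpret_as_weight()
print(result)
```

Here w.setVal does matter, because in step 1 the value of w is stored like any other observable and only later reinterpreted.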

As you can see, both steps are needed in either case: flagging the variable as a weight and setting its value for each event. Only the way these two steps are carried out differs.

Your example
In your last example, the weight is ignored because RooFit.WeightVar() is missing in the constructor for the dataset.

I will add a check for the next version of ROOT. If you try to set a weight without registering a weight variable, RooFit will complain in the future.

Thank you for your helpful clarification. I also suggest improving the Doxygen documentation for the relevant functions.
