A friendlier interface to plot sPlotted data and deal with datasets in general

Hello,

When working with the RooStats::SPlot class, we get a dataset with multiple columns that can be interpreted as weights. For instance, if the model is:

PDF=S(m) nsig + B(m) nbkg

the resulting dataset will have the columns nsig_sw and nbkg_sw added. However, in practice, these columns cannot be read as weights, because weights have to be declared as such when the dataset is built. Also, it seems that you can have only one weight.

In my use case, I need to generate a toy, calculate sWeights, and then plot the weighted dataset on top of S(c) and B(c), where c is a control variable, to make sure that the background (signal) subtraction is truly subtracting the background (signal). Ideally I would do something like:

data.plotOn(frame, ROOT.RooFit.Weight('nsig_sw'))
s_c.plotOn(frame)

However, the dataset first needs to be reformatted with a function like:

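import ROOT

def get_data_weighted(data, obsname, weightname):
    # Variables of the input dataset; their values are updated by data.get(i)
    s_var = data.get()

    # find_var is a user helper (not shown) that looks up a variable by name
    obs = find_var(s_var,    obsname)
    wgt = find_var(s_var, weightname)

    # Keep only the observable and the column to be used as the weight
    s_var_w = ROOT.RooArgSet(obs, wgt)

    dataname = '{}_{}_{}'.format(data.GetName(), obsname, weightname)
    # The weight column has to be declared when the dataset is built
    data_wgt = ROOT.RooDataSet(dataname, '', s_var_w, weightname)

    nentries = data.numEntries()
    for i_entry in range(nentries):
        data.get(i_entry)  # load entry i_entry into obs and wgt
        wgt_val = wgt.getVal()
        data_wgt.add(s_var_w, wgt_val)

    return data_wgt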

So, if I want to treat both columns as weights, I need to make two extra datasets.

There are multiple ways to get around this. The easiest would be to add a RooFit::Weight(const char *) command argument that tells the plotOn function to treat a specific column as a weight.
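To make that request concrete, here is a minimal plain-Python sketch (no ROOT involved) of what "treat a specific column as a weight" could mean when filling a histogram. The function name weighted_histogram, the list-of-dicts row layout, and the toy numbers are all made up for illustration; this is not how plotOn is implemented in RooFit.

```python
def weighted_histogram(rows, obs, weight, nbins, lo, hi):
    """Fill a fixed-width histogram of column `obs`, weighting by column `weight`.

    `rows` is a list of dicts, one per event. Hypothetical helper, not a RooFit API.
    """
    sumw = [0.0] * nbins    # sum of weights per bin (the bin content)
    sumw2 = [0.0] * nbins   # sum of squared weights per bin (for the errors)
    width = (hi - lo) / nbins
    for row in rows:
        x = row[obs]
        if not (lo <= x < hi):
            continue  # ignore underflow and overflow
        # Guard against floating-point rounding at the upper edge
        i = min(int((x - lo) / width), nbins - 1)
        w = row[weight]
        sumw[i] += w
        sumw2[i] += w * w
    return sumw, sumw2


# Toy events: control variable c plus two sWeight-like columns (made-up numbers)
events = [
    {"c": 0.1, "nsig_sw": 1.2, "nbkg_sw": -0.2},
    {"c": 0.4, "nsig_sw": 0.9, "nbkg_sw": 0.1},
    {"c": 0.6, "nsig_sw": -0.1, "nbkg_sw": 1.1},
    {"c": 0.9, "nsig_sw": 0.0, "nbkg_sw": 1.0},
]

# Same events, two different weight columns, no copy of the data
sig_hist, sig_err2 = weighted_histogram(events, "c", "nsig_sw", nbins=2, lo=0.0, hi=1.0)
bkg_hist, bkg_err2 = weighted_histogram(events, "c", "nbkg_sw", nbins=2, lo=0.0, hi=1.0)
```

The point of the sketch is that the weight column is chosen at plotting time, so the same events serve both the signal-weighted and the background-weighted plot.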

However, that would not propagate to things like fitTo, so maybe a better way would be to redesign RooAbsData to allow multiple weights, like:

data.setWeight('a')
plot_or_fit(data)

data.setWeight('b')
plot_or_fit(data)

data.setWeight('a * b')
plot_or_fit(data)

so that the same dataset can be reused. This would be less wasteful and simpler to use.
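The semantics I have in mind could behave roughly like the following plain-Python mock-up. It is only a sketch of the proposal: the class ToyDataset, its set_weight and sum_of_weights methods, and passing several names to stand for the product 'a * b' are assumptions for illustration and do not exist in RooFit.

```python
import math


class ToyDataset:
    """Hypothetical stand-in for a dataset whose active weight can be switched."""

    def __init__(self, columns):
        # columns: dict mapping column name -> list of per-event values
        lengths = {len(values) for values in columns.values()}
        if len(lengths) > 1:
            raise ValueError("all columns must have the same length")
        self._columns = columns
        self._weight_names = ()  # empty tuple means unweighted

    def set_weight(self, *names):
        """Select one or more columns; the event weight is their product."""
        for name in names:
            if name not in self._columns:
                raise KeyError(name)
        self._weight_names = names

    def num_entries(self):
        return len(next(iter(self._columns.values()), []))

    def weight(self, i):
        # math.prod of an empty sequence is 1, i.e. unit weight
        return math.prod(self._columns[n][i] for n in self._weight_names)

    def sum_of_weights(self):
        return sum(self.weight(i) for i in range(self.num_entries()))


# Made-up numbers: one observable and two candidate weight columns
data = ToyDataset({"c": [0.1, 0.6], "a": [2.0, 0.5], "b": [3.0, 4.0]})

unweighted = data.sum_of_weights()   # no weight selected

data.set_weight("a")
sum_a = data.sum_of_weights()

data.set_weight("b")
sum_b = data.sum_of_weights()

data.set_weight("a", "b")            # plays the role of 'a * b'
sum_ab = data.sum_of_weights()
```

Only the name of the active weight changes between calls; the stored columns are never copied, which is where the saving comes from.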

Cheers.

Maybe @moneta can comment on this.

Hi,

I would need to understand the use case better. Are you here at CERN, or could we have a chat on Zoom to discuss this? That would be useful.

Best regards

Lorenzo
