Weighted data set

Hi experts,
I have a couple of questions about the handling of weighted events in RooFit. My goal is to create a PDF from a TTree in which one of the leaves contains the event weights. In ROOT, I could simply fill a histogram, accumulate the weights, and call SetBinError at the end.

Q1) If I make such a histogram and set a binned RooDataSet equal to it, will the bin errors in RooFit reflect the weights I assigned by hand?

Q2) Is there a way of using the KEYS kernel to smooth an already binned dataset? The ROOT function “Smooth” for histograms seems a likely candidate, but it doesn’t work for high dimensions, like THnSparse, and I don’t know what the algorithm is, so I can’t be sure.

Q3) Putting aside the smoothing: if I set the bin errors by hand, will that information be accounted for in a RooFit minimization? That is, will the uncertainty returned after the fit use the bin errors that I set by hand, or default to ~1/sqrt(N)?

My heaven is:

  1. Grab a tree from a ROOT file and make a histogram, setting the bin errors by hand so they have the correct uncertainty. (I know how to do this.)
  2. Smooth this according to some adaptive algorithm, i.e. one that smooths bins less where the local density is high.
  3. Create a RooFit PDF from this smoothed histogram, with the uncertainty from the previous steps retained.
  4. Fit my data to this beautiful PDF.
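
To make step 2 concrete, here is a minimal sketch of what "adaptive" smoothing can mean for weighted events: a fixed-width pilot estimate first, then a per-event kernel width that shrinks where the pilot density is high. This is a generic adaptive Gaussian kernel estimate with a square-root bandwidth law, not necessarily what the KEYS implementation in RooFit does. The names `Event`, `pilotDensity`, `localBandwidths` and `adaptiveDensity` are made up for the illustration, and the sketch assumes strictly positive weights and a non-empty event list.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// One weighted event (hypothetical type for this sketch).
struct Event { double x; double w; };

// Normalized Gaussian kernel of width h evaluated at distance d.
double gaussKernel(double d, double h) {
    const double kInvSqrt2Pi = 0.3989422804014327;
    const double u = d / h;
    return kInvSqrt2Pi / h * std::exp(-0.5 * u * u);
}

// Fixed-bandwidth weighted estimate, used only as the pilot density.
double pilotDensity(const std::vector<Event>& ev, double x, double h0) {
    double sumw = 0.0, f = 0.0;
    for (const Event& e : ev) {
        sumw += e.w;
        f += e.w * gaussKernel(x - e.x, h0);
    }
    return f / sumw;
}

// Per-event bandwidths h_i = h0 * sqrt(g / pilot(x_i)), where g is the
// weighted geometric mean of the pilot values. Dense regions get narrower
// kernels, sparse regions get wider ones. Assumes all weights are > 0.
std::vector<double> localBandwidths(const std::vector<Event>& ev, double h0) {
    std::vector<double> pilot(ev.size());
    double sumw = 0.0, logSum = 0.0;
    for (std::size_t i = 0; i < ev.size(); ++i) {
        pilot[i] = pilotDensity(ev, ev[i].x, h0);
        sumw += ev[i].w;
        logSum += ev[i].w * std::log(pilot[i]);
    }
    const double g = std::exp(logSum / sumw);
    std::vector<double> h(ev.size());
    for (std::size_t i = 0; i < ev.size(); ++i) {
        h[i] = h0 * std::sqrt(g / pilot[i]);
    }
    return h;
}

// Adaptive estimate: each event contributes a kernel of its own width,
// scaled by its weight; the result integrates to one.
double adaptiveDensity(const std::vector<Event>& ev,
                       const std::vector<double>& h, double x) {
    double sumw = 0.0, f = 0.0;
    for (std::size_t i = 0; i < ev.size(); ++i) {
        sumw += ev[i].w;
        f += ev[i].w * gaussKernel(x - ev[i].x, h[i]);
    }
    return f / sumw;
}
```

Note that this works on unbinned weighted events; to apply it to an existing histogram one would first have to turn the bins into weighted points.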

Perhaps my heaven is unattainable. But I must try.

Many Thanks!


A1) When making a RooDataHist from a ROOT histogram, the bin errors are computed correctly from the weights of the histogram, provided TH1::Sumw2 has been enabled on it.
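
For reference, the bookkeeping behind this is simple: alongside the sum of weights in each bin, the histogram keeps the sum of squared weights, and the bin error is the square root of the latter. The following is a plain C++ sketch of that idea, not ROOT code; `WeightedHist` is a hypothetical stand-in for a 1D histogram with uniform bins.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal 1D weighted histogram (hypothetical, for illustration only).
// Per bin it tracks the sum of weights and the sum of squared weights.
struct WeightedHist {
    double lo, hi;
    std::vector<double> sumw;   // bin contents: sum of weights
    std::vector<double> sumw2;  // sum of squared weights per bin

    WeightedHist(std::size_t nbins, double lo_, double hi_)
        : lo(lo_), hi(hi_), sumw(nbins, 0.0), sumw2(nbins, 0.0) {}

    // Add one event with weight w; events outside [lo, hi) are ignored.
    void fill(double x, double w) {
        if (x < lo || x >= hi) return;
        std::size_t i =
            static_cast<std::size_t>((x - lo) / (hi - lo) * sumw.size());
        if (i >= sumw.size()) i = sumw.size() - 1;  // guard against rounding
        sumw[i] += w;
        sumw2[i] += w * w;
    }

    double content(std::size_t i) const { return sumw[i]; }

    // Bin error for weighted entries: sqrt of the sum of squared weights.
    double error(std::size_t i) const { return std::sqrt(sumw2[i]); }
};
```

For unit weights this reduces to the familiar sqrt(N) error; for a bin holding weights 3 and 4 the content is 7 and the error is 5.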

A2) You can build a keys PDF in one or several dimensions, and it can use the event weights. However, a keys PDF has to be made from an unbinned data set (a RooDataSet, not a RooDataHist).
If you don’t have access to the original data points, you could also interpret a histogram as a weighted RooDataSet, where each data point is the center of a bin and carries that bin's content as its weight.
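
The histogram-to-weighted-points idea can be sketched in plain C++ as follows. This only shows the conversion logic for a 1D histogram with uniform bins; `WeightedPoint` and `binsToPoints` are hypothetical names, and filling an actual RooDataSet from the resulting points is left out.

```cpp
#include <cstddef>
#include <vector>

// One pseudo-event: a position and a weight (hypothetical type).
struct WeightedPoint { double x; double w; };

// Turn a uniformly binned 1D histogram on [lo, hi) into weighted points:
// one point per non-empty bin, placed at the bin center, with the bin
// content as its weight.
std::vector<WeightedPoint> binsToPoints(const std::vector<double>& contents,
                                        double lo, double hi) {
    std::vector<WeightedPoint> pts;
    if (contents.empty()) return pts;
    const double width = (hi - lo) / contents.size();
    for (std::size_t i = 0; i < contents.size(); ++i) {
        if (contents[i] == 0.0) continue;  // empty bins carry no information
        pts.push_back({lo + (i + 0.5) * width, contents[i]});
    }
    return pts;
}
```

Keep in mind that this throws away the position of events inside each bin, so the kernel estimate cannot resolve structure finer than the bin width.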

A3) When performing a fit, the weights are handled correctly if the fit option RooFit::SumW2Error(true) is used.


Great. To make a keys PDF with weighted events, does it suffice to specify the variable holding the weights when declaring the RooDataSet?

In other words, if I create a data set by
RDS = new RooDataSet("myData", "my data", data, vars, "", "eventWeights");
then make a KeysPdf from this, the eventWeights will automagically be included?

W.r.t. correct handling of weights in the fit, I ask because I know that the maximum likelihood fit currently doesn’t support fully correct handling of weights in returning the uncertainty. But perhaps if the weighted events are only in the PDF then it will be ok–is this true?

In other words, if there are no weighted events in the data to be fitted, but only in the PDF, and I use RooFit::SumW2Error(true), will a likelihood fit return the correct uncertainty?

Many thanks!