Extracting histograms from HistFactory generated RooWorkspace

I have a RooWorkspace generated with HistFactory, with a background histogram and a signal histogram, and a number of nuisance parameters on each of these.

How can I extract the morphed versions of each of these two histograms as TH1s, with the nuisance parameters set to specified values (i.e. both the fitted values, and nominal values)?

Most of this should work with functions from ModelConfig and RooWorkspace.

The code would look something like:

auto mc = dynamic_cast<RooStats::ModelConfig*>(ws->obj("ModelConfig"));

// Print the main pdf as tree:
mc->GetPdf()->Print("T");

// Use one observable from
auto obs = mc->GetObservables();
obs->Print();

// Retrieve a pdf and its observable by name
auto channelPdf = dynamic_cast<RooRealSumPdf*>(ws->pdf("channel1_model"));
auto obs = ws->var("observable");
// or
for (auto obs : *observables) {
...
}

// Plotting:
auto frame = obs->frame();
data->plotOn(frame);
channelPdf->plotOn(frame);
channelPdf->plotOn(frame, RooFit::Components("signal_channel1_shapes"), RooFit::LineColor(kRed));

You can set the nuisance parameters to different values, and plot again. You get them either with ws->var("..") or from the ModelConfig.

Alternatively, you can plot the simultaneous PDF (the top node of the expression tree) directly. For this, you need to project out single components as in
https://root.cern.ch/doc/master/rf501__simultaneouspdf_8C.html

That’s helped a little. I’m seeing something strange though: the first bin in the data plot is very large, orders of magnitude larger than it should be.

I turned the RooAbsData into a histogram with createHistogram and compared to the histogram in the file the Measurement object was saved into. All other bins are correct, but that first bin is, for example, 34571 when it should be 3218.38.

This of course throws off everything else when plotting.

That sounds like a bug. :slight_smile:

Could you check

data->get(0)->Print("V"); data->weight();
data->get(1)->Print("V"); data->weight();
... 

to find out whether the bins are wrong or whether createHistogram messes up the counts?
It might move the over- and underflow into the first and last bin.

I figured out what was happening. I didn’t cut on channelCat, and because each channel has it’s own observable, when one observable is plotted, the events from the other channels all end up in one bin. I’m not sure why this one bin isn’t an over / underflow bin.

root [4] d->get(0)->Print("V"); d->weight();
  1) RooCategory::             channelCat = resolved_4b_2015(idx = 0)

  2) RooRealVar:: obs_x_resolved_4b_2015 = 266.667
  3) RooRealVar:: obs_x_resolved_4b_2016 = 266.667
  4) RooRealVar:: obs_x_resolved_4b_2017 = 266.667
  5) RooRealVar:: obs_x_resolved_4b_2018 = 266.667
root [5] d->weight()
(double) 469.00000
root [6] d->get(1)->Print("V");
  1) RooCategory::             channelCat = resolved_4b_2015(idx = 0)

  2) RooRealVar:: obs_x_resolved_4b_2015 = 300
  3) RooRealVar:: obs_x_resolved_4b_2016 = 266.667
  4) RooRealVar:: obs_x_resolved_4b_2017 = 266.667
  5) RooRealVar:: obs_x_resolved_4b_2018 = 266.667

Oh, that’s bad. They probably end up in the bin that was active last, so it depends on the order in which things are saved in the dataset.

What we see in the printouts are the bin centres. When loading the next event, the observables that don’t belong to the respective category don’t get updated. That’s confusing, but it won’t harm the fit. For plotting, however, it’s not nice. Which command did you use to plot? Maybe some safety can be added to automatically add this kind of cut.

If you want to verify the bin counts further, we would also need the weight after d->get(1);.

What I’m doing now is applying the cut, so only the events from each category get plotted against that observable.

I’m only using this for diagnostic plots, so it’s not so bad. I’d still like to know how to pull the plotted TH1s or TGraphs out so I can make the final plots separately.

RooPlot::findObject() and RooPlot::Print() should help here.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.