Problems with extracting a histogram from the fit model obtained with histFactory/Roofit

Veronica_Solund_Kirs · September 30, 2022, 5:22pm

Dear all,

I am performing a 2D fit in histFactory, and I would like to obtain the pulls in 2D to inspect the quality of the fit bin-by-bin.

So far I have not found a build-in function for this purpose, so I wanted to compute the 2D pulls myself. However, my problem is that I do not manage to create a histogram from the fitting model.

The fitting model is RooSimultanous, created like this:

RooStats::ModelConfig* modelConf = (RooStats::ModelConfig*)workSpace->obj("ModelConfig");
RooSimultaneous* model = (RooSimultaneous*)modelConf->GetPdf();

where the two variables are extracted as RooRealVar :

RooRealVar* BMCorrSig = (RooRealVar*)obs->find("obs_x_BMCorr");

RooRealVar* PIPIMSig = (RooRealVar*)obs->find("obs_y_BMCorr");

I had a look at the function “generate()” trying to do something like this:

RooDataSet* data_model = model->generate(RooArgSet(*BMCorrSig,*PIPIMSig,model>indexCat()),Name("test"),AllBinned(),NumEvents(100000));

TFile outputFile("data_model", "RECREATE");
data_model->convertToTreeStore();

but the root file is empty, so I was wondering if anyone would know how to actually do this correctly?

Thanks,

Cheers,

Veronica

jonas · October 1, 2022, 8:46pm

Hi @Veronica_Solund_Kirs,

I’m surprised that you’re saying “HistFactory” and “2D fit” in one sentence, because as far as I know, HistFactory only supports 1D fits in multiple channels. Maybe that’s what you mean?

Anyway, for HistFactory you always have one RooRealSumPdf for each observable, which you can plot together with the data and get a pull histogram like in this RooFit tutorial.

You just got to Print the workspace and figure out what the names of the RooRealSumPdfs are. Here some code snippet that you could append to the hf001 example to get the pull distribution for that case:

  // Now, do the measurement
  auto ws = MakeModelAndMeasurementFast( meas );

  ws->Print();
  ws->arg("channelCat")->Print();

  RooRealVar& x = *ws->var("obs_x_channel1");

  auto frame1 = x.frame(RooFit::Title("Model and measurement"));
  auto frame2 = x.frame(RooFit::Title("Pulls"));

  // Plot only the data in that observable's channel
  ws->data("obsData")->plotOn(frame1, RooFit::Cut("channelCat==channelCat::channel1"));
  ws->pdf("channel1_model")->plotOn(frame1); // and this is the RooRealSumPdf

  RooHist *hpull = frame1->pullHist();
  frame2->addPlotable(hpull, "P");

  TCanvas *c1 = new TCanvas("c1", "c1", 600, 300);
  c1->Divide(3);

  c1->cd(1);
  frame1->Draw();

  c1->cd(2);
  frame2->Draw();

  c1->SaveAs("plot.png");

  delete ws;

The plot would look like this, but I’m sure you can style it better

plot

Veronica_Solund_Kirs · October 3, 2022, 11:35am

Hi Jonas,

Thanks a lot for your reply

I am not sure if we mean the same thing when we say “2D fit with histFactory”. What I do and which is supported by histFactory is to fit in two variables, i.e. I use 2D histograms of signal and backgrounds as the “samples” to be fitted. As I understand the term “channel” (in the context of histFactory) is that it is a disjoint selection region in data, so in my case there is only one channel, but two observables. Does that make sense?

So I think, I am more or less doing what you suggested, i.e. I get the pull histograms for each of the two observables defined in my first message. However, what I would like to obtain is the pulls for each bin in both observables (not for observable 1 integrated over observable 2), such that I can inspect the quality of the fit in each bin of the fit.

This is why I would like to obtain the fit model as a 2D histogram (with the model uncertainties), such that I can compute the pulls in 2D by myself. Do you know a way to do this?

I found a (very not optimal) solution, which is to plot observable 1 in the range of variable 2 corresponding to a specific bin of observable 2. To do this I use the “RooFit::ProjectionRange(“bin_range_of_obs_2”)”. However, since I have 20 x 20 bins, I would have to repeat this twenty times, and then I get 20 different pull plots, and this is not ideal, since I would like to have one single 2D pull plot. However, maybe I could make this work, if I could just extract the raw values from the pull histograms and then make 2D pull plot myself, but I don’t manage to extract the content and error of the pull histogram (this is a RooHist object). Do you know how to get this information from a RooHist object?

Cheers,

Veronica

Veronica_Solund_Kirs · October 6, 2022, 10:48pm

Ok, so now I can obtain a 2D histogram of data and of my fitting model by using the function ‘createHistogram()’, e.g.:

    TH2D* histo_2D = (TH2D*)data->createHistogram("histo_2D", *BMCorrSig, Binning(20), YVar(*PIPIMSig, Binning(20)));

However, this function returns a histogram with too large uncertainties. I have attached a plot of the projections in each of the two observables and another plot with the the same projections using the plotOn() function, which gives correct uncertainties.

It seems like the Sumw2() is not applied to the histogram returned by createHistogram() (based on this code ROOT: roofit/roofitcore/src/RooAbsReal.cxx Source File ). Is this true? And does anyone know how to solve this issue, i.e. get a histogram with correct uncertainties?
createHistogram_wrong_uncertainties.pdf (15.8 KB)
ploton_correct_uncertainties.pdf (16.5 KB)

system · October 20, 2022, 10:48pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.