How to use ShapeFactor in HistFactory

Hello,

I am setting up a model with HistFactory, and I would like to ask for an explanation or a short example of how to use the ShapeFactor feature.

From the HistFactory manual, I see that it is a way to allow for unconstrained bin-by-bin systematics. But I am unsure of how to apply it.

Any advice?

Thanks,
Arvind.

Hi,

A ShapeFactor defines bin-by-bin variations of a sample in a channel that are not constrained.
This means that these variations have no lower and upper error parametrized with a Gaussian or a Poisson constraint, as is done for the ShapeSys systematics.
This is possible because their values are determined by the data in the fit.
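
The difference between a free and a constrained bin-by-bin factor can be illustrated with a minimal, ROOT-free C++ sketch. It is not HistFactory's actual implementation: it assumes the expected yield in bin b is the nominal content times a factor gamma_b, and it uses a simple Gaussian penalty as a stand-in for a constraint term. All function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Expected yield in each bin: nominal content scaled by a per-bin factor gamma_b.
std::vector<double> expectedYields(const std::vector<double>& nominal,
                                   const std::vector<double>& gamma) {
    assert(nominal.size() == gamma.size());
    std::vector<double> nu(nominal.size());
    for (std::size_t b = 0; b < nominal.size(); ++b) {
        nu[b] = gamma[b] * nominal[b];
    }
    return nu;
}

// Poisson negative log-likelihood summed over bins (constant log(n!) dropped).
// Assumes every expected yield is strictly positive.
double poissonNll(const std::vector<double>& observed,
                  const std::vector<double>& nu) {
    assert(observed.size() == nu.size());
    double nll = 0.0;
    for (std::size_t b = 0; b < nu.size(); ++b) {
        nll += nu[b] - observed[b] * std::log(nu[b]);
    }
    return nll;
}

// ShapeFactor-like case: gamma_b is free, so only the data term enters.
double nllUnconstrained(const std::vector<double>& observed,
                        const std::vector<double>& nominal,
                        const std::vector<double>& gamma) {
    return poissonNll(observed, expectedYields(nominal, gamma));
}

// Constrained case (assumed Gaussian form): a penalty pulls gamma_b toward 1.
double nllConstrained(const std::vector<double>& observed,
                      const std::vector<double>& nominal,
                      const std::vector<double>& gamma,
                      const std::vector<double>& relSigma) {
    assert(gamma.size() == relSigma.size());
    double nll = nllUnconstrained(observed, nominal, gamma);
    for (std::size_t b = 0; b < gamma.size(); ++b) {
        const double pull = (gamma[b] - 1.0) / relSigma[b];
        nll += 0.5 * pull * pull;
    }
    return nll;
}
```

In this sketch, moving gamma_b away from 1 costs nothing in the unconstrained case beyond what the data term says, whereas the constrained case adds a penalty that grows with the size of the pull.
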
For an example of how to use ShapeFactor, see the XML files in the
tutorials/histfactory directory.

The top-level XML file is example_DataDriven.xml, which defines the measurement. Two further XML files define two separate channels:

  • example_DataDriven_signalRegion.xml defining the signal region channel.
  • example_DataDriven_controlRegion.xml defining the control region channel.

Both channels define a ShapeFactor for the background2 sample.
To run the example, do:

> prepareHistFactory
> hist2workspace config/examples/example_DataDriven.xml 

and you will get a workspace containing the ShapeFactor parameters.

Best regards

Lorenzo

Hi Lorenzo @moneta,

Thanks for getting back to me. I understand a little more now, and have a follow-up question.

I am doing a two-component simultaneous fit in two channels (let's call them Channel 1 and Channel 2), both of which have signal and background. I am assuming that the background shape is shared between the two channels. Further, I would like the background shape in Channel 2 to be completely data-driven, so I have to add the ShapeFactor to the background sample in Channel 2.

Should I then also add the ShapeFactor to the background sample in Channel 1? I don’t want the background shape to be data-driven in Channel 1; I want it to be determined only by Channel 2.

Thanks a lot for your help,
Arvind.

Hi,
Have you tried removing the ShapeFactor from Channel1 and keeping it only for Channel2?
In that case you could keep an overall constrained systematic for the extrapolation and use a nominal histogram for the background shape of Channel1.

Lorenzo

Hi Lorenzo @moneta,

That is what I do now.


  // Channel 2: background shape is completely data driven via the ShapeFactor "sf".
  Sample bkg_Channel2("bkg", Channel2_HistName, histFileName);
  bkg_Channel2.SetNormalizeByTheory(kFALSE);
  bkg_Channel2.AddShapeFactor("sf");
  Channel2.AddSample(bkg_Channel2);

  // Channel 1: same sample name and same starting histogram, but no ShapeFactor.
  Sample bkg_Channel1("bkg", Channel2_HistName, histFileName);
  bkg_Channel1.SetNormalizeByTheory(kFALSE);
  //  bkg_Channel1.AddShapeFactor("sf");

  // Constant factor that normalizes the input histogram.
  bkg_Channel1.AddNormFactor("Norm_bkg_Channel1", 1.0/nEntries_Channel2, 1e-9, 1, true);
  meas.AddConstantParam("Norm_bkg_Channel1");

  // Floating background yield in Channel 1.
  bkg_Channel1.AddNormFactor("nBkg_Channel1", nEntries_Channel1, 0, nEntries_Channel1*5);
  Channel1.AddSample(bkg_Channel1);

Note that I am also providing the same starting shape for both samples, and they share the same internal name.
This seems to produce reasonable results. But I was not sure whether what ROOT was doing internally was what I was hoping for, i.e.:

1. The same background shape is shared between the two channels, with independent normalizations.
2. There is no penalization for the variation of the background shape in Channel 2, whereas there IS such a penalization in Channel 1.

I have verified point 1. But I am not 100% sure of point 2. Can you please confirm this for me?

Thanks a ton!
Arvind.

Hi,

I don’t think there is a penalisation (i.e. a constraint) in Channel1 for your model. You would need to add an OverallSys for a normalisation uncertainty or a ShapeSys for a bin-by-bin uncertainty.

Lorenzo

Hi,

I’m back. I wanted to report that I was very wrong when I said that the background shape is shared even when the ShapeFactor is used only in one channel.

I was judging this by eye and was fooled: the shapes were similar but not the same. My takeaway is that if I want the same shape to be used in all my channels, I have to use the common ShapeFactor in all the channels. The normalizations can be kept independent between the channels.
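
For anyone finding this later, here is a small ROOT-free C++ sketch of why that is. It is only an illustration under my own assumptions, not what HistFactory does internally: I take the expected background in a channel to be norm * gamma_b * nominal_b, where the gamma_b play the role of the ShapeFactor "sf" and norm is a per-channel normalization. The function names are made up.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Expected background yields in one channel: norm * gamma_b * nominal_b.
std::vector<double> channelYields(double norm,
                                  const std::vector<double>& nominal,
                                  const std::vector<double>& gamma) {
    assert(nominal.size() == gamma.size());
    std::vector<double> nu(nominal.size());
    for (std::size_t b = 0; b < nominal.size(); ++b) {
        nu[b] = norm * gamma[b] * nominal[b];
    }
    return nu;
}

// Divide by the total so that only the shape remains.
// Assumes the total yield is strictly positive.
std::vector<double> unitShape(const std::vector<double>& yields) {
    double total = 0.0;
    for (double y : yields) {
        total += y;
    }
    assert(total > 0.0);
    std::vector<double> shape(yields.size());
    for (std::size_t b = 0; b < yields.size(); ++b) {
        shape[b] = yields[b] / total;
    }
    return shape;
}
```

If both channels are built with the same gamma vector, unitShape gives the same result for any pair of norm values, so the shape is shared while the normalizations stay independent. If Channel 1 is built with all gamma_b equal to 1 (no ShapeFactor), it keeps its starting shape no matter what the fit does to gamma in Channel 2.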

Apologies for the confusion and many thanks to @moneta for the help.
Arvind.