HistFactory, significance, and luminosity


I’m trying to use hist2workspace (in the attached code, HistFactory’s MakeModelAndMeasurementFast) and StandardHypoTestDemo.C to compute the discovery significance from a pair of signal and background histograms, but HistFactory seems to fail as soon as I increase the luminosity (Lumi) appreciably. Can you help me understand what might be going wrong and how to work around it? Here’s a summary of what I’m doing:

  1. Construct a signal histogram and two background histograms. Each is of type TH1D, has ~10 bins (varies among test cases), and has Sumw2() called before being filled with unweighted data. All bins have nonzero background, and the signal histograms of most test cases have no empty bins.

  2. Scale each histogram by calling Scale(luminosity * cross_section * efficiency / integral_of_histogram). The cross sections and efficiencies are derived from simulation and are O(0.01) … O(1).

  3. Construct a background histogram that is the sum of background_1 and background_2, and construct a data histogram that is the sum of signal and background. Write the signal, background, and data histograms into a .root file.

  4. Set up a RooStats::HistFactory::Measurement (with Lumi = luminosity), a Channel, and two Sample instances, and call MakeModelAndMeasurementFast, which creates a RooWorkspace. (Many details omitted for brevity; please see the code.)

  5. Call StandardHypoTestDemo with calcType = asymptotic and otherwise default arguments.
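For reference, the Measurement/Channel/Sample configuration in step 4 looks roughly like this. This is only a sketch along the lines of the official hf001 tutorial, not my actual code; the file and histogram names are placeholders, and it obviously needs ROOT/RooStats to run:

```cpp
// Configuration sketch only -- requires ROOT; names are placeholders.
#include "RooStats/HistFactory/Measurement.h"
#include "RooStats/HistFactory/MakeModelAndMeasurementsFast.h"

void build_workspace(double lumi) {
    using namespace RooStats::HistFactory;

    Measurement meas("meas", "meas");
    meas.SetPOI("SigXsecOverSM");
    meas.SetLumi(lumi);       // the luminosity from step 2
    meas.SetLumiRelErr(0.10);
    meas.AddConstantParam("Lumi");

    Channel chan("channel1");
    chan.SetData("data", "histograms.root");

    Sample signal("signal", "signal", "histograms.root");
    signal.AddNormFactor("SigXsecOverSM", 1, 0, 3);
    chan.AddSample(signal);

    Sample background("background", "background", "histograms.root");
    background.ActivateStatError();
    chan.AddSample(background);

    meas.AddChannel(chan);
    meas.CollectHistograms();
    MakeModelAndMeasurementFast(meas);  // returns the RooWorkspace
}
```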

And here’s what I’m seeing:

For luminosity 1.0, I don’t see minimizer errors, and I get a low but sensible significance. For higher luminosities up to ~5 … 7, the significance drops and eventually reaches negative zero ("-0"). I can increase the cross sections, efficiencies, and raw event counts, but I just can’t compute a significance for luminosities beyond about 10; the result quickly goes to NaN. I interpret the failure to compute a significance as a symptom of HistFactory having a problem (which I cannot pin down) with my input. At first, warnings like the following appear in MakeModelAndMeasurementFast’s output, and become more prevalent with increasing luminosity:


Then, warnings like this start to appear:

“WARNING - - ABOVE PARAMETER IS AT LIMIT.” [referring to SigXsecOverSM]

And for luminosities around 10, I start seeing a lot of this:

“WARNING:Minization – RooMinimizerFcn: Minimized function has error status.”
“Returning maximum FCN so far (…) to force MIGRAD to back out of this region. Error log follows”
“getLogVal() top-level p.d.f evaluates to zero”

I found a forum post suggesting setAttribute(“BinnedLikelihood”), but since MakeModelAndMeasurementFast creates the RooWorkspace, I’m not able to implement the suggestion without modifying MakeModelAndMeasurementsFast.cxx.

I’ve also noticed the output “WARNING: a likelihood fit is requested of what appears to be weighted data.”, but for the same reason as above, my code doesn’t have access to the pdfs (to call SumW2Error() as suggested) until after the failure has occurred.

I’m attaching a self-contained test program that demonstrates everything I’m doing. Please be sure to adjust the #include paths as appropriate for your system, both at the top and at the bottom of the source file. To run it, just launch ROOT (I wrote it for version 6.10.00) and execute the following:

.L repro-significance-lumi.C

Then call the macro’s entry function with your preferred luminosity as its argument (say 1, 5.5, 10, 35000, etc.; the cross sections are in picobarns, so the luminosity is implicitly in pb^-1). You’ll see the output from StandardHypoTestDemo.C just before the prompt to launch the Model Inspector.

Please let me know what you think, or if you have any insight! I should admit up front that, for most questions about why I am or am not doing something, the answer will be that I’m only following an example I was given, which itself seems to be based on the official example.

P.S. If you’re at CERN, I am too this week, and I’d be happy to meet up in person!

repro-significance-lumi.C (11.1 KB)

I read some of the HistFactory code and now better understand how it uses Lumi. It multiplies each signal and background sample by Lumi (as a constant, as long as “Lumi” is specified as constant; a sample can opt out by explicitly setting NormalizeByTheory=“False”), much as a sample can be multiplied by a NormFactor.

One mistake I was making, though, is that HistFactory does not multiply the data histogram by Lumi. This means that, to use StandardHypoTestDemo (which requires a data histogram commensurate with the signal and background), I should scale signal, background, and data myself and set Lumi=“1.0”. Doing both (scaling them by hand and also setting Lumi to the luminosity) multiplies signal and background by the luminosity twice, which is clearly wrong.

However, the significance still blows up for L at or below 35000 pb^-1. I might be using an inappropriate background process with too low a cross section. The investigation continues.
