Frequentist Calculator with simultaneous pdf

Dear RooStats experts,

I am having an issue when trying to set limits using the frequentist calculator on my combined model for two channels. I have two poisson model set for each of the channels and then forming a simultaneous pdf.

It seems that the calculator fails when trying to generate toys for the combined data set that I give it as an input I tried to follow what was done here: https://root.cern/doc/v608/classRooStats_1_1HistFactory_1_1HistoToWorkspaceFactoryFast.html#a78fe617dd023e3a4139f2fefde94fb93 to make my combined model.

I get the following error:

Traceback (most recent call last):
  File "simpleCombine.py", line 114, in <module>
    limit = simpleComb(nObs1, nBkg1, np1, nObs2, nBkg2, np2)
  File "simpleCombine.py", line 89, in simpleComb
    htr = hypoCalc.GetHypoTestCalculator().GetHypoTest()
SystemError: RooStats::HypoTestResult* RooStats::HypoTestCalculatorGeneric::GetHypoTest() =>
    problem in C++; program state has been reset

I have attached a simplified version of my code that reproduces this error.simpleCombine.py (4.1 KB)
I do not get any errors when I try to use the asymptotic calculator.

Thanks in advance for your help,
Deshan

Dear experts,

@aaronsw was digging into this and it looks like it’s failing at:
line 351 in https://root.cern.ch/doc/master/ToyMCSampler_8cxx_source.html#l00381

I wonder if this helps find out what the problem might be?

Thanks,
Deshan

Hi @dabhayas,

It fails in the ToyMCSampler, indeed, and the reason is that the dataset “model1Data” is supposed to be split along the category dimension “channel”. However, “model1Data” only consists of

DataStore model1Data (Generated From model1)
  Contains 1 entries
  Observables: 
    1)  nObs1 = 11  L(0 - 1000)  "nObs1"
    2)  nObs2 = 168.572  L(0 - 1000)  "nObs2"

And it’s not flagged as having a category “channel”. It looks like something “forgets” the category when generating data, but I don’t know yet where this happens.

Update:
The problem with the ToyMCSampler is that it needs to generate toy events. Since a simultaneous PDF is involved, one needs to decide in which category to generate an event. This only works if the components of the simultaneous are extended PDFs, because only then it is clear (inferred from the relative fractions of each component) for which category an event should be generated. Since both the Poisson and the Gaussian PDFs in each component are non-extended PDFs, this requirement is not fulfilled. Therefore, no usable dataset is generated (basically, it just generates events for the category that was active last), and subsequently this leads to a crash.
RooStats doesn’t understand what the problem is and keeps running. I will check if it’s possible to add a safety check and some more intelligible error message.

That doesn’t fix your problem, though. Does it make sense that generating events for your model is not possible, or do you see a way in which we could generate events also for non-extended components?

Hi Stephan,

Thanks for your help on this. Your explanation makes sense, however, I’m not entirely sure how I would go about generating events for the non-extended components.

Thanks,
Deshan

That’s probably not necessary. I think I have a solution how to rewrite this model. Instead of a simultaneous PDF (that’s only necessary when you have a different number of events for the same observable), use a product of pois1pois2constraints:

w.factory("PROD:jointModel(pois1,pois2,g1,g2)")

Then, since it’s a counting model, you need to generate only a single event for every toy run:

toymcs.SetNEventsPerToy(1)

For me, it starts running, but I suspect that the ranges of some variables/parameters are wrong, as some NaN are popping up. Maybe a little tweaking will get it to run.

Hi Stephan,

Thanks for clarifying this. This does indeed seem like a better way of doing it.

Would I still be able to use the same dataset I defined before? How would it be able to fit this data set without the any category defined in the model. It’s not clear to me how I would construct the dataset needed.

Thanks,
Deshan

Hi Deshan,

you indeed have to adjust the dataset. A 2D dataset with a single entry
{nObs1, nObs2}
should do the job.

Try this:

    w.defineSet("nObs", "nObs1,nObs2")
    nObsSet = w.set("nObs")
    data = ROOT.RooDataSet("data", "observed data", nObsSet)

    w.var("nObs1").setVal(nObs1)
    w.var("nObs2").setVal(nObs2)
    data.add(nObsSet)

Hi Stephan,

Okay thanks a lot! this seems to do a trick :slight_smile:

For me, it starts running, but I suspect that the ranges of some variables/parameters are wrong, as some NaN are popping up. Maybe a little tweaking will get it to run.

Regarding this point, it seems when I set the scan for the HypoTestInverter to hypoCalc.SetFixedScan(100, 0, 10, True) I get the error

[#0] ERROR:Eval -- RooAbsReal::logEvalError(nExp1) evaluation error, 
 origin       : RooAddition::nExp1[ nSig + bkgProd1 ]
 message      : function value is NAN
 server values: !set=(nSig = nan,bkgProd1 = -13.2727)
[#0] ERROR:Eval -- RooAbsReal::logEvalError(nExp2) evaluation error, 
 origin       : RooAddition::nExp2[ nSig + bkgProd2 ]
 message      : function value is NAN
 server values: !set=(nSig = nan,bkgProd2 = 10.5)
  FUNCTION VALUE DOES NOT SEEM TO DEPEND ON ANY OF THE 3 VARIABLE PARAMETERS.
          VERIFY THAT STEP SIZES ARE BIG ENOUGH AND CHECK FCN LOGIC.

however if I start my scan slightly higher e.g. hypoCalc.SetFixedScan(100, 0.01, 10, True) or hypoCalc.SetFixedScan(100, 0, 10, False)
It works fine. I guess is due to asking for a log spacing in the scan and starting at 0?

Totally! Very good observation. I will have a look at the HypoTestInverter to make sure that this doesn’t happen.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.