Problems with conducting an upper limits calculations using HypoTestInveter

Hello!
I have got some problems with calculating upper limits for mu parameter using HypoTestInventer and HybridCalculator. In my experiment there is no uncertanties so I dont include them into my calculations.
The problem is that I get absolutely inadequate results. For example, in the case of s = 10 signal events and b = 100 background events I get expected upper limits for mu using formula obtained from famous Cowan et al paper (arxiv:1007.1727) mu <1.74, but my code gives me mu < 2.4 (both 95% CL), for s = 7 and b = 70 I get mu < 2.8.
I have several concrete questions:

  1. What is possibly wrong in my code?
  2. When I change the range of mu parameter in my px distribution the result of my calculations changes as well. Why? How is result affected by varying this parameter?
  3. As return of the function below I am trying to get the value of upper limits using GetExpectedUpperLimit(nsig = 0) function. But it has nothing to do with the outcome in GetInterval() (for example, in terminal the value of mu is 2.25, but the return of the function is 3.9). Changing nsig parameter only makes if worse. What is the correct way of obtaining the result of HypoTestInverter?

Looking forward for your answer. Thank you very much in advance!

Here is my code:

def MC_UL (s, #number of signal events
                    b, #number of background events
                    ntoys, #number of generated MC toys 
                    verbose = 0,
                    mu =1, #signal strength parameter(selected specifically for hypothesis)
                    
                    plot = False):
    
    #calculate estimators

    b_est = ((s+b + b - 2*mu*s)/4 + np.sqrt((np.power((s+b + b - 2*mu*s),2) + 8*b*mu*s)/16))
    
    wspace = ROOT.RooWorkspace("wspace")
    
    # wspace.factory("mu[0,5]")
    #Define distribution in signal region (Poisson(mu*s+b))
    
    
    dist = "Poisson::px(x["+str(s+b_est) + ", 0, 500], sum::splusb(mu[1.,0.,7.]*s[" + str(s) +".],tau[1.]*b_est["+ str(b_est)+"]))"
    #dist_CR = "Poisson::py(y["+str(b_est)+",0,500], prod::taub(tau[1.],b["+str(b_est)+",0,300]))"
    #wspace.factory(dist_CR) 
    wspace.factory(dist)
    #Define observables, parameters of interest and nuisance parameters
    wspace.defineSet("obs", "x")
    wspace.defineSet("poi", "mu")
    wspace.defineSet("nui", "b_est")

    #Create space for our toy data to storage
    data = ROOT.RooDataSet("d", "d", wspace.set("obs"))
    data.add(wspace.set("obs"))

    #Construct and configure alternative and null models

    #Null(background only model)
    b_model = ROOT.RooStats.ModelConfig("f(q_{\mu}|\mu)", wspace)
    b_model.SetPdf(wspace.pdf("px"))
    b_model.SetObservables(wspace.set("obs"))
    b_model.SetParametersOfInterest(wspace.set("poi"))
    wspace.var("mu").setVal(1.)
    wspace.var("b_est").setVal(b)
    b_model.SetSnapshot(wspace.set("poi"))

    #Alternative(signal + background model)
    sb_model = ROOT.RooStats.ModelConfig("f(q_{\mu}|3)", wspace)
    sb_model.SetPdf(wspace.pdf("px"))
    sb_model.SetObservables(wspace.set("obs"))
    sb_model.SetParametersOfInterest(wspace.set("poi"))
    wspace.var("mu").setVal(1.)
    wspace.var("b_est").setVal(b_est)
    sb_model.SetSnapshot(wspace.set("poi"))

    #Define test statistic that is going to be used for testing the hypothesis
    teststat = ROOT.RooStats.SimpleLikelihoodRatioTestStat(sb_model.GetPdf(), b_model.GetPdf(), sb_model.GetSnapshot(), b_model.GetSnapshot())

    #Construct hypothesis test

    hc2 = ROOT.RooStats.HybridCalculator(data, b_model, sb_model)
    toymcs2 = hc2.GetTestStatSampler()
    toymcs2.SetNEventsPerToy(1)
    toymcs2.SetTestStatistic(teststat)
    hc2.SetToys(ntoys, ntoys)
    #hc2.ForcePriorNuisanceAlt(wspace.pdf("py"))
#hc2.ForcePriorNuisanceNull(wspace.pdf("py"))

    invtest = ROOT.RooStats.HypoTestInverter(hc2, wspace.var("mu"), 0.05)
    r2 = invtest.GetInterval()  
    r2.IsOneSided()
    if (verbose >0):
        #Print the results
        print("RESULTS:")
        r2.Print()

    if (plot):
        #Draw the plot
        dist = r2.GetAltTestStatDist(1)
        c1 = ROOT.gROOT.Get("c1")
        if not c1:
            c1 = ROOT.TCanvas("c1")
        plot = ROOT.RooStats.SamplingDistPlot(500, -6.0, 8.2)

        plot.AddSamplingDistributionShaded(dist,-10, 8)
        plot.SetMarkerColor(4, dist)

        plot.AddLine(4.9654, 0.5, 4.9654, 0, "line")
        plot.SetAxisTitle("q_{\mu}")

        legend = ROOT.TLegend(0.1, 0.9, 0.3, 0.8)
        legend.AddEntry(dist, "f(q_{\mu})")
        legend.AddEntry("line", "observed test statistic value", "L")

        p2 = ROOT.RooStats.HypoTestInverterPlot("plot", "MC generated distribution of q_{\mu}", r2)
        plot.SetLegend(legend)
        plot.Draw()
        c1.SaveAs("exclusion_"+str(s)+"_"+str(b)+".png")

    
    return r2.GetExpectedUpperLimit(nsig = 0)

Dear Oleg,

Thanks for the interesting post!
I add in the loop our RooStats experts, @jonas @moneta .

Cheers,
Danilo

Hi,

First of all, how do you computed the expected upper limit to be mu<1.74. There are several formulas for computing upper limits depending on the test statistics used and if you are using the CLs procedure or not.
If you use the ProfileLikelihoodTest statistics in RooStats you will get the results in the paper described in paragraph 3.7.

Note that in your code you define a background model with mu=1, it should be equal zero. Then I would suggest you to improve the scan you are performing, by calling HypoTestInverted::SetFixedScan(npoints,mu_min, mu_max)

You can use also the tutorials StandardHypoTestInvDemo where you pass the workspace and you can configure for the type of test statistics and procedure to use.

Lorenzo

Thank you very much for your answer and advises! Much appreciated!

Still I would really like to understand the dependence of final result on varying the range of mu parameter. Could you give me some explanation, please? Or is there any useful pages or tutorials that I can read myself on this question?

Hi,

The result should not depend on the range of the parameter, if the range is large enough and include the lower/upper limit values. In your case if you define the range in mu between [0,5] or [0,7] or [0.10], it should not make a difference. If it does it is because the scan is not well done. As I said you can try using a fixed scan and a large enough number of points which are around the expected value for the limit

Okay, I got it. Thank you very much again for your help!

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.