Expected P-Values In CLs Method

draggeddown · December 10, 2020, 5:16am

hey!
So I’m using the FrequentistCalculator in conjunction with the HypoTestInverter for 1-D limit setting. I am doing a scan over 30 points for my parameter of interest. I am trying to extract the values of the expected CLs along with the observed CLs. Now I know that there’s a function called GetExpectedPvalues(), but it returns a distribution for EACH point in the scan. What exactly is this? Shouldn’t there be one expected p-value per scan point? At least that’s what is plotted, right?

So how exactly is that single expected p-value calculated in ROOT and how do I access it?

Thanks in advance!

moneta · December 10, 2020, 10:00am

Hi,

To get the median expected p-value you would need to compute the sample median from the vector returned by HypoTestInverterResult::GetExpectedPvalues().
You can see the code in the implementation of HypoTestInverterPlot::MakeExpectedPlot().
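But you can just use the median TGraph from the TMultiGraph returned by MakeExpectedPlot:

HypoTestInverterPlot pl(result);
auto mg = pl.MakeExpectedPlot(0,0); // nsig = 0: only the median is drawn; the result is a TMultiGraph
auto g = static_cast<TGraph*>(mg->GetListOfGraphs()->First()); // the median curve is the only graph in the list
double *  expected_p_median_values = g->GetY();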
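If you would rather start from the raw distribution at one scan point, the sample median and the edges of the n-sigma band can be sketched in plain C++ as below. This is only an illustration, not the ROOT implementation: `sampleQuantile`, `normalCdf` and `expectedBand` are hypothetical helper names, the input vector stands in for the expected p-values of a single scan point, and the linear interpolation between order statistics is one of several possible quantile conventions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample quantile with linear interpolation between order statistics.
// `values` stands in for the expected p-values at one scan point
// (hypothetical input; in ROOT they would come from the sampling distribution).
double sampleQuantile(std::vector<double> values, double prob) {
    assert(!values.empty() && prob >= 0.0 && prob <= 1.0);
    std::sort(values.begin(), values.end());
    const double pos = prob * static_cast<double>(values.size() - 1);
    const std::size_t lo = static_cast<std::size_t>(std::floor(pos));
    const std::size_t hi = std::min(lo + 1, values.size() - 1);
    const double frac = pos - static_cast<double>(lo);
    return values[lo] + frac * (values[hi] - values[lo]);
}

// Standard normal CDF, used to turn "n sigma" into a quantile probability.
double normalCdf(double nSigma) {
    return 0.5 * std::erfc(-nSigma / std::sqrt(2.0));
}

// Median expected p-value plus the lower and upper edges of the n-sigma band.
struct ExpectedBand {
    double low;
    double median;
    double high;
};

ExpectedBand expectedBand(const std::vector<double>& pValues, double nSigma) {
    return {sampleQuantile(pValues, normalCdf(-nSigma)),
            sampleQuantile(pValues, 0.5),
            sampleQuantile(pValues, normalCdf(nSigma))};
}
```

Calling `expectedBand(pValues, 1.0)` for each scan point gives one median and one 1-sigma band per point, which is the "single expected p-value" being asked about.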

Best regards

Lorenzo

draggeddown · December 10, 2020, 10:24am

Hey!

Thank you so much for your answer. So at the moment as I understand it, in ROOT, there is no particular way to do a 2-D limit setting right? As in 2 parameters of interest at the same time? So I would have to store my observed and expected CLs values for one whole run of one PoI, and then change the PoI and repeat the process again for another PoI?

moneta · December 10, 2020, 10:40am

A 2D limit setting is a contour. In RooStats you can use the ProfileLikelihoodCalculator for the asymptotic case. For the frequentist case you can use the FeldmanCousins class, see

Lorenzo

draggeddown · December 11, 2020, 2:46am

Thanks a lot for this, I was able to use the FeldmanCousins class and the associated plotting technique from that tutorial. Just to confirm my own understanding of the method implemented in the code, what it essentially does is use the ModelConfig that is passed to it, to
a) Create pdfs for the theta=0 and theta = non-zero cases for various scan points, evaluate a test statistic value by using the data that we pass to it, and subsequently calculate p-values from those distributions that were gotten by ToyMCSampler.
b) The CLs is calculated using those p-values, and the parameter point is excluded from the interval if the value obtained is (say) less than 0.1 (the example test size in the tutorial)
c) This collection of points is then plotted as a 2-D histogram (so I am assuming that each point in the interval is of the type (theta1, theta2) )

Am I correct in this understanding? And for this limit setting, is the official ROOT recommendation to keep the profile likelihood ratio two-sided or one-sided?

moneta · December 11, 2020, 9:02am

Hi,

The FeldmanCousins tutorial performs a Neyman construction. For the given test statistic, which is the 2-sided profile likelihood ratio, it scans the parameter points, computes the test statistic distribution at each one, and finds the confidence belt. From the confidence belt and the observed test statistic value you can tell whether a parameter point is inside or outside the interval.
For FeldmanCousins there is no CLs; one uses only the p-value computed from the S+B model.
The test statistic is 2-sided because in more than one dimension you cannot define a one-sided test statistic.
So here are my answers to your question:

a) For FC we just need the full model (S+B) , no need for a model with S=0
b) There is no CLs in FC. One uses the 2-sided test statistic to find the acceptance region and checks whether the observed test statistic value is inside it. Doing this for each scanned point builds the confidence belt.
c) The collection of points is the set of parameter points that are inside the interval
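To make the construction concrete, here is a minimal sketch in plain C++ (no ROOT) of the same logic for a toy problem. Everything in it is an assumption for illustration: the model is a single measurement x ~ Gauss(mu, 1), for which the 2-sided profile likelihood ratio reduces to (x - mu)^2, and `testStatistic`, `criticalValue` and `pointsInInterval` are hypothetical names, not RooStats functions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Toy model (assumption): one measurement x ~ Gauss(mu, 1).
// For this model the 2-sided statistic -2 ln(lambda) equals (x - mu)^2.
double testStatistic(double x, double mu) {
    return (x - mu) * (x - mu);
}

// Upper edge of the acceptance region at the parameter point `mu`:
// the confidenceLevel-quantile of the test statistic over toys generated at mu.
double criticalValue(double mu, double confidenceLevel, int nToys, std::mt19937& rng) {
    assert(nToys > 0 && confidenceLevel > 0.0 && confidenceLevel < 1.0);
    std::normal_distribution<double> gauss(mu, 1.0);
    std::vector<double> toys(static_cast<std::size_t>(nToys));
    for (double& t : toys) {
        t = testStatistic(gauss(rng), mu);
    }
    std::sort(toys.begin(), toys.end());
    const std::size_t idx = static_cast<std::size_t>(confidenceLevel * (nToys - 1));
    return toys[idx];
}

// Scan the parameter points and keep those whose acceptance region
// contains the observed test statistic value.
std::vector<double> pointsInInterval(double xObs, const std::vector<double>& scanPoints,
                                     double confidenceLevel, int nToys, unsigned seed) {
    std::mt19937 rng(seed);
    std::vector<double> accepted;
    for (double mu : scanPoints) {
        const double qCrit = criticalValue(mu, confidenceLevel, nToys, rng);
        if (testStatistic(xObs, mu) <= qCrit) {
            accepted.push_back(mu);
        }
    }
    return accepted;
}
```

Note that only the full model at each scanned mu is used to generate toys; no separate S=0 model and no CLs ratio enter anywhere. In two dimensions the scan points would simply be pairs (theta1, theta2).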

Lorenzo

draggeddown · December 11, 2020, 9:25am

Hey,

Thanks again for that detailed answer. So without an S=0 model, all we do is look at the p-value obtained for various parameter points? Wasn’t the CLs method introduced because there are some issues with this, in the sense that the upper limit can get arbitrarily small and as physicists we can’t claim sensitivity to arbitrarily small signals? Is the CLs method still to be implemented in the FeldmanCousins class in ROOT?

moneta · December 11, 2020, 11:09am

Hi,
CLs is a method for upper limits, introduced for the reason you mentioned, but it is for 1D problems. There is no such method for 2D contours; the situation there is different. The only valid frequentist procedure for the 2D case is Feldman-Cousins.
One could think of generalising what is described in this paper to the 2D case, with the asymptotic approximation and when you have bounds. Without bounds you can simply use the ProfileLikelihoodCalculator (or the Contour method of Minuit/Minos)
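For reference, the 1D CLs quantity mentioned above can be sketched from two sets of toys as follows. This is a plain C++ illustration under stated assumptions: larger test statistic values are taken to mean worse agreement with S+B, both tail fractions are computed on the same side, and `tailFraction` and `computeCLs` are hypothetical names rather than RooStats calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of toys whose test statistic is at least as large as the observed one.
double tailFraction(const std::vector<double>& toys, double qObs) {
    assert(!toys.empty());
    std::size_t nAbove = 0;
    for (double q : toys) {
        if (q >= qObs) {
            ++nAbove;
        }
    }
    return static_cast<double>(nAbove) / static_cast<double>(toys.size());
}

// CLs = CLs+b / CLb, with both tail fractions taken on the same side
// (assumed convention: large q means poor agreement with S+B).
double computeCLs(const std::vector<double>& toysSB, const std::vector<double>& toysB,
                  double qObs) {
    const double clsb = tailFraction(toysSB, qObs);
    const double clb = tailFraction(toysB, qObs);
    assert(clb > 0.0);  // CLs is undefined when no background toy reaches qObs
    return clsb / clb;
}
```

When the S+B and B-only distributions coincide (no sensitivity), the ratio is 1 regardless of how small CLs+b is, so the point is never excluded. That is the protection against claiming sensitivity to arbitrarily small signals.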

Lorenzo

draggeddown · December 11, 2020, 11:19am

Hi! Thanks a lot for that, I think I understand now. I’ll have to look more into the theory of 2D contours.

draggeddown · December 15, 2020, 10:48am

Hi! Thanks again for that reply. I’m having some issues implementing the FeldmanCousins class. I am trying to make a contour plot, but the tutorial currently fills the 2D histogram with a value of 1 whenever a point is in the interval and 0 when it is not. I want to make a plot that looks like the one you get from ProfileLikelihoodCalculator. For this I assume that instead of the value 1, I should fill in the p-value of that parameter point. How do I access this value? I have not been able to find a convenient way. I’m guessing it’s somewhere in the interval object, but could you let me know how one accesses that value? And is this the way to make a plot that looks like the plot from ProfileLikelihoodCalculator?

moneta · December 17, 2020, 10:18am

Hi,

Unfortunately, in the case of the FeldmanCousins class the way the limit is computed is different. What the class computes is not a p-value for each scanned point. Instead it creates confidence belts, i.e. for each parameter point that is scanned (e.g. on a 2D grid), an interval in the test statistic with lower and upper values for the chosen confidence level. The intersection of these regions with the observed test statistic value then defines the points that are in the interval. The distribution of these points will look like a contour.
To make a contour you would then need to scan a larger number of points. What you can do, after a first pass, is provide FeldmanCousins with a set of points to scan, using FeldmanCousins::SetParameterPointsToTest, so that you avoid scanning points you are sure are inside or outside the interval and concentrate on those close to the edges of the contour.
This could eventually be done automatically by the class, but unfortunately it is not yet implemented.
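One possible way to pick the points for the second pass, sketched in plain C++: take the in/out flags from the coarse scan and keep only grid points that have a neighbour with the opposite flag. The `Grid` alias and `edgePoints` are hypothetical names, and a rectangular grid is assumed. Turning the selected indices into parameter values and packing them into the dataset that SetParameterPointsToTest expects is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Coarse-scan result: inside[i][j] is true when grid point (i, j)
// was found to be in the interval. The grid is assumed rectangular.
using Grid = std::vector<std::vector<bool>>;

// Return the indices of grid points on the edge of the contour, i.e. points
// with at least one 4-neighbour whose in/out flag differs from their own.
std::vector<std::pair<std::size_t, std::size_t>> edgePoints(const Grid& inside) {
    assert(!inside.empty() && !inside.front().empty());
    const std::size_t nx = inside.size();
    const std::size_t ny = inside.front().size();
    std::vector<std::pair<std::size_t, std::size_t>> edges;
    for (std::size_t i = 0; i < nx; ++i) {
        for (std::size_t j = 0; j < ny; ++j) {
            const bool here = inside[i][j];
            const bool differs =
                (i > 0 && inside[i - 1][j] != here) ||
                (i + 1 < nx && inside[i + 1][j] != here) ||
                (j > 0 && inside[i][j - 1] != here) ||
                (j + 1 < ny && inside[i][j + 1] != here);
            if (differs) {
                edges.emplace_back(i, j);
            }
        }
    }
    return edges;
}
```

A finer set of points can then be placed around each returned index, so the expensive toy generation is spent only near the boundary.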

Best regards

Lorenzo
