RooStats HybridCalculator, CLs, HypoTestInverter

simonyan · December 6, 2011, 6:10pm

Hello

I am trying to use HybridCalculator with SimpleLikelihoodRatioTestStat to do a hypothesis test and invert the results. The plot of the test statistic is attached. First the hypothesis is tested and the result printed:

Results HypoTestCalculator_result:

Null p-value = 0.5146 +/- 0.00499787
Significance = -0.0366049 sigma
Number of Alt toys: 10000
Number of Null toys: 10000
Test statistic evaluated on data: -13.0292
CL_b: 0.5146 +/- 0.00499787
CL_s+b: 0.9999 +/- 9.9995e-05
CL_s: 1.94306 +/- 0.0188723
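
As a cross-check, the numbers in this printout can be reproduced by hand. The sketch below is plain C++, not RooStats code. It assumes that each p-value error is the binomial error sqrt(p(1-p)/N) for N toys and that the CL_s error comes from standard uncorrelated error propagation on the ratio; both assumptions are inferred from the printed values, not taken from the RooStats source. The names `Estimate`, `tailFractionWithError` and `ratio` are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type: a value together with its uncertainty.
struct Estimate {
    double value;
    double error;
};

// Binomial uncertainty on a tail fraction p estimated from nToys toys.
Estimate tailFractionWithError(double p, int nToys) {
    assert(nToys > 0 && p >= 0.0 && p <= 1.0);
    return {p, std::sqrt(p * (1.0 - p) / nToys)};
}

// Ratio a/b, with the relative errors of a and b added in quadrature
// (assumes a and b are uncorrelated and both non-zero).
Estimate ratio(const Estimate& a, const Estimate& b) {
    assert(a.value != 0.0 && b.value != 0.0);
    const double r = a.value / b.value;
    const double rel = std::hypot(a.error / a.value, b.error / b.value);
    return {r, std::fabs(r) * rel};
}
```

Under these assumptions, `tailFractionWithError(0.5146, 10000)` and `tailFractionWithError(0.9999, 10000)` give the printed errors on CL_b and CL_s+b, and `ratio` of the second over the first gives the printed CL_s and its error.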

I was expecting the integral of the test statistic distribution to be calculated from -infinity to data for S+B, but OK, this is a convention and one can always subtract it from 1. How is the inverter going to deal with this if one uses CLs? Calling HypoTestResult::SetPValueIsRightTail(false) does not help either, since it also affects the p-value of the background-only hypothesis. Then,

HypoTestInverter * inverter = new HypoTestInverter(*hypoCalc);

which somehow flips S+B and B, then it complains

[#1] INFO:InputArguments -- HypoTestInverter ---- Input models:
using as S+B (null) model : bConfig
using as B (alternate) model : sbConfig
WARNING:InputArguments -- HypoTestInverter - using a B model with POI mu not equal to zero user must check input model configurations

and indeed POI = 0 should correspond to the background-only hypothesis. Looking at the HypoTestInverter code, I am confused by the flip and the complaint. Any suggestions?

Thanks,
Margar

moneta · December 7, 2011, 1:57pm

Hi,

If you want to use the HypoTestInverter class to find a limit (or an interval), use the S+B model as the null model and the B model as the alternate, since in this case you are testing the S+B hypothesis.
If instead you want to estimate the significance of a discovery, the hypothesis you are testing is how compatible your background model is with the data. So the null in this case is the B model and the alternate is the S+B model.

Best Regards
Lorenzo

simonyan · December 7, 2011, 10:34pm

Hi Lorenzo

thanks for the reply, I have some further questions.

Let’s look at the test statistic distribution (simple likelihood ratio Q) I posted. Q from data is right in the middle of the background-only (null) distribution, and signal (alternate) is obviously excluded with a high confidence level. I want to reduce the signal and find the point corresponding to 95%. Why should S+B now become the null and B the alternate hypothesis? Can’t the inverter just reduce the signal and find the required point?

My other question from the first post is about CLs. Looking at the Q distribution, one can see that the integral of the Q distribution from data to infinity assuming background only is ~0.5 (CLb). Assuming S+B, the integral over the same range is ~1, and this is called CLs+b. Shouldn’t this be 1-CLs+b? With the current definition, CLs = (CLs+b)/CLb ~ 2. This is the treatment in HypoTestResult. How does the inverter define CLs? I am not sure the flip solves this.
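
To make the two integration conventions concrete, here is a minimal standalone sketch in plain C++ (not RooStats code; the function name `tailFraction` and the toy values in the note below are made up for illustration). It counts the fraction of toy test-statistic values on either side of the observed value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fraction of toys on one side of the observed test statistic.
// rightTail = true  : fraction with q >= qObs (integral from data to +infinity)
// rightTail = false : fraction with q <= qObs (integral from -infinity to data)
double tailFraction(const std::vector<double>& toys, double qObs, bool rightTail) {
    assert(!toys.empty());
    std::size_t count = 0;
    for (double q : toys) {
        if (rightTail ? (q >= qObs) : (q <= qObs)) {
            ++count;
        }
    }
    return static_cast<double>(count) / toys.size();
}
```

With background toys {1, 2, 3, 4}, signal-plus-background toys {3, 4, 5, 6} and an observed value of 2.5, the right-tail fractions are 0.5 and 1.0, so their ratio is 2, while the left-tail fractions are 0.5 and 0.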

Best regards,
Margar

moneta · December 8, 2011, 9:08am

Hi,

The HypoTestInverter class will vary the signal strength only for the Null model, not for the alternate, so you cannot use it by passing S+B as alternate and B as null.
If you want to keep that convention (I don't understand why), you will have to do the scan by hand, and in that case you also need to set HypoTestResult::SetPValueIsRightTail(false).
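
If the scan is done by hand, the last step is to find the signal strength at which the chosen p-value (CLs or CLs+b) drops below 1 minus the confidence level, e.g. 0.05 for a 95% limit. A minimal sketch of that step follows, in plain C++ with linear interpolation between scan points. The names `ScanPoint` and `upperLimit` and the scan values in the note below are hypothetical, and this is not meant to describe how HypoTestInverter does it internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct ScanPoint {
    double mu;      // signal strength tested at this point
    double pValue;  // CLs (or CLs+b) obtained at this point
};

// Scan points must be sorted by increasing mu. Returns the first mu at which
// pValue falls below `threshold`, linearly interpolated between the two
// neighbouring scan points, or -1 if the scan never crosses the threshold.
double upperLimit(const std::vector<ScanPoint>& scan, double threshold) {
    assert(scan.size() >= 2);
    for (std::size_t i = 1; i < scan.size(); ++i) {
        const ScanPoint& lo = scan[i - 1];
        const ScanPoint& hi = scan[i];
        if (lo.pValue >= threshold && hi.pValue < threshold) {
            const double frac = (lo.pValue - threshold) / (lo.pValue - hi.pValue);
            return lo.mu + frac * (hi.mu - lo.mu);
        }
    }
    return -1.0;
}
```

For example, a scan with (mu, p) = (0.5, 0.40), (1.0, 0.10), (1.5, 0.02) and a threshold of 0.05 crosses between the last two points and gives mu = 1.3125.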

In your figure you have the p-value as the right tail, because you are using the inverted notation (B for null and S+B for alt). So CLb and CLs+b are the integrals from -infinity to data.
CLs = CLs+b / CLb. CLs+b is very small for you, so CLs will then be approximately 2 times CLs+b.

Best Regards

Lorenzo

simonyan · December 8, 2011, 10:07am

Lorenzo

I wanted to keep that convention because it is more intuitive to have B as null and S+B as alternate.

I have now changed it as you suggested and attach the test statistic distribution. After this the HypoTestResult is:

Results HypoTestCalculator_result:

Null p-value = 3e-05 +/- 1.73202e-05
Significance = 4.01281 sigma
Number of Alt toys: 100000
Number of Null toys: 100000
Test statistic evaluated on data: 13.0292
CL_b: 3e-05 +/- 1.73202e-05
CL_s+b: 0.46954 +/- 0.0015782
CL_s: 15651.3 +/- 9036.32

The p-value is still the right tail, both for null and alternate. My point is that CLs should be the p-value of the null (S+B) divided by 1 - p-value of the alternate (B). I don’t see how CLs can be a ratio of integrals on the same side. SetPValueIsRightTail(false) does not help, because CLs will still be calculated as a ratio of integrals on the same side.

Thanks a lot,
Margar

moneta · December 8, 2011, 10:40am

Hi,

I see that your HypoTestResult does not have the background-as-alternate flag set, so you also need to call HypoTestResult::SetBackgroundAsAlt(true). I guess you are not using the HypoTestInverter class directly, because in that case this is done automatically.
There are many definitions of CLs around, depending on how CLb and CLs+b are defined.
In RooStats, CLs is defined as CLs+b/CLb, where CLb is the alternate p-value (assuming B is the alternate model) and the test statistics are defined in such a way that the p-values are the right tail.

So in your figure, CLs is the ratio of the red dashed area to the blue dashed area.
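
The effect of the flag on the second printout in this thread can be sketched as follows. This is a standalone illustration of the definitions described here, not the RooStats source: the struct `PValues` and its member names are hypothetical, and only the two p-values in the note below are taken from the printout.

```cpp
#include <cassert>

// Hypothetical stand-in for the quantities discussed in this thread.
struct PValues {
    double nullP;          // right-tail p-value of the null model
    double altP;           // right-tail p-value of the alternate model
    bool backgroundIsAlt;  // true when B is the alternate model (limit setting)

    // CLb is the p-value of whichever model is the background model.
    double CLb() const { return backgroundIsAlt ? altP : nullP; }
    // CLs+b is the p-value of the other (signal-plus-background) model.
    double CLsplusb() const { return backgroundIsAlt ? nullP : altP; }
    // CLs as defined above: CLs+b / CLb.
    double CLs() const {
        assert(CLb() > 0.0);
        return CLsplusb() / CLb();
    }
};
```

With nullP = 3e-05 and altP = 0.46954 and the flag unset, this reproduces CL_s of about 15651.3 from the printout; with the flag set, CL_b and CL_s+b swap and CL_s becomes about 6.4e-05.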

Lorenzo

simonyan · December 8, 2011, 11:36am

Lorenzo

I attach the relevant part of the code.
First, I would like to get a meaningful CLs at the end of part 1.

As you see, I am using HypoTestInverter directly. Does SetBackgroundAsAlt(true) make sure CLs is defined as a ratio of integrals on different sides (with respect to data) for null and alternate?

Thanks,
Margar
tmp.C (1.82 KB)

moneta · December 8, 2011, 2:52pm

Hi,

At the end of part 1 you are doing a test of significance. CLs does not make sense in that context. CLs makes sense in the context of a limit calculation, and therefore in that case you need NULL (S+B) and ALT (B). You cannot get a meaningful CLs from HypoTestResult if you are using the opposite notation.
If you insist on using that notation, then you have to redefine CLs yourself as: CLsplusb / (1 - CLb)

Lorenzo