RooStats:counting experiment with large numbers and Edm is above max

sumanta · August 3, 2018, 3:51pm

Hi Lorenzo,

There was a discussion before for large numbers in counting experiment. Your suggestion was to re-define the parameters (signal and background) close to 1. I am having similar issue but not sure how to handle this while dealing with RooDataHist.
Say, I have one histogram (TH1) for signal and another for background. The X axis of these histograms are energy and Y axis is counts (/kg/day/keV).

I´ve set ‘Energy’ as RooRealVar; ‘Energy’ is an observable.
Energy.setBins(100);
w->import(Energy);
w->defineSet(“obs”,“Energy”);

// P.O.I //
RooRealVar mu_sig(“mu_sig”,“signal …”,-100,5000) // Intentionally kept this huge range as expected signal counts is a large number
w->import(mu_sig) // w is the RooWorkSpace
w->defineSet(“poi”,“mu_sig”)

// Nuisance parameter //
RooRealVar mu_b(“mu_b”,“background”,0,50000)
w->import(mu_b)
w->defineSet(“nuis”,“mu_b”)

// Global observable //
RooRealVar a_b(“a_b”, “Expected backgrounds”, 1)
w->import(a_b)
w->defineSet(“gobs”,“a_b”)

// Signal Model //
hEnergy_sig … This is a TH1D histo which shows the signal histogram (bin width = 1 keV)
signal_rate = hEnergy_sig->Integral() * FiducialMass * ExposureDay
RooRealVar signal_counts (“signal_counts”, "signal counts expected ", signal_rate)
w->import(signal_counts)

// Create RooFit model PDF //
RooDataHist DataHist_sig(“DataHist_sig”,“signal RooDataHist”,*w->set(“obs”),hEnergy_sig);
RooHistPdf HistPdf_sig(“HistPdf_sig”,“signal RooHistPdf”,*w->set(“obs”),DataHist_sig);
w->import(HistPdf_sig);

/**************

if Fiducial Mass = 1000 kg, ExposureDay = 1000 days then signal_counts ~ 10^6 …a huge number
/**************

// create background model //
hEnergy_bg : it is the background histogram (bin width 1 keV)
RooDataHist DataHist_bkg(“DataHist_bkg”,“bkg RooDataHist”,*w->set(“obs”),hEnergy_bg);
RooHistPdf HistPdf_bkg(“HistPdf_bkg”,“bkg RooHistPdf”,*w->set(“obs”),DataHist_bkg);
w->import(HistPdf_bkg);

bkg_rate = hEnergy_bg->Integral() * FiducialMass * ExposureDay

w->var(“mu_b”)->setVal(bkg_rate);
w->var(“a_b”)->setVal(bkg_rate);
RooRealVar sigma_b(“sigma_b”,"sigma ", 0.1*bkg_rate);
w->import(sigma_b);
w->factory(“Gaussian::Constraint_b(mu_b,a_b,sigma_b)”);

/**************

if Fiducial Mass = 1000 kg, ExposureDay = 1000 days then bkg_rate ~ 10^4 …a huge number, and sigma_b also high ~10^3
/**************

//>> Event probability model
w->factory( " SUM::EventModel( mu_sigHistPdf_sig , mu_bHistPdf_bkg) " );

//>> Constraint functions
w->factory( " PROD:: Constraints(Constraints_b) ");

//>>Select probability model
w->factory( " PROD::TotalEventModel(EventModel,Constraints) ");

//>>Create configuration file
// SIGNAL plus BACKGROUND model
ModelConfig* SBmodel = new ModelConfig(“SBmodel”, w);
SBmodel->SetName(“SBmodel”);
SBmodel->SetPdf(*w->pdf(“TotalEventModel”));
SBmodel->SetObservables(*w->set(“obs”));
SBmodel->SetParametersOfInterest(*w->set(“poi”));
SBmodel->SetNuisanceParameters(w->set(“nuis”));
SBmodel->SetGlobalObservables(w->set(“gobs”));
RooRealVar firstPOI = (RooRealVar) SBmodel->GetParametersOfInterest()->first();
firstPOI->setVal(0.1);
SBmodel->SetSnapshot(*firstPOI);
w->import(*SBmodel);

Here, main idea is to get the sensitivity study assuming we understand the background.
Problem is when taking fiducial mass 1000 or 2000 kg and 1000 days exposure time in actual experimental view, then expected number of signals and backgrounds are too high. So, while scanning over mu_sig (POI), I get following message for every try of mu_sig: Minuit2Minimizer::Minimize : Minimization did NOT converge, Edm is above max
So, at the end errors (1 sigma or 2 sigma) of the median value is also very high.
Now, as you have pointed out before this is a numerical problem which causes this fit failure.
But I am not sure how to re-define my parameters (signal and background) close to 1 with a constant scaling factor.

Regards,
SP

moneta · August 7, 2018, 12:26pm

Hi,

You would need to redefine the bkg_rate and sig_rate as product of a constant factor and a RooRealVar. You could use for example RooFormulaVar for this.

Lorenzo

sumanta · August 7, 2018, 7:02pm

Hi Lorenzo,

I did like this:
float constant_factor = 1.0E-3;
RooRealVar c_bkg (“c_bkg”, “count bkg”,hEnergy_bg->Integral() * FiducialMass * ExposureDay);
w->import(c_bkg);
bkg_rate = constant_factor * w->var(“c_bkg”)->getValV();

and similar for signal_rate.
But I see in the p-value plot, the 1sigma and 2sigma band of p-value becomes flat after certain mu_sig (this is the POI), i.e., parallel to the x-axis (mu_sig). So, no final minimization.
In previous case, I used the range for mu_sig as 0 to 5000, where p-value reaches 0 after 2000.
So, after renormalising with the costant_factor (which is 1E-3) I used this range for mu_sig: 0 to 5. But the band was above the 90% C.L. even if I increase the scan range of mu_sig, I see that flat behaviour as mentioned above.

any clue?

could you please write the use of RooFormulaVar using the syntax for this example code. I could not figure out how to implement that as I am new comer in RooStat.

Thanks,
SP

sumanta · August 13, 2018, 1:51pm

Hi Lorenzo,

I am getting result with the modifications which I mentioned in my previous post. I did a mistake while giving scan range for ‘mu_sig’ to the code.

Now, I do the following:
In a header file, I set my range for mu_sig: MIN_POI & MAX_POI.
Then in my workspace macro file,
I have re-defined mu_sig like this,
RooRealVar mu_sig(“mu_sig”,“signal …”, MIN_POI * Constant_Factor, MAX_POI * Constant_Factor*1.5 )
» Extra ‘1.5’ factor is just to give additional range to mu_sig above MAX_POI.

But I have a query about the final result:
Say, I use 3 different Constant_Factors (CF1,CF2,CF3) where all other parameters are identical.
Now, for each choice of CF´s, I get three different Median values (M1,M2,M3).
Now, I use following values for CF´s.
CF1 = 1E-3
CF2 = 5CF1 = 5E-3
CF3 = 10CF1 = 1E-2
After scaling back these Medians new Median values are M1_s = M1/CF1 and so.
But, M2_s not equal to 5*M1_s, rather M2_s = 0.45 *M1_s
and M3_s = 0.68 *M2_s

I am sure I am missing something while renormalising back my median value (limits); otherwise the use of ‘Constant_Factor’ is in doubt.

Thanks,
Sumanta

system · August 27, 2018, 1:54pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.

moneta · August 28, 2018, 8:22pm

Hi,
Sorry for my late reply. If I have understood you well, you are saying that by changing the scaling factors you are getting results which are not consistent between each other.
I think there could be some errors (maybe numerical) in computing the median limit.
I would need the full runnable program to understand the reason for this error

Lorenzo