RooMCStudy: pull calculation and getting generated values of parameters

amroo · December 28, 2019, 11:29pm

Hello all,

I am doing a toy MC study using RooMCStudy.

I have different generating and fit pdfs. For my generating pdf, I am using the Extended() option, which I believe samples the number of generated events from a Poisson distribution.

I can plot the pull distributions for all variables without any problems as solved in this post:

I want to know the generated number of signal and background events for each toy experiment. How can I do that?

I know how to obtain the fitted values of the parameters for each toy experiment, but I have had no success doing the same for the generated values.

For my fit result I get the following message:
1> The fit parameter 'n1f' is not in the model that was used to generate toy data. The parameter 'n1'=4529 was found at the same position in the generator model. It will be used to compute pulls.
(n1 is the signal yield.) But since I am using the Extended() option, shouldn't the generated n1 be different for every toy experiment? I have the same problem with the background yields.

The relevant piece of code:
RooMCStudy *mcstudy = new RooMCStudy(simPdf,RooArgSet(x,y),Extended(),FitModel(simPdf_f),FitOptions(NumCPU(4),Save(kTRUE),PrintEvalErrors(0),Minos(1)));
mcstudy->generateAndFit(100);

Similar questions asked in this post:

Clarification on this issue would be really helpful.
Thanks

StephanH · January 6, 2020, 4:35pm

Hi @amroo,

this information is saved with the fit parameters. The results returned by fitParams() and fitParDataSet() contain a variable / column called "ngen". It holds the number of events generated in each run.

These could e.g. be plotted like this:

mcs.fitParDataSet().createHistogram("ngen", <numberOfBins>);

amroo · January 6, 2020, 11:49pm

Hi @StephanH, thanks for replying.
Yes, I see there is a parameter ngen in fitParDataSet().
ngen is the total number of events generated in a run.

But how do I get the generated values of the other parameters of the pdf for which the toy study is performed?
I have something like pdf = Nsig*sig_pdf + Nbkg*bkg_pdf, with ngen = Nsig + Nbkg.
I want to get the values of Nsig and Nbkg generated for every run.

Also, what is the definition of the pull used here? Is it (nsig_fit(i) - nsig_gen(i))/(sigma_nsig_fit(i))?
Is nsig_gen(i) fixed, or is it different for every i when calculating the pull?

StephanH · January 7, 2020, 9:27am

Hi,

unless fit parameters are constrained to a certain interval, the “target” values of the parameters used for generation are always the same. Only the target value ngen changes; all the rest is constant.

Then, events are randomly assigned to the two components with a probability that reflects the relative fractions of nsig and nbkg. The actual number of signal events in each dataset is unknown, but by construction really close to

nsig_gen = ngen * nsig/(nsig+nbkg)

and similar for the background.
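The description above can be mimicked with a small standalone sketch that uses only the C++ standard library. It is not RooFit's internal implementation: the struct ToyYields and the functions generateToyYields and expectedSignal are hypothetical names introduced for illustration. It assumes that ngen is drawn from a Poisson distribution with mean nsig + nbkg and that each event is then assigned to signal with probability nsig/(nsig + nbkg), which makes the per-toy signal count a binomial draw.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper, not part of RooFit: the yields of one extended toy.
struct ToyYields {
    long ngen;     // total number of generated events
    long nsigGen;  // events assigned to the signal component
    long nbkgGen;  // events assigned to the background component
};

// Draw ngen ~ Poisson(nsig + nbkg), then assign each event to signal with
// probability nsig / (nsig + nbkg), i.e. a binomial split of ngen.
// Assumption: this mirrors the generation described in the thread.
ToyYields generateToyYields(double nsig, double nbkg, std::mt19937 &rng) {
    assert(nsig >= 0.0 && nbkg >= 0.0 && nsig + nbkg > 0.0);
    std::poisson_distribution<long> poisson(nsig + nbkg);
    const long ngen = poisson(rng);
    std::binomial_distribution<long> split(ngen, nsig / (nsig + nbkg));
    const long nsigGen = split(rng);
    return {ngen, nsigGen, ngen - nsigGen};
}

// Expected signal count for a given ngen: ngen * nsig / (nsig + nbkg).
double expectedSignal(long ngen, double nsig, double nbkg) {
    return static_cast<double>(ngen) * nsig / (nsig + nbkg);
}
```

Calling generateToyYields(4529.0, 1000.0, rng) repeatedly (4529 is the n1 value quoted in the fit message; 1000 is an arbitrary background yield chosen for this example) gives an nsigGen that scatters around expectedSignal(ngen, 4529.0, 1000.0) from toy to toy, while the target yields passed in never change.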

The pull is (meas - truth)/delta(meas), and truth is fixed.
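Written out for the signal yield of toy i, the pull described above would read as follows. The symbols are notation introduced here, not RooFit names: the fitted yield and its fit uncertainty carry the toy index i, while the generator value has no index because it is the same for every toy.

```latex
\mathrm{pull}_i
  = \frac{N_{\mathrm{sig},i}^{\mathrm{fit}} - N_{\mathrm{sig}}^{\mathrm{gen}}}
         {\sigma_{N_{\mathrm{sig}},i}^{\mathrm{fit}}}
```

With the fit message quoted earlier in the thread, the generator value used for the n1f pull would be n1 = 4529 in every toy.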

amroo · January 8, 2020, 3:52am

Thanks for the detailed clarification. This answers my question.
