Computing yields from fit

Hi
Say I have a histogram and I do a fit to a Gaussian. If I were to write something like:

        RooRealVar x1("x1","x1",8,11);
        RooRealVar sigma1("sigma1","sigma1",0.1,0.05,0.2);
        RooRealVar mean1("mean1","mean1",9.46,8,11);
        RooGaussian gauss1("gauss1","gauss1",x1,mean1,sigma1);

        RooDataSet* data1 = gauss1.generate(x1,1000);

        gauss1.fitTo(*data1);

I would generate a Gaussian with 1000 events, fit my toy histogram, and get a nice result. What I cannot figure out is how to easily extract that yield, which should be 1000 events.

According to this post: Integration constants in a Breit-Wigner

if you want a signal yield, you have to add a step:

        RooRealVar n1("yield1","number of events",1000,0,10000000);
        RooExtendPdf egauss1("egauss1","extended gaussian PDF",gauss1,n1);
        egauss1.fitTo(*data1,RooFit::Extended(kTRUE)); // fit the extended PDF, not gauss1

But this is problematic, because now the fit returns the yield exactly. The uncertainties look right, but with no background term this is not really useful, because an extended ML fit simply returns the number of events you gave it.
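For reference, the "returns the count exactly" behaviour is a standard property of the extended likelihood, not something RooFit-specific. A sketch of the argument, for n observed events, expected yield ν, and a single unit-normalized PDF f(x):

```latex
\ln L(\nu) = -\nu + n \ln \nu + \sum_{i=1}^{n} \ln f(x_i)
% maximizing over the yield:
\frac{\partial \ln L}{\partial \nu} = -1 + \frac{n}{\nu} = 0
\quad\Longrightarrow\quad \hat{\nu} = n
```

With no background component, the shape terms do not depend on ν, so the fitted yield is exactly the observed event count, with (approximately) a sqrt(n) error.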

Is there no other way to get the signal yield (extended fit or not)? A PDF shape does not tell you the whole picture in a fit; there must be a normalization term used by RooFit somewhere, right?

The only solution I have found is to add an arbitrary background term. But then how close the fit yield comes to the actual yield depends on the background term, which adds another level of confusion and complexity.

To put this issue in context: I have a complicated method that produces a shape (call it G') generated from a dataset. I would like to divide the dataset into N parts, build the shape on one part, and fit that shape to the data in the remaining N-1 parts. If my method for building the shape is correct and my fitting procedure is correct, I would hope to see a unit-width Gaussian pull (where by pull I mean [fit yield - actual yield]/yield error).

But the difference in yields is always going to be artificially small with an extended ML fit, and because I had to add a background term to deal with this, the pull width becomes dependent on the background term.

Ben

Hi,

[quote]
But this is problematic, because now the fit returns the yield exactly. The uncertainties look right. But with no background term, this is not really useful at all - because an Extended ML fit returns the number of events you gave it. [/quote]

I don’t understand why this is a problem: you get back exactly the yield you put in as input.
Since you are doing an extended fit, what you have to do is generate toys whose event counts are drawn from a Poisson distribution around N. Then you will get the right distribution of (fit yield - true yield)/error.

Lorenzo