Number of signal & bg events and yield for comparison

Dear ROOT forum,

Suppose I have an ntuple ROOT file (signal MC) that I obtained with some preliminary cuts, and I have a RooFit fit to it. How do I get the signal yield, the background yield, and the number of signal and background events from RooFit, so that I can apply a few selection cuts and compare the values before and after applying them?

Also, based on this, can I calculate things like efficiency, purity, etc.? (I haven’t found anything useful in this regard, so any examples would be really helpful.)

Hello @Vikas_Raj,

Sorry for the very late reply and that you had to re-post! I deleted your new post because it was identical, and I am writing the answer here.

We have an example on how to fit models with signal and background events using the RooAddPdf in an extended fit:
https://root.cern.ch/doc/master/rf202__extendedmlfit_8C.html
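
For orientation, an extended fit of a two-component model maximizes a likelihood of roughly the following form. This is a sketch in generic notation rather than anything taken from the tutorial: $N_s$ and $N_b$ play the role of the yield parameters, $f_s$ and $f_b$ are assumed names for the normalized signal and background pdfs, $\theta$ stands for the shape parameters, and $N$ is the number of observed events.

```latex
\mathcal{L}(N_s, N_b, \theta)
  = \frac{e^{-(N_s + N_b)}}{N!}
    \prod_{i=1}^{N} \Bigl[ N_s \, f_s(x_i; \theta) + N_b \, f_b(x_i; \theta) \Bigr]
```

The Poisson term in front is what ties $N_s + N_b$ to the observed number of events, which is why the fitted coefficients can be read as event yields.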

But on your concrete question: I think it is a bit unusual. Normally, once you do a fit, you don’t care about efficiency and purity anymore, because the discrimination power lies in the model shapes, not in the number of events. That’s the whole point of fitting pdfs. So at this point, what you usually care about is the uncertainty in your parameters of interest directly.

But if you want to get efficiency and purity in the whole fit region anyway, you can calculate them from the post-fit yield parameters (taking the parameter names from the tutorial as an example):

// do the fit here on your dataset before the cut
model.fitTo(*data);
// save the post-fit values
double nsigVal1 = nsig.getVal();
double nbkgVal1 = nbkg.getVal();

// do the fit here on your dataset after the cut
std::unique_ptr<RooAbsData> dataReduced{data->reduce("your cut goes here")};
model.fitTo(*dataReduced);
// save the post-fit values
double nsigVal2 = nsig.getVal();
double nbkgVal2 = nbkg.getVal();

// calculate your metrics
double efficiency = nsigVal2 / nsigVal1;
// purity is the signal fraction S / (S + B), not the ratio S / B
double purity = nsigVal2 / (nsigVal2 + nbkgVal2);

This way you get the signal efficiency of the cut and the purity after the cut. But again, usually you are interested in something more specific, e.g. parameter uncertainties, so why not compare and optimize on these?
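
To make the arithmetic concrete without needing ROOT, here is a minimal sketch in plain C++. The `Yields` struct and the function names are hypothetical helpers, not RooFit API; they stand in for the post-fit values of `nsig` and `nbkg`. Purity is assumed here to mean S / (S + B).

```cpp
#include <cassert>
#include <cmath>

// Hypothetical container for post-fit yields,
// e.g. filled from nsig.getVal() and nbkg.getVal().
struct Yields {
    double nsig;
    double nbkg;
};

// Signal efficiency of a cut: fitted signal yield after the cut
// divided by the fitted signal yield before the cut.
double signalEfficiency(const Yields& before, const Yields& after) {
    assert(before.nsig > 0.0);
    return after.nsig / before.nsig;
}

// Purity, assumed here to be the signal fraction S / (S + B).
double purity(const Yields& y) {
    assert(y.nsig + y.nbkg > 0.0);
    return y.nsig / (y.nsig + y.nbkg);
}
```

For example, with made-up yields of 200 signal and 800 background events before the cut, and 150 signal and 100 background events after it, `signalEfficiency` gives 0.75 and `purity` rises from 0.2 to 0.6.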

I hope that helps, but feel free to follow up if I misunderstood your question or you still have doubts!

Cheers,
Jonas

Hmm, what I was saying is that after fitting, by using RooIntegral, I get the total number of events. Now, if I want to do a cut and count analysis I need the number of signal events out of these! So that is why I want to know the “S/sqrt(N)” before and after fitting

Hmm, what I was saying is that after fitting, by using RooIntegral, I get the total number of events.

The integral of what? If you integrate a pdf in RooFit, you get 1 by definition because pdfs are normalized.

Now, if I want to do a cut and count analysis I need the number of signal events out of these!

How did you build your model actually? Do you use the RooAddPdf to add signal and background, like in the tutorial I linked? Then the number of signal events is simply the coefficient of the signal pdf.
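
If what is needed is the number of signal events inside a counting window, one way to think about it is: fitted yield times the fraction of the (normalized) pdf that falls in the window. Below is a minimal plain C++ sketch of that idea, assuming a Gaussian signal and a flat background, both normalized over the fit range `[xmin, xmax]`. All function names are hypothetical, and the shapes are assumptions for illustration. In RooFit itself, such a fraction can, as far as I know, be obtained with `createIntegral` using `NormSet` and a named `Range`, as in the rf110 normalization tutorial.

```cpp
#include <cassert>
#include <cmath>

// Unnormalized Gaussian probability between lo and hi (integral over the real line is 1).
static double gaussProb(double mean, double sigma, double lo, double hi) {
    const double s = sigma * std::sqrt(2.0);
    return 0.5 * (std::erf((hi - mean) / s) - std::erf((lo - mean) / s));
}

// Fraction of a Gaussian pdf, normalized over the fit range [xmin, xmax],
// that lies inside the window [lo, hi].
double gaussFraction(double mean, double sigma, double lo, double hi,
                     double xmin, double xmax) {
    assert(sigma > 0.0 && xmin <= lo && lo <= hi && hi <= xmax);
    return gaussProb(mean, sigma, lo, hi) / gaussProb(mean, sigma, xmin, xmax);
}

// Fraction of a flat pdf on [xmin, xmax] that lies inside [lo, hi].
double flatFraction(double lo, double hi, double xmin, double xmax) {
    assert(xmax > xmin && xmin <= lo && lo <= hi && hi <= xmax);
    return (hi - lo) / (xmax - xmin);
}

// Cut-and-count figure of merit S / sqrt(S + B).
double significance(double s, double b) {
    assert(s + b > 0.0);
    return s / std::sqrt(s + b);
}
```

With fitted yields `nsig` and `nbkg`, the expected counts in the window would then be `S = nsig * gaussFraction(...)` and `B = nbkg * flatFraction(...)`, which can be fed into `significance(S, B)`. Integrating over the full fit range gives a fraction of 1, which is the normalization statement above.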

So that is why I want to know the “S/sqrt(N)” before and after fitting

That’s another thing I don’t understand: “the S/sqrt(N) before fitting”, what is that exactly? If S and N are parameters of the fit, they don’t even have a defined value before the fit. Or do you mean S and N from Monte Carlo simulation?

Please be more specific about what you are doing so I can give advice :) I think it would help a lot if you uploaded some of the code you have, to go with your explanations.

Cheers,
Jonas
