Strange results in rs301_splot.C?

Dear experts,
I’m running rs301_splot.C with high statistics (100 times the sample size of the original example) and see a disagreement at the low end of the QCD isolation distribution (see the bottom plot in the figure below). Is this expected?

The script I used is rs301_splot.C (11.9 KB)

Thanks!

I am not sure it is related, but in many places in the log output you get NaN and Inf, for instance:

     getLogVal() top-level p.d.f evaluates to NaN @ !refCoefNorm=(), !pdfs=(zModel = 1.01611e-69/1,qcdModel = nan/1), !coefficients=(zYield = 50000,qcdYield = 100000)
     getLogVal() top-level p.d.f evaluates to NaN @ !refCoefNorm=(), !pdfs=(zModel = 3.76777e-91/1,qcdModel = nan/1), !coefficients=(zYield = 50000,qcdYield = 100000)

@moneta might clarify.

Thanks for the reply. The fit converges in the end and the final values are consistent with the input, so I guess it’s not due to the fit. I also tried running the sPlot without the fit, in which case the signal and background yields stay at their input values, and a similar discrepancy is observed.

Ok, in that case @moneta will know better.

Hi @Dongliang,

excellent, thank you very much for spotting this! Very good call :+1:

The tutorial is actually using the SPlot class incorrectly. It calculates sWeights for the isolation based on both the invariant mass and the isolation itself. That’s wrong: for sPlots, the control variable must not be in the set of discriminating variables used in the likelihood fit! See https://arxiv.org/pdf/physics/0402083.pdf.
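For reference, here is the sWeight definition as I recall it from the paper linked above, written for this example. The notation is an assumption on my part: $y$ is the discriminating variable (here the invariant mass), $f_k(y)$ is the PDF of species $k$ (Z or QCD) in $y$ only, $N_k$ are the fitted yields, and $e$ runs over the events.

```latex
{}_s\mathcal{P}_n(y_e)
  = \frac{\sum_{j} \mathbf{V}_{nj}\, f_j(y_e)}{\sum_{k} N_k\, f_k(y_e)},
\qquad
\mathbf{V}^{-1}_{nj}
  = \sum_{e} \frac{f_n(y_e)\, f_j(y_e)}{\left(\sum_{k} N_k\, f_k(y_e)\right)^{2}}
```

The control variable (the isolation) appears nowhere in these expressions. The weights are only meaningful for plotting a variable that is left out of the $f_k$ and is uncorrelated with $y$.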

That means that when using the SPlot class, you should exclude the isolation variable from the dataset that you pass. In the tutorial, this would look like this:

   RooRealVar *invMass = ws->var("invMass");

   // Now we use the SPlot class to add SWeights for the isolation variable to
   // our data set based on fitting the yields to the invariant mass variable
   std::unique_ptr<RooDataSet> dataInvMass{static_cast<RooDataSet*>(data->reduce(RooFit::SelectVars(*invMass)))};
   RooStats::SPlot *sData = new RooStats::SPlot("sData", "An SPlot", *dataInvMass, model, RooArgList(*zYield, *qcdYield));
   // Merge the sWeights that are now added to dataInvMass to the dataset in the workspace
   data->merge(dataInvMass.get());

Here is the fixed tutorial file: rs301_splot.C (12.2 KB)

The plot now looks correct:

I will proceed fixing the tutorial in the ROOT repository:

Thanks again for reporting this!

Cheers,
Jonas

Hi Jonas,
Thanks a lot for the investigation and the fix. You mentioned that “the control variable should not be in the set of discriminating variables for the likelihood fit”, so I’m trying to use only the invMass variable in the fit PDF instead of removing the isolation variable from the dataset. Will this give me correct results? The plots from my test script seem reasonable (the test script is rs301_splot.C (12.0 KB), and the output plot is below).

Dongliang

Yes, that’s equivalent. And actually I like your approach of changing the model instead of the data better, because it doesn’t involve creating a new copy of the data. I now do the same as you in the PR.

Additional note: you can get rid of the model1.fitTo() call in your script, as it is redundant. The SPlot class does this internally.
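To illustrate why changing the model is enough, here is a minimal standalone sketch of an sWeight calculation that reads only the mass of each event, even though the isolation stays in the dataset. It does not use RooFit and is not what the SPlot class does internally. The `Event` struct, the two toy shapes `zMassPdf` and `qcdMassPdf`, and `computeSWeights` are all hypothetical names made up for this example, and the yields are assumed to come from a separate fit to the mass.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// One event: the mass is the discriminating variable, the isolation is the
// control variable. The isolation is carried along but never read below.
struct Event {
    double mass;
    double isolation;
};

// Hypothetical toy mass shapes (assumptions, not the tutorial's PDFs).
double zMassPdf(double m) {
    const double kPi = 3.14159265358979323846;
    const double mean = 91.0, sigma = 2.0;
    const double t = (m - mean) / sigma;
    return std::exp(-0.5 * t * t) / (sigma * std::sqrt(2.0 * kPi));
}

double qcdMassPdf(double m) {
    const double slope = 0.01;
    return m >= 0.0 ? slope * std::exp(-slope * m) : 0.0;
}

struct SWeights {
    std::vector<double> z;    // per-event weight for the Z species
    std::vector<double> qcd;  // per-event weight for the QCD species
};

// Compute two-species sWeights from the mass-only PDFs and the given yields.
SWeights computeSWeights(const std::vector<Event>& events, double nZ, double nQcd) {
    assert(!events.empty());

    // Inverse covariance matrix [[a, b], [b, c]]:
    // sum over events of f_n * f_j / (sum_k N_k f_k)^2.
    double a = 0.0, b = 0.0, c = 0.0;
    for (const Event& ev : events) {
        const double fz = zMassPdf(ev.mass);
        const double fq = qcdMassPdf(ev.mass);
        const double d = nZ * fz + nQcd * fq;
        a += fz * fz / (d * d);
        b += fz * fq / (d * d);
        c += fq * fq / (d * d);
    }

    // Invert the symmetric 2x2 matrix to get the covariance matrix V.
    const double det = a * c - b * b;
    assert(det != 0.0);
    const double vzz = c / det, vzq = -b / det, vqq = a / det;

    SWeights w;
    for (const Event& ev : events) {
        const double fz = zMassPdf(ev.mass);
        const double fq = qcdMassPdf(ev.mass);
        const double d = nZ * fz + nQcd * fq;
        w.z.push_back((vzz * fz + vzq * fq) / d);
        w.qcd.push_back((vzq * fz + vqq * fq) / d);
    }
    return w;
}
```

Filling a histogram of `isolation` with `w.z` or `w.qcd` as per-event weights would then give the sPlot for that species. Since `computeSWeights` never touches `isolation`, it does not matter whether that column is present in the data or not, which is the equivalence discussed above.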

Great! Thanks a lot for the confirmation and the helpful note.
