DCB fit for a BSM dijet signal

Hello experts,

I am working on an NMSSM analysis where a Y Higgs (90-800 GeV) is fitted using a double-sided crystal ball (DCB) function. At low masses this function works fine, but as I go to higher masses the fits get noticeably worse. These are the parameter ranges and the function I have used (for Y = 800 GeV):

Mjj_sig_m0_cat0[740, 700, 800];
Mjj_sig_sigma_cat0[10.0, 1.0, 60.0];
Mjj_sig_alpha1_cat0[1.0, 0.05, 15.0];
Mjj_sig_n1_cat0[2.0, 0.1, 15.0];
Mjj_sig_alpha2_cat0[1.0, 0.05, 10.0];
Mjj_sig_n2_cat0[2.0, 0.1, 10.0];

MjjSig_cat0 = RooDoubleCB(Mjj, Mjj_sig_m0_cat0, Mjj_sig_sigma_cat0, Mjj_sig_alpha1_cat0, Mjj_sig_n1_cat0, Mjj_sig_alpha2_cat0, Mjj_sig_n2_cat0);
fit_NMSSMX1000ToY800H125_2016_AllYears_DoubleHTag_0.pdf (17.8 KB) fit_NMSSMX1000ToY800H125_2016_AllYears_DoubleHTag_1.pdf (20.7 KB) fit_NMSSMX1000ToY800H125_2016_AllYears_DoubleHTag_2.pdf (20.7 KB)

I am attaching plots that show the signal fits for the different categories. Could you please help me improve them? Currently, they do not describe the simulation well.

Thanks,

Lata

Hi @lata,

in particular when I look at your DoubleHTag_0.pdf plot, it seems to me that the crystal ball shape is not appropriate to model your simulation. The crystal ball shape has a Gaussian core, and in your simulation there is a sort of double-peak structure that can’t be fitted by the crystal ball shape. And in DoubleHTag_2.pdf, one can observe a kind of “shoulder” on the left side that can’t be modeled with the crystal ball either.

I think the problem here is not your fit starting point, but that the crystal ball is not flexible enough to model your shape. Maybe you can try to fit all your categories together to get a lower stat uncertainty, and see if the crystal ball shape is really appropriate? Perhaps it’s necessary to add a second Gaussian component to model the shoulder on the left side or double-peak structure in the center. Or maybe you will find that the double-peak was just a statistical fluctuation that goes away when merging all categories.

Once you have found a good function for the whole dataset, you can go ahead and use it to fit your categories. That’s at least how I would proceed.

I hope this is useful to you, let me know if you have further questions!

Cheers,
Jonas

Coincidentally, there was just another question in the forum where the user did something similar to what I suggested here: fitting a crystal ball plus a Gaussian. Maybe it’s useful for you to follow that thread.

Thanks, @jonas! I will try with your suggestions and let you know how my signal fit looks.

Hello @jonas!
Sorry for my late reply.
As you suggested, I tried a CB+Gauss fit. I am attaching my notebook and workspace file at [1]. I have written a small code snippet to test the CB+Gauss function for signal modelling. This is the first time I am trying to use the convolution of two signal functions in RooFit. First I fitted only the CB function and then the CB+Gauss function, but the latter doesn’t seem to work properly.

Could you please have a look?

Thanks,

Lata

[1] ROOFIT – Google Drive

Hi @lata,

thanks for sharing the notebook!

The convolution is not working because of the parameter boundaries you chose. You chose both the CB and the Gaussian to have a mean around 700, but if you really intend to do a convolution, the Gaussian is expected to have a mean around zero. Otherwise, your convolution will have a peak that is way off to the right, around 1400, which is why you get this rising function that doesn’t fit the data as a fit result.

But in any case, I was not referring to a convolution of a CB with a Gaussian, but to adding the Gaussian to the CB with RooAddPdf to fit the double-peak/shoulder structure. The convolution of a CB with a Gaussian is actually not very meaningful, because the CB has a Gaussian core and the convolution of a Gaussian with a Gaussian is also a Gaussian. So with your convolution shape you get a slightly different shape for the tails, but I think that’s not what you need.
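
To make both points concrete, here is a small numerical check in plain Python (no ROOT needed). It is only a sketch under assumptions: the function names and the widths 20 and 30 are made up for illustration, and two pure Gaussians stand in for the CB core and the second Gaussian. It checks that the means add under convolution (700 + 700 gives 1400) and that the widths add in quadrature.

```python
import math


def gauss(x, mean, sigma):
    """Normalised Gaussian density."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def convolved_moments(mean1, sigma1, mean2, sigma2, step=1.0):
    """Convolve two Gaussians on a grid and return (mean, sigma) of the result."""
    # Integration grid for the first Gaussian: +-6 sigma around its mean.
    n_x = int(12 * sigma1 / step)
    xs = [mean1 - 6 * sigma1 + i * step for i in range(n_x + 1)]
    fs = [gauss(x, mean1, sigma1) for x in xs]
    # Evaluation grid for the result, placed where the convolved shape lives.
    half_width = 6 * (sigma1 + sigma2)
    centre = mean1 + mean2
    n_z = int(2 * half_width / step)
    zs = [centre - half_width + i * step for i in range(n_z + 1)]
    # h(z) = integral f(x) g(z - x) dx, approximated by a Riemann sum.
    hs = [sum(f * gauss(z - x, mean2, sigma2) for x, f in zip(xs, fs)) * step for z in zs]
    norm = sum(hs)
    mean = sum(z * h for z, h in zip(zs, hs)) / norm
    var = sum((z - mean) ** 2 * h for z, h in zip(zs, hs)) / norm
    return mean, math.sqrt(var)


# Both shapes centred at 700: the convolved peak ends up near 1400.
shifted_mean, shifted_sigma = convolved_moments(700.0, 20.0, 700.0, 30.0)
# Second Gaussian centred at 0: the peak stays near 700, only the width grows.
kept_mean, kept_sigma = convolved_moments(700.0, 20.0, 0.0, 30.0)
print(shifted_mean, shifted_sigma)
print(kept_mean, kept_sigma)
```

In both cases the width comes out close to sqrt(20^2 + 30^2), which is why convolving a Gaussian core with another Gaussian mainly gives a wider core rather than a new structure such as a shoulder.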

I hope this helps, let me know if the fit of the CB + Gaussian works out!

In any case, don’t hesitate to ask further questions,
Jonas

Hello @jonas ! you mean to say I should try with
gausscball = r.RooAddPdf("gaussPcball","Gauss(+)cball",r.RooArgList(cball,gauss))

with gaussian parameters
mean = r.RooRealVar("mean","mean of gaussian",0.1, 0, 5)
sigma = r.RooRealVar("sigma","width of gaussian",1,1.0,10)

Am I right?

Hi!

No, leave the Gaussian parameters as you had in the notebook and only change the convolution class to RooAddPdf. And keep in mind that RooAddPdf takes an additional parameter list with the coefficients. Giving one coefficient is enough, the other will automatically be set to 1 - frac:

frac = r.RooRealVar("frac", "frac", 0.5, 0.0, 1.0)
gausscball = r.RooAddPdf("gaussPcball","Gauss(+)cball",
                         r.RooArgList(cball,gauss),
                         r.RooArgList(frac))

The thing with changing the parameters of the Gaussian was only to explain why the convolution didn’t work as you tried to do it :slight_smile:
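
For illustration, this is roughly what such a sum evaluates to, written in plain Python without ROOT. It is a sketch under assumptions: dcb_shape uses the commonly quoted double-sided crystal ball definition (which may differ in detail from the RooDoubleCB class used above), the helper names are invented, and all parameter values and the range 500 to 1000 are hypothetical. Each component is normalised on the fit range, then combined as frac * cball + (1 - frac) * gauss.

```python
import math


def dcb_shape(x, m0, sigma, alpha1, n1, alpha2, n2):
    """Unnormalised double-sided crystal ball: Gaussian core, power-law tails."""
    t = (x - m0) / sigma
    if t < -alpha1:
        # Left tail, matched to the Gaussian core at t = -alpha1.
        a = (n1 / alpha1) ** n1 * math.exp(-0.5 * alpha1 ** 2)
        b = n1 / alpha1 - alpha1
        return a * (b - t) ** (-n1)
    if t > alpha2:
        # Right tail, matched to the Gaussian core at t = +alpha2.
        a = (n2 / alpha2) ** n2 * math.exp(-0.5 * alpha2 ** 2)
        b = n2 / alpha2 - alpha2
        return a * (b + t) ** (-n2)
    return math.exp(-0.5 * t * t)


def gauss_shape(x, mean, sigma):
    """Unnormalised Gaussian."""
    return math.exp(-0.5 * ((x - mean) / sigma) ** 2)


def integrate(func, lo, hi, nsteps=5000):
    """Midpoint-rule integral of func over [lo, hi]."""
    step = (hi - lo) / nsteps
    return sum(func(lo + (i + 0.5) * step) for i in range(nsteps)) * step


def normalise(shape, lo, hi):
    """Turn a shape into a density normalised on the fit range [lo, hi]."""
    integral = integrate(shape, lo, hi)
    return lambda x: shape(x) / integral


def add_pdf(frac, pdf1, pdf2):
    """Two-component sum with a single coefficient: the second gets 1 - frac."""
    return lambda x: frac * pdf1(x) + (1.0 - frac) * pdf2(x)


# Hypothetical fit range and parameter values, for illustration only.
LO, HI = 500.0, 1000.0
cball = normalise(lambda x: dcb_shape(x, 760.0, 25.0, 1.0, 2.0, 1.5, 3.0), LO, HI)
gauss = normalise(lambda x: gauss_shape(x, 700.0, 60.0), LO, HI)
model = add_pdf(0.7, cball, gauss)

print(integrate(model, LO, HI))  # the sum stays normalised on the range
```

Because both components are normalised on the same range, the sum is normalised for any frac between 0 and 1, which is why a single coefficient is enough for two components.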

Have a nice day and cheers!
Jonas

Hello @jonas !

Thanks, it seems to work fine. If you open the same notebook, you can find the plots with CB+Gauss fit. Let me know if you think I need to tune the parameters more.

Moreover, I have added one more notebook named “CB_Gaus_fit-Copy1.ipynb” to the same Google Drive folder, where I model the signal using the combined 2016, 2017 and 2018 simulations. This is just to see whether the same signal model works for fits with more statistics. What I notice is that with the CB fit the tail region is described well, while with the CB+Gauss fit the peak region is described well.

If I try to tune the parameters further, the fit gets worse :frowning: . Do you know some way to fix this?

Thanks,

Lata

Hi @lata!

Thanks for uploading the notebook and the data, this makes it much easier to see what’s going on!

Good to see that the Gaussian+CB fit works well. I found two problems in the notebook that might be the cause of your fit getting worse:

  1. The ranges of the parameters mean and sigma for your Gaussian are too restrictive. The fitted values for these parameters ended up very close to the limits given in the variable definitions. The fit gets much better when you define larger ranges, for example:
mean = r.RooRealVar("mean","mean of gaussian",761, 600, 900)
sigma = r.RooRealVar("sigma","width of gaussian",100,1.0,200)
  2. You fit two models one after another (first cball and then gausscball), and only afterwards plot them one after another. The problem is that the models share some parameters, so the second fit overwrites the best-fit values of the first fit, and the first plot (on the left) will then be wrong. It is best to plot each model right after fitting it so you don’t run into problems like this (see my Jupyter notebook).
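
The second point can be illustrated with a toy in plain Python. This is not RooFit code: the Param class, the toy_fit function and all numbers are invented for illustration. It only shows the mechanism, namely that two models holding the same parameter objects see the values of whichever fit ran last, unless the values are copied (or the plot is made) right after each fit.

```python
class Param:
    """Toy stand-in for a fit parameter object shared between models."""

    def __init__(self, name, value):
        self.name = name
        self.value = value


def toy_fit(params, best_values):
    """Pretend fit: write 'best-fit' values into the shared parameter objects
    and return a copy of them, so they survive later fits."""
    for par in params:
        par.value = best_values[par.name]
    return {par.name: par.value for par in params}


# All numbers below are invented for illustration.
m0 = Param("m0", 740.0)
sigma = Param("sigma", 10.0)
mean = Param("mean", 761.0)

cball_params = [m0, sigma]             # CB-only model
gausscball_params = [m0, sigma, mean]  # Gauss+CB model reuses m0 and sigma

cball_result = toy_fit(cball_params, {"m0": 752.0, "sigma": 31.0})
gausscball_result = toy_fit(gausscball_params, {"m0": 758.0, "sigma": 24.0, "mean": 730.0})

# The live objects now hold the values of the second fit ...
print(m0.value, sigma.value)
# ... while the copy taken right after the first fit is unchanged.
print(cball_result)
```

In RooFit, plotting right after each fit has the same effect as using the copy here; keeping the RooFitResult returned when fitting with the Save option is another way to hold on to the first fit's values.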

Here is the notebook where I changed these two things so you can see what I mean:
CB_Gaus_fit-Copy1-jonasedit.py (40.3 KB)

The forum didn’t let me upload files with the ipynb suffix. Please rename the file to CB_Gaus_fit-Copy1-jonasedit.ipynb before opening the notebook.

With only the CB, I get a reduced chi-square of 22.55, and with Gauss+CB I get 6.28, so it’s already much better! The tails look perfect, but maybe you will still have to do something about the peak (depending on the physics behind your distribution).
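
For reference, a reduced chi-square for a binned comparison is usually computed as sketched below. This is a generic plain-Python version with an invented function name and made-up numbers; it is an assumption that it matches how the values quoted above were obtained (with a RooPlot, the chiSquare method, which takes the number of fitted parameters, is the usual route).

```python
def reduced_chi2(observed, errors, expected, n_fit_params):
    """Chi-square per degree of freedom for binned data versus a fitted curve.

    observed, errors: bin contents and their uncertainties
    expected: curve value in each bin
    n_fit_params: number of floating parameters in the fit
    """
    chi2 = 0.0
    n_bins_used = 0
    for obs, err, exp in zip(observed, errors, expected):
        if err <= 0.0:
            continue  # skip bins without a usable uncertainty
        chi2 += ((obs - exp) / err) ** 2
        n_bins_used += 1
    ndf = n_bins_used - n_fit_params
    if ndf <= 0:
        raise ValueError("not enough bins for the number of fit parameters")
    return chi2 / ndf


# Made-up numbers, only to show the call.
example = reduced_chi2(
    observed=[10.0, 20.0, 30.0],
    errors=[1.0, 2.0, 3.0],
    expected=[11.0, 18.0, 30.0],
    n_fit_params=1,
)
print(example)
```

Note that the Gauss+CB model has more floating parameters than the CB alone, so the number of degrees of freedom differs between the two values being compared.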

I hope this helped with your fit, please let me know if you have further questions!
Jonas

thanks, @jonas for providing all the help and suggestions. It seems to work fine now. :slight_smile:
