Different parameter values for fit with ROOT and Roofit

Emanuele94 · December 6, 2020, 4:29pm

Good evening everyone,
I performed the fit of a histogram with both ROOT and RooFit.
The function I am analyzing is the Voigt (i.e. the convolution of Gaussian and Breit-Wigner).
I noticed that with the two methods the parameters I declare are different:

With ROOT, I have the normalization, the sigma of the Gaussian, the range of the Breit-Wigner and the average value;
With RooFit, on the other hand, the mean value of the BW (which must be normalized to zero, from what I have seen), the mean value of the Gaussian, the sigma and the gamma.

The result of the first average is:

Mean = 493,598 (13)

while that of the second is:

Mean = 493.649 (209)

that is an error 10 times more important than the first.
Although the choice of bins and the integration interval are the same, it would seem that one of the two modes allows the assignment of a more precise average value.
I would like to know if it is due to the two different ways in which parameters are assigned.
Thanks in advance.

moneta · December 8, 2020, 10:35am

Hi,

I guess you are doing for both a binned likelihood fit. The observed difference is in the difference in the way the likelihood is computed in the two cases and a difference results from the fact that the approximation of using the bin center is used. In case of RooFit one observed a larger bias due to the bin centers and is reported in https://sft.its.cern.ch/jira/browse/ROOT-3874
This problem is actually solved when using the new class, RooBinSamplingPdf

Lorenzo

Emanuele94 · December 8, 2020, 2:45pm

Thanks @moneta for the answer.
I read the documentation and tried to use the construct “IntegrateBins”:

#include <RooCmdArg.h>
...

 signal.fitTo(dh, IntegrateBins(0.));
 signal.plotOn(frame,DrawOption("L"));
 frame->Draw();

...

The error message that appears to me is:

 error: use of undeclared identifier 'IntegrateBins'

Could it be due to the ROOT version I use?
I use v6.22

moneta · December 8, 2020, 2:47pm

Yes, you would need to use ROOT master for this. It is a new functionality introduced only one month ago.

Emanuele94 · December 8, 2020, 3:11pm

Excuse @moneta the pedantry but can you confirm that the version I have to download is the one released on December 1st 2020 on the site -> Release 6.22 / 06?

moneta · December 8, 2020, 3:35pm

I am sorry, IntegrateBins is not available in 6.22, but in 6.23. You will have to build it from source.
But is this really needed for you ? Increasing the number of bins, if you can, can also help

Lorenzo

Emanuele94 · December 8, 2020, 3:43pm

Sorry @moneta ,
from the report you attached it says that you can use “RooBinSampling” in two ways: one of them is with the simple IntegrateBin (Double_t precision).
From what I understand even using RooBinSampling directly I need an updated ROOT version (if I’m not mistaken, v6.24 - so says the topic you posted to me).
In any case, therefore, I have to update ROOT if I want to solve the problem and be sure that the fit parameters give me reliable values in output.
Now are you telling me that by increasing the binning, the results improve?

moneta · December 8, 2020, 4:02pm

Hi,
What I am saying is the bias of not integrating in the bin is normally rather small and it depends on the model. This bias often goes away or it is even smaller when increasing the number of bins.
Have you tried doing this ?

Lorenzo

moneta · December 8, 2020, 4:04pm

Also reading a second time your problem, I am not sure I understood what is the difference. Is it in the Mean value of the Gaussian and the resulting error on the parameter Mean ?

Emanuele94 · December 8, 2020, 6:57pm

Yes it is.
What I was wondering is why, although the conditions were the same (number of bins and integration interval), the values of the mean were different.
Although they are compatible, one of the two (the one with Roofit) has an error 10 times greater than the other.

moneta · December 9, 2020, 7:49am

I would be interesting to see why the error is 10 larger. Can you then post your code ?
Thanks
Lorenzo

Emanuele94 · December 9, 2020, 10:50am

Sure!

ROOT_CODE

//Here I define the function (I can use also the "Voigt" pdf ... It's the same) 
Double_t gauland(Double_t *x,Double_t *par) {

     Double_t risq2pi =  0.3989422804014;
     Int_t nstep = 10000;
     Double_t sc = 5.0;    
     Double_t xlow, xupp,step;
     Double_t sum;
     Double_t sigma,km,pga,norm;
 
     sum = 0; sigma=par[0]; km=par[1]; pga=par[2]; norm=par[3]; 
 
     xlow = x[0] - sc*sigma;
     xupp = x[0] + sc*sigma;
     step = (xupp-xlow)/(nstep*2.);

     for (Int_t ij=1;ij<=nstep;++ij){

      Double_t xxl = xlow + (ij-0.5)*step;
      Double_t xxc = xupp - (ij-0.5)*step;

      Double_t smear = exp(-(x[0]-xxl)*(x[0]-xxl)*0.5/(sigma*sigma))*risq2pi/sigma;  
      Double_t landa = pga*risq2pi*risq2pi/((xxl-km)*(xxl-km)+0.25*pga*pga);   
      sum += smear*landa;

      smear = exp(-(x[0]-xxc)*(x[0]-xxc)*0.5/(sigma*sigma))*risq2pi/sigma; 
      landa = pga*risq2pi*risq2pi/((xxc-km)*(xxc-km)+0.25*pga*pga);    
      sum += smear*landa;

     }

     Double_t fitval =  norm*step*sum;
     return fitval;
}
void ROOT_mass() {
  
  TChain chain("h1");
  ....//Add ".root" file to the chain 

  TH1D *hma2 = new TH1D("hma2","Mass", 200, 475., 512.);

  chain.Draw("Minvariant[0]>>hma2");
 
  TF1 *fgauland = new TF1("gauland", gauland, 475., 512.,4);
  fgauland->SetParameters(2., 493.677, 1., 1000.);
  fgauland->SetParNames ("Sigma","Mean","Pga","Norm");
 
  
  TCanvas *c3 = new TCanvas("c3", "c3");
  
  hma2->Fit("gauland","RVE","", 483., 503.);
  hma2->Draw();
  gStyle->SetOptFit(111111);
  hma2->Draw("samehisterror");
  Double_t binwidth= 37./200.; 
  Double_t ierr;
  Double_t fcha = (483. - 475.)/binwidth + 1.;
  Double_t lcha = (503. - 475.)/binwidth + 1.; 
  Int_t ndof; 
  Double_t histen = hma2->IntegralAndError(fcha,lcha,ierr);
}

and this is the second one:

ROOFIT_CODE

void RooFit_Mass() {
  
  TChain chain("h1");
  ...//Add ".root" file to the chain 

  TH1D *hma1 = new TH1D("hma1","M_invariant for 2-body decay", 200, 475., 512.);

  chain.Draw("Minvariant[0]>>hma1");
  
  RooRealVar x("x", "x", 475., 512.);
  RooDataHist dh("dh", "dh", x, Import(*hma1));

  RooPlot *frame = x.frame(Title("M_invariant for 2-body decay"));
  dh.plotOn(frame);

  RooRealVar mean("mean", "mean", 493.677, 490, 500);
  RooRealVar sg("sg", "sg", 1, 0, 10);
  RooGaussian gauss("gauss", "gauss", x, mean, sg);
 
  RooRealVar mean1("mean1", "mean1", 0, -5, 5);
  RooRealVar sbw("sbw", "sbw", 2, 0, 10); 
  RooBreitWigner bw("bw", "bw", x, mean1, sbw);
 
  RooFFTConvPdf signal("signal","landau (X) gauss", x, gauss, bw) ;

  signal.fitTo(dh, Range(483.,503.));
  signal.plotOn(frame,DrawOption("L"));
  frame->Draw();
}

moneta · December 9, 2020, 11:07am

Hi,

Thank you for the code. Can you also post please the input histogram in a ROOT file? Not needed to have the full Three used to generate the histogram if it is big

Emanuele94 · December 9, 2020, 11:40am

Sure!
Histogram: histogram.root (8.9 KB)
Programs: RooFit_MASS.C (1.9 KB) ROOT_MASS.C (3.0 KB)
“histogram.root” is histogram of the “Minvariant[0]” with the cuts already made!

As you will notice the sigmas are equivalent (both in value and in error). The average values, however, are not.
That’s why I thought it depends on the different definition of the two parameters …

moneta · December 10, 2020, 8:43pm

Hi

Thanks for the macros. I could reproduce the difference, but these are 2 different functions, one a numerical convolution and the other a FFT. Are you sure the function parameters are the same also ?
It is not a different result because using different tools, but using different methods

Lorenzo

Emanuele94 · December 11, 2020, 9:57am

Thanks @moneta ,
but if the difference were due to the fact that one is a FFT and the other a convolution, I would have had these huge differences also between the program ROOT where it is defined as FFT and the ROOT one where it is defined like this:

TF1 *fgauland = new TF1("gauland", "[0]*TMath::Voigt(x - [1], [2], [3], 4)", 475., 512.);
 fgauland->SetParameters(1000.,493.677, 2., 1.);
 fgauland->SetParNames ("Norm","Mean","Sigma","Pga");

something that, however, I have not found (always with the same parameters) - the output values are different but of the same order.
That’s why I was wondering if the difference was due to the fact that the parameters are used differently (I explain these differences in the main topic).

moneta · December 11, 2020, 11:03am

Hi,
You are saying that if you are using a Voigt function in ROOT as above and the RooFit FFT you are getting similar results ?
Then the question is why using your other function and the Voigt you have differences. I cannot understand from the code which function you are implementing there. And even if mathematically what you are doing is a Voigt function, one needs to see if it is implemented correctly

Lorenzo

Emanuele94 · December 11, 2020, 11:53am

Sorry I misinterpreted your message!
Thanks for the replies

moneta · December 11, 2020, 1:30pm

The difference is really caused by using the FFT. If you are using in the RooFit example the RooVoigtian pdf you get exactly the same result as in the ROOT example.
Increasing the FFT binning by doing x.setBins(100000, "cache"); is not helpful.

moneta · December 11, 2020, 1:58pm

I found the problem in your model in RooFit. When doing a convolution fit, you need to set the mean of the Gaussian function to zero and constant, as done in the tutorial rf208_convolution.C.
After this change you get same uncertainty in the parameters.
Here is the correct macro: RooFit_MASS.C (2.3 KB)

Ciao

Lorenzo