Fitting Data with user defined function

Debajyoti_Sengupta · September 20, 2018, 3:33pm

ROOT Version: 6.14.04
Platform: Windows 10
Compiler: Not Provided

Hi,
I have a Histogram that I am trying to fit with a defined function. This is the code that I am using

TF1 fit("fit", "([0]*exp(-0.5*((x-[1]))^2) * [3]*exp(-[4]*x))",1,150);
      
      fit.SetParName(0,"amp_gauss");
      fit.SetParName(1, "Mean_Gaus");
      
      fit.SetParName(3, "amp_exp");
      fit.SetParName(4, "exp_arg");


      fit.SetParameter(0,60.0);
      fit.SetParameter(1, 90.0);
    
      fit.SetParameter(3, 220.0);
      fit.SetParameter(4, 34.0);
      h1.Fit("fit");

This doesn’t fit properly. The fit values are always bad and never change beyond the pre-specified values.

These are the obtained parameters

Info in <TCanvas::MakeDefCanvas>:  created default TCanvas with name c1
 FCN=5634 FROM HESSE     STATUS=FAILED         11 CALLS         261 TOTAL
                     EDM=0    STRATEGY= 1  ERROR MATRIX UNCERTAINTY 100.0 per cent
  EXT PARAMETER                APPROXIMATE        STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  amp_gauss   -2.12315e+06   1.41421e+00   0.00000e+00   0.00000e+00
   2  Mean_Gaus    9.00000e+01   1.41421e+00   0.00000e+00   0.00000e+00
   3  p2           0.00000e+00   1.41421e+00   0.00000e+00   0.00000e+00
   4  amp_exp      2.20000e+02   1.41421e+00   0.00000e+00   0.00000e+00
   5  exp_arg      3.40000e+01   1.41421e+00   0.00000e+00   0.00000e+00

Any idea what the issue is here?

Triple_S · September 20, 2018, 7:11pm

I guess what you’re trying to do is fitting a gaussian with an underlying exponential background.
The first thing I notice is that your gaussian part is missing the sigma parameter. The actual gaussian reads [0]*exp(-0.5*((x-[1])/[2])^2). With youre formula the sigma is fixed to 1, which when I look at the plot does not match your data.
Secondly the exponential has to be added to, not multiplied with the gaussian like in your equation.
I would also suggest that you take the first few bins out of the fit range, cause your data are dropping there which cannot be described by the exponental background and thereby can confuse the fitter.
Generally if a fit fails it is always a good idea to plot the function with the inital parameter. It should describe the data already a little bit otherwise the fit is likely to fail cause the optimum is too far away from the inital values.

In your case I’m pretty sure that fixing your function will fix your fit. The function I think you want ist ([0]*exp(-0.5*((x-[1])/[2])^2) + [3]*exp(-[4]*x))
By the way this is technically the same as gaus(0) + expo(3) using the predefined functions gaus and expo.
You can get more information on declaring eqiations and the predefined ones here:
https://root.cern.ch/doc/v612/classTF1.html

Debajyoti_Sengupta · September 21, 2018, 4:59am

Hi,
I had initially kept the sigma but I was multiplying the two functions and was thus getting erroneous results. That’s why I decided to mute the parameters one by one to see if anything changes. I gave up after removing the sigma term, and that’s when I posted here.

I tried adding the two functions and removing the first twenty bins and it’s giving a decent fit.

However, I wanted to ask you a somewhat unrelated question about the histogram that I am trying to fit. I’m generating the histogram from a tree from a root file. Everytime I run the code, the peaks of the exponential and the gaussian keep changing, which is why I had trouble fitting it in the first place. The changes aren’t too drastic (the exp peaks go up and down only by 10-20 units) but I don’t know why this is happening

Triple_S · September 21, 2018, 6:20am

This sounds indeed very strange. Could cou provide more information on the code you are using to generate the tree and to extract the histogram? Is there somewhere a random factor going into like in a simulation? Are you using a self written way to fill the histogram or do you use the TTree::Project method?

Generally if you have two data sets that should be identical it is always a good idea to look at the statistical errors. Therefore call ->Sumw2() on your histogram before it is being filled and then draw it using the “E” option. Do this for both histograms you want to compare and then plot them over each other. If qbout 2/3 (1 sigma range) of all point meet within their error bars, the differences are described by statistical fluctuations.

At your maximum count rate of ~ 200 the statistical fluctuations can be estimated by sqrt(200) ~ 14 so fluctuations of 10 - 20 can be purley statistical, but only if there is a random factor going in somewhere, because otherwise the program should be deterministic.

Debajyoti_Sengupta · September 21, 2018, 8:04am

@Triple_S

I was given a root file which included a tree.
The code I use to extract and generate the hist is as follows :

void zdecay::Loop()
{

   //* Creating a vector of objects of class particles.
   // TCanvas c1("c1","Histogram here",600,800);
   cout << "Begin" << endl;
   int verbose_level = 1;
   int need = 0;

   if (fChain == 0) return;  
   // TCanvas c1("c1","c1",800,1000);

   TH1F h1("h1","Histogram",150,0,600);
   // h1.FillRandom("gaus",250000);
   // h1.Draw();


   Long64_t nentries = fChain->GetEntriesFast();
   cout<<"Total entries "<<nentries<<endl<<endl;
   Long64_t nbytes = 0, nb = 0;

   for (Long64_t jentry=0; jentry<nentries;jentry++) {
      Long64_t ientry = LoadTree(jentry);
      //if (ientry < 0) break;
      nb = fChain->GetEntry(jentry);
      nbytes += nb;
      // cout << "processing event "<<jentry <<endl;
      vector<particles> electrons;
      // cout<<"event no "<<jentry<<" eta of first electron "<<nEle<<endl;
      
      
      // Acquisition loop. If the event has at least one electron (or particle) then it makes an electron object
      // which is inserted into the vector - electrons

      if(nEle>0) {
         for (int i=0 ;i<nEle; i++){
            particles electron(elePhi -> at(i),eleEta->at(i),elePt->at(i),eleCharge->at(i));
            electrons.push_back(electron);
         }
         std::sort(electrons.begin(),electrons.end(), compare_trans);

      // Contructing the four vector for the first two particles with highest Pt - Summing them in V3 for display.
      TLorentzVector v1(electrons[0].getPx(),electrons[0].getPy(),electrons[0].getPz(),electrons[0].getE());
      TLorentzVector v2(electrons[1].getPx(),electrons[1].getPy(),electrons[1].getPz(),electrons[1].getE());
      TLorentzVector v3 = v1+v2;
      // if (v3.M()>60.0 & v3.M() <140.0)
      h1.Fill(v3.M());

Where I have defined a class particles separately.

Triple_S · September 21, 2018, 8:17am

Well your code does not explain the fluctuations. I would say it is not really important because they are most probably covered by the statistical errors, but it’s definitly still a strange behavior. Maybe there is a bug in the sorting algorithem for the particles vector, but I can only guess.

Debajyoti_Sengupta · September 21, 2018, 8:26am

I am using a function called compare_trans for the sorting. It’s defined as follows

bool compare_trans(particles particle1, particles particle2){
    return particle1.getPt() > particle2.getPt();
   }

Wile_E_Coyote · September 23, 2018, 10:53am

Try with:

TLorentzVector v2; if (nEle > 1) v2.SetXYZT(electrons[1].getPx(),electrons[1].getPy(),electrons[1].getPz(),electrons[1].getE());

Also, you could “improve” something else a bit:

bool compare_trans(const particles &particle1, const particles &particle2) {

BTW. Looking at the plot in your first post here, it seems to me that you should define the range of your “fit” function from about 30 (instead of 1) to 150. Moreover, it seems to me that you should add the “gaussian” and the “exponential”, not multiply them. Also, do remember to use the “R” fitting option (otherwise the range of your “fit” function will not be taken in to account): h1.Fit("fit", "R");

You may also try something like this:

TF1 fit("fit", "[0]*exp(-0.5*pow((x-[1])/[2],2)) + [3]*exp(-[4]*x)", 0., 1000.);
fit.SetNpx(10000);
fit.SetParNames("amp_gauss", "Mean_Gaus", "sigma_Gaus", "amp_exp", "exp_arg");
fit.SetParameters(110.0, 90.0, 10.0, 220.0, 0.01);
// fit.Draw();

h1.Fit("fit", "", "", 30., 150.);

Debajyoti_Sengupta · September 23, 2018, 3:31pm

Hello @Wile_E_Coyote,

The issue of changing peaks has been resolved.
I changed

if(nEle>1) {
         for (int i=0 ;i<nEle; i++){
            particles electron(elePhi -> at(i),eleEta->at(i),elePt->at(i),eleCharge->at(i));
            electrons.push_back(electron);
         }
         std::sort(electrons.begin(),electrons.end(), compare_trans);

which was nELe > 0 before. How was this interfering though?

This results in an error.

In file included from input_line_10:1:
D:\debajyoti_project\zdecay.C:78:12: error: member function 'getPt' not viable: 'this' argument has type 'const particles', but function is not marked const
    return particle1.getPt() > particle2.getPt();
           ^~~~~~~~~
D:\debajyoti_project\zdecay.C:36:11: note: 'getPt' declared here
   double getPt(){
          ^
D:\debajyoti_project\zdecay.C:78:32: error: member function 'getPt' not viable: 'this' argument has type 'const particles', but function is not marked const
    return particle1.getPt() > particle2.getPt();
                               ^~~~~~~~~
D:\debajyoti_project\zdecay.C:36:11: note: 'getPt' declared here
   double getPt(){

which is why I think I opted for omitting const in the first place iirc.

Wile_E_Coyote:

You may also try something like this:

TF1 fit("fit", "[0]*exp(-0.5*pow((x-[1])/[2],2)) + [3]*exp(-[4]*x)", 0., 1000.);
fit.SetNpx(10000);
fit.SetParNames("amp_gauss", "Mean_Gaus", "sigma_Gaus", "amp_exp", "exp_arg");
fit.SetParameters(110.0, 90.0, 10.0, 220.0, 0.01);
// fit.Draw();

h1.Fit("fit", "", "", 30., 150.);

Thank you for this. I was writing ugly code really.

Wile_E_Coyote · September 23, 2018, 3:55pm

If you get problems with the “const” then remove it but leave the “reference”:

bool compare_trans(particles &particle1, particles &particle2) {

Requiring nEle>1 in the beginning will also solve your problem. My proposal was to leave nEle>0 and then set nonzero “v2” when there is the second electron present. This would add cases with one electron only.

Debajyoti_Sengupta · September 23, 2018, 4:00pm

Thanks @Wile_E_Coyote,

The Histogram just won’t show up during runtime. I always have to save it to a file and then open it with TBrowser in the interactive session. Any fix for this?

Wile_E_Coyote · September 23, 2018, 4:03pm

I assume it gets deleted when your macro finishes.
If you want to “preserve” it, you would need to create it (and probably your fit function, too) on the heap (i.e. using “new”).

Debajyoti_Sengupta · September 23, 2018, 4:08pm

Working as expected now, thanks a lot for your help
@Wile_E_Coyote and @Triple_S

Debajyoti_Sengupta · October 5, 2018, 10:35am

@Wile_E_Coyote @Triple_S

I tried to substitute the gaussian for a crystalball function with the following code

TF1 *fit = new TF1("fit", "crystalball(0) + expo(5))", 0., 1000.);
      fit -> SetNpx(10000);
      
      fit ->SetParNames("amp_cb", "Mean_cb", "sigma_cb","tail_cb","curve_tail_cb", "amp_exp", "exp_arg");
      fit ->SetParameters(110.0, 90.0, 10.0, 1.0, 1.0, 220.0, 0.01);
      // fit.Draw();

      h1->Fit("fit", "", "", 18., 150.);

However this gives me a bad fit.

 FCN=3287 FROM HESSE     STATUS=FAILED         11 CALLS        1538 TOTAL
                     EDM=0    STRATEGY= 1  ERROR MATRIX UNCERTAINTY 100.0 per cent
  EXT PARAMETER                APPROXIMATE        STEP         FIRST
  NO.   NAME      VALUE            ERROR          SIZE      DERIVATIVE
   1  amp_cb      -1.99865e+60   4.66690e+01   6.49572e+59   0.00000e+00
   2  Mean_cb      9.00000e+01   1.41421e+00  -0.00000e+00   0.00000e+00
   3  sigma_cb    -1.51776e+74   1.41421e+00  -1.51776e+74   0.00000e+00
   4  tail_cb      1.00000e+00   1.41421e+00  -0.00000e+00   0.00000e+00
   5  curve_tail_cb   6.02877e+00   1.41421e+00 -7.04316e-115   0.00000e+00
   6  amp_exp    -2.86314e+113   1.41421e+00 -2.86314e+113   0.00000e+00
   7  exp_arg    -5.19102e+114   1.41421e+00 -5.19102e+114   0.00000e+00

I understand that the first three parameters of the crystalball function is amplitude,mean and sigma. I am having trouble figuring out a good starting point for the 4th and 5th parameter.
Help.

Wile_E_Coyote · October 5, 2018, 11:10am

"crystalball" = "[Constant] * ROOT::Math::crystalball_function(x, [Alpha], [N], [Sigma], [Mean])"

Wikipedia → Crystal Ball function ;

ROOT::Math::crystalball_function ;

${ROOTSYS}/tutorials/math/CrystalBall.C ;
${ROOTSYS}/tutorials/fit/fitNormSum.C ;
${ROOTSYS}/tutorials/fit/fitNormSum.py

Debajyoti_Sengupta · October 5, 2018, 4:11pm

Hi, could you be a little more specific as to the changes I need to make in my code?

should I replace

TF1 *fit = new TF1("fit", "crystalball(0) + expo(5)", 0., 1000.);

with

TF1 *fit = new TF1("fit", "[Constant] * ROOT::Math::crystalball_function(x, [Alpha], [N], [Sigma], [Mean]) + [3]*exp(-[4]*x ", 0., 1000.);
fit ->SetParameters(constant,alpha,n,sigma,mean,3,4);

Wile_E_Coyote · October 5, 2018, 4:51pm

You can use it this way, if you want (just note the alphabetic ordering of parameters):

TF1 *fit = new TF1("fit", "[Constant_1] * ROOT::Math::crystalball_function(x, [Alpha], [N], [Sigma], [Mean]) + TMath::Exp([Constant_2] + [Slope] * x)", 0., 1000.);
fit->Print();
for (Int_t i = 0; i < fit->GetNpar(); i++) std::cout << i << " : " << fit->GetParName(i) << std::endl;

Compare it to:

TF1 *fit = new TF1("fit", "crystalball(0) + expo(5)", 0., 1000.);
fit->Print();
for (Int_t i = 0; i < fit->GetNpar(); i++) std::cout << i << " : " << fit->GetParName(i) << std::endl;

system · October 19, 2018, 4:51pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.