Strange behaviour of 2dimensional ROOT fit

giacbold · June 21, 2019, 9:10am

Hello everyone.
This might be a stupid question but i can’t find the answer myself.
I’m asked to generate some montecarlo data according to a distribution in two angles (theta = t, phi = p). This distribution has 3 parameters known (a = 0.65, b = 0.06, g = -0.18). The resulting histogram should then be fitted with maximum likelihood method and chi-squared. The problem is that the parameters retrieved from the fit are far from the true ones (even when genereting something like 10^9 events). Still if i plot the original function (with true parameters), the function and the histogram of the montecarlo events overlap perfectly as i expect to happen. Is there something i’m missing in the fitting procedure that could lead to this situation?
Thanks everyone, the simple code is the following:

#include <iostream>
#include "TF2.h"
#include "TH2F.h"
#include "TCanvas.h"
#include "TRandom3.h"
#include <cmath>

struct estimates{ //struct to contain result of fit
    double a_est;
    double b_est;
    double g_est;
};

double to_fit(double *x, double *par){ //to be passed to the fitting routine
    
    return (3./(4.*M_PI))*(0.5*(1.-par[0]) + (0.5)*(3.*par[0]-1)*cos(x[0])*cos(x[0]) - par[1]*sin(x[0])*sin(x[0])*cos(2.*x[1])- sqrt(2.)*par[2]*sin(2.*x[0])*cos(x[1]));
}

double evaluate(double t, double p){ // this evaluates the distribution at theta and phi

    double a = 0.65;
    double b = 0.06;
    double g = -0.18;

    return (3./(4.*M_PI))*(0.5*(1.-a) + (0.5)*(3.*a-1)*cos(t)*cos(t) - b*sin(t)*sin(t)*cos(2.*p)- sqrt(2.)*g*sin(2.*t)*cos(p));
}

estimates monte_carlo(int measures, int seed ){
    
    std::vector<double> theta_vec;
    std::vector<double> phi_vec;

    TRandom3* gen = new TRandom3(seed);
    TH2F* h = new TH2F("h", "h", 300, 0, 2*M_PI, 300, 0, 2*M_PI);

    for(int i = 0; i < measures; i++){ //rejection method to build distribution

        bool it = true; 

        while(it){

            double theta = gen->Uniform(0,2*M_PI);
            double phi = gen->Uniform(0, 2*M_PI);
            double y = gen->Uniform(0,1);

            if(y < evaluate(theta, phi)){
                theta_vec.push_back(theta);
                phi_vec.push_back(phi);
                h->Fill(theta, phi);
                it = false;
            }
        }
    }

    const int npar = 3;
    double dist_params[npar] = {0.65, 0.06, -0.18};

    TF2* fit = new TF2("fit", to_fit, 0, 2*M_PI, 0, 2*M_PI, npar);
    fit->SetParameters(dist_params);
    fit->SetParNames("#alpha", "#beta", "#gamma");

    
    h->Fit(fit, "LL");
    estimates est;
    est.a_est = fit->GetParameter(0);
    est.b_est = fit->GetParameter(1);
    est.g_est = fit->GetParameter(2);

    //TCanvas* c = new TCanvas("c", "c", 1000,1000, 1000,800);
    //h->Draw("colz");
    //fit->Draw("same cont1");
    //c->Draw();
    //c->SaveAs("./miao.png", "png");

    return est;
}

int main(){

    srand(time(NULL));

    int experiments = 100;
    int measures = 50000;

    //this is to check CLT on parameters.
    
    TH1F* h_a = new TH1F("h_a", "h_a", 10, 0.75, 0.92);
    TH1F* h_b = new TH1F("h_b", "h_b", 10, 0, 0.3);
    TH1F* h_g = new TH1F("h_g", "h_g", 10, -0.2, 0);

    for(int i = 0; i < experiments; i++){
        if(i%10 == 0 ){
            std::cout << i << "\n";
        }
        int seed = rand();
        estimates est =  monte_carlo(measures, seed );
        h_a->Fill(est.a_est);
        h_b->Fill(est.b_est);
        h_g->Fill(est.g_est);

    }

    TCanvas*c = new TCanvas("c", "c", 1000,1000, 1000, 800);
    h_a->Draw("hist");
    c->Draw();
    c->SaveAs("./p.png", "png");
    
    

    return 0;;
    
}

Wile_E_Coyote · June 21, 2019, 11:20am

Your “fit” / “to_fit” functions are not able to describe your histogram.
Try something like this (though I’m not sure if it’s sufficient):

// ...
double to_fit(double *x, double *par){ //to be passed to the fitting routine
  return par[3]*(3./(4.*M_PI))*(0.5*(1.-par[0]) + (0.5)*(3.*par[0]-1)*cos(x[0])*cos(x[0]) - par[1]*sin(x[0])*sin(x[0])*cos(2.*x[1])- sqrt(2.)*par[2]*sin(2.*x[0])*cos(x[1]));
}
// ...
    // ...
    const int npar = 4;
    double dist_params[npar] = {0.65, 0.06, -0.18, 1.};
    dist_params[3] = h->Integral("width") / 4.; // "scaling" estimation
    // ...
    fit->SetParNames("#alpha", "#beta", "#gamma", "scaling");
    // ...

giacbold · June 22, 2019, 4:26pm

Oh yeah. That worked. This was a pretty stupid question, height of the distribution depends on the number of the sample monte carlo produced and in general does not match with the height of the original distribution from which we draw events. Thank you very much!