Minuit2 Changing only 1 out of four parameters in ChiSquared minimization

christianwos83 · May 24, 2021, 10:13am

Dear all,

I am using Minuit2 to find the best values of four parameters through chi-squared minimization. For some reason, Minuit2 seems to change only one of these parameters, namely params[3], regardless of what I set the parameters to be initially. I really don’t understand why this is happening, especially because I have used Minuit2 in the past and I have never encountered this issue before. Is this something that happens when Minuit cannot find a minimum? And is there a way around it?

Below is my code:

#include "Minuit2/FunctionMinimum.h"
#include "Minuit2/MnUserParameterState.h"
#include "Minuit2/MnPrint.h"
#include "Minuit2/MnMigrad.h"
#include "Minuit2/FCNBase.h"

#include <iostream>
#include <iomanip>
#include <string>
#include <vector>
#include <cmath>
#include <Eigen/Dense>

using namespace ROOT::Minuit2;
using namespace Eigen;

const double PI = 4*atan(1);
// Define minimizer class (inherits from FCNBase)
class Minimizer : public FCNBase {

private:

  const std::vector< std::vector<double> > &m_data;
  //  const std::vector<double> &m_mass;

public:

  Minimizer(const std::vector< std::vector<double> > &data) : m_data(data) {}

  ~Minimizer() {}
  // functions to find factorial and double factorial of given number - used in the Vqq matrix element
 //Omitted here (but they are in the actual code)

  double diagonalization(double n, double L, double p0, double p1, double p2, double p3){
    // Declare the number of terms in the expansion and the number of iterations for the numerical integrals
    int N1 = 12;

    // Declare the variational parameters - values taken from the hybrid decay paper: https://arxiv.org/abs/2005.10850
    double alphaInit = 0.005;
    // Define the matrices - the dimension can be set to be any of the above indexes since they are generally the same
    MatrixXd HMX(N1, N1);
    MatrixXd NMX(N1, N1);
    MatrixXd KMX(N1, N1);
    MatrixXd VMX(N1, N1);

    //Declare the variables for the minimum eigenvalue
    double minEvalue0 = 0.0; // ground state eigenvalue
    double minEvalue1 = 0.0; // first excited state eigenvalue
    double minEvalue2 = 0.0; // second excited state eigenvalue
    double minEvalue3 = 0.0; // third excited state eigenvalue

    std::vector<double> eValues0;
    std::vector<double> eValues1;
    std::vector<double> eValues2;
    std::vector<double> eValues3;
    std::vector<double> minEvalues;

    // Loop over the variational parameters and the matrix elements (k1=n', l1=m', k2=n, l2=m).
    for (double alpha = alphaInit; alpha <= 1.000; alpha += 0.005){
      for (int k1 = 1; k1 <= N1; k1++){
        for (int k2 = 1; k2 <= N1; k2++){
          // Define the full Hamiltonian in configuration space - works for all values of L
          // Overlap Matrix
          NMX(k1-1,k2-1) = pow(2.0,L + 1)*sqrt(2.0)*pow(k1*k2,0.75 + L/2.0)/(pow(k1+k2,1.5 + L));
          // Mass and kinetic terms
          KMX(k1-1,k2-1) = (pow(2.0,L + 1)*sqrt(2.0)*pow(k1*k2,0.75 + L/2.0)/(pow(k1+k2,1.5 + L))) * (2.0*p0 + (3.0+2.0*L)*(k1*k2)*alpha*alpha/(p0*(k1+k2)));
          // Potential term
          VMX(k1-1,k2-1) = (powerfactorial(L)*pow(k1*k2,0.75 + L/2.0)/(doublefactorial(2*L+1)*sqrt(PI)*alpha*pow(k1+k2,2.0+L)))*(2*p1*alpha*alpha*(k1+k2)-3*(L+1)*p2);
        }
      }
      // Diagonalize and print the eigenvalues
      HMX = KMX - (4.0/3.0)*VMX;

      std::setprecision(8);
      GeneralizedSelfAdjointEigenSolver<MatrixXd> es(HMX,NMX);
      //pushback eigenvalues into vectors
      eValues0.push_back(es.eigenvalues()[0]);
      eValues1.push_back(es.eigenvalues()[1]);
      eValues2.push_back(es.eigenvalues()[2]);
      eValues3.push_back(es.eigenvalues()[3]);

    }
    // Calculate the minimum eigenvalue and the corresponding parameters
    minEvalue0 = eValues0[0];
    minEvalue1 = eValues1[0];
    minEvalue2 = eValues2[0];
    minEvalue3 = eValues3[0];

    //cout << "\n";
    //cout << "Here is the full list of eigenvalues: " << std::endl;

    for(unsigned int i = 0; i < eValues0.size(); i++){
      if (eValues0[i] < minEvalue0){
        minEvalue0 = eValues0[i];
      }
    }
    for(unsigned int i = 0; i < eValues1.size(); i++){
      if (eValues1[i] < minEvalue1){
        minEvalue1 = eValues1[i];
      }
    }

    for(unsigned int i = 0; i < eValues2.size(); i++){
      if (eValues2[i] < minEvalue2){
        minEvalue2 = eValues2[i];
      }
    }

    for(unsigned int i = 0; i < eValues3.size(); i++){
      if (eValues3[i] < minEvalue3){
        minEvalue3 = eValues3[i];
      }
    }
    //push_back all the minimized eigenvalues into a vector
    minEvalues.push_back(minEvalue0);
    minEvalues.push_back(minEvalue1);
    minEvalues.push_back(minEvalue2);
    minEvalues.push_back(minEvalue3);

    double eigenvalue = minEvalues[n-1]-p3;//return the desired eigenvalue from minEvalues
    return eigenvalue;
  }

  // Define function that selects the state for diagonalization(n, L, p0, p1, p2, p3)
  double mass (int state, double p0, double p1, double p2, double p3){
    double n;
    double L;
    double mesonMass = 0.0;

    if (state == 0){//1S
        n = 1.0;
        L = 0.0;
    }
    else if (state == 1){//1P
        n = 1.0;
        L = 1.0;
    }
    else if (state == 2){//2S
        n = 2.0;
        L = 0.0;
    }
    else if (state == 3){//1D
        n = 1.0;
        L = 2.0;
 }
    else if (state == 4){//2P
        n = 2.0;
        L = 1.0;
    }
    else if (state == 5){//3S
        n = 3.0;
        L = 0.0;
    }
    else if (state == 6){//4S
        n = 4.0;
        L = 0.0;
    }
    mesonMass = diagonalization(n, L, p0, p1, p2, p3);
    return mesonMass;
  }

  double operator() (const std::vector<double> &params) const {
    double X2 = 0.0;
    //std::cout << mass(0,params[0], params[1], params[2], params[3]) << std::endl;
    //std::vector<double> mass;
    for (int i=0; i<m_data[0].size(); i++) {
      double n = m_data[0][i];
      double L = m_data[1][i];
      double f = mass(i, params[0], params[1], params[2], params[3]);
      X2 += (m_data[2][i] - f)*(m_data[2][i] - f)/(m_data[2][i]);
      //std::cout << "data[2][i] = " << m_data[2][i] << std::endl;
      //std::cout << "mass[i]" << f << std::endl;
    }
    return X2;
  }
  //set to 1 for chi-sq and 0.5 for log-likelihood
  double Up() const {return 1.0;}
};

int main(){

  std::vector< std::vector<double> > data(3);
  // (1S) n=1, L=0; N=0
  data[0].push_back(1.0);
  data[1].push_back(0.0);
  data[2].push_back(9.4449);
  // (1P) n=2, L=1; N=0
  data[0].push_back(1.0);
  data[1].push_back(1.0);
  data[2].push_back(9.8999);
  // (2S) n=2, L=0; N=1
  data[0].push_back(2.0);
  data[1].push_back(0.0);
  data[2].push_back(10.0173);
  // (1D) n=3, L=2; N=0
  data[0].push_back(1.0);
  data[1].push_back(2.0);
  data[2].push_back(10.1637);
  // (2P) n=3, L=1; N=1
  data[0].push_back(2.0);
  data[1].push_back(1.0);
  data[2].push_back(10.2602);
  // (3S) n=3, L=0; N=2
  data[0].push_back(3.0);
  data[1].push_back(0.0);
  data[2].push_back(10.3552);
  // (4S) n=4, L=0; N=3
  data[0].push_back(4.0);
  data[1].push_back(0.0);
  data[2].push_back(10.5794);

  std::vector<double> params(4,1.0);
  std::vector<double> err(4,0.1);
  params[0] = 5.00; //mc
  params[1] = 0.40;  //alphaS
  params[2] = 0.14; //b
  params[3] = 0.18; //mass shift
  std::cout << "Initial values of parameters: " << std::endl;
  for (int i = 0; i < 4; i++){
    std::cout << "params[" << i << "] = " << params[i] << std::endl;
  }
  std::cout << "Computing the values of the parameters..." << std::endl;

  //minimize
  Minimizer fcn(data);
  
  MnMigrad migrad(fcn, params, err);
  FunctionMinimum minParams = migrad();
 std::cout << "CHISQUARED MINIMIZATION RESULTS:" << std::endl;
  std::cout << "\n";
  std::cout << "***************************************" << std::endl;
  std::cout << " chi^2 = " << minParams.Fval() << std::endl;
  std::cout << " for the following values:\n " << "mass q: " << minParams.UserState().Value(0) << ", alphaS: " << minParams.UserState().Value(1) << ", b: " \
            << minParams.UserState().Value(2) << ", shift: " << minParams.UserState().Value(3) << std::endl;
  std::cout <<"****************************************" << std::endl;
  std::cout << "\n";
  //std::cout << "Did the minimizer find a minimum?" << minParams.isValid() << std::endl;
  std::cout << "migrad() results: " << minParams << std::endl;
  std::cout << "Process completed." << std::endl;
  return 0;
}

The output of this is the following:

CHISQUARED MINIMIZATION RESULTS:

***************************************
 chi^2 = 0.00278467
 for the following values:
 mass q: 5.1997, alphaS: 0.4, b: 0.14, shift: 0.637963
****************************************

migrad() results:
WARNING: Minuit did not converge.

# of function calls: 124
minimum function Value: 0.002784672250699
minimum edm: 6.678767958345
minimum internal state vector: LAVector parameters:
      5.199699062206
     0.3999996341642
     0.1399999440163
     0.6379634861636

minimum internal covariance matrix: LASymMatrix parameters:
  1.4000616e-10  4.9897517e-11  1.0394972e-11 -6.9624951e-06
  4.9897517e-11  1.9757816e-11  3.9417031e-12 -2.6078556e-06
  1.0394972e-11  3.9417031e-12   8.394459e-13 -5.4326561e-07
 -6.9624951e-06 -2.6078556e-06 -5.4326561e-07     0.59767527


# ext. ||   Name    ||   type  ||     Value     ||  Error +/-

   0   ||        p0 ||  free   ||      5.199699062206 ||1.18324199021e-05
   1   ||        p1 ||  free   ||     0.3999996341642 ||4.444976434746e-06
   2   ||        p2 ||  free   ||     0.1399999440163 ||9.162128011203e-07
   3   ||        p3 ||  free   ||     0.6379634861636 ||0.773094606389



WARNING: FunctionMinimum is invalid:
         Edm is above max


Process completed.

As you can see, only params[3] is changed in the process, and this is true regardless of what initial values I give to the parameters. They remain unchanged, even if the end result is obviously off from the data. When I ran the code for a different set of data, Minuit was converging rather quickly, giving me the optimized values of parameters.

Any help/suggestion would be appreciated.

Thanks.

Please read tips for efficient and successful posting and posting code

ROOT Version: Not Provided
Platform: Linux Mint 17
_Compiler:_gcc

moneta · May 25, 2021, 12:20pm

Hi,

What Minuit2 does looks correct to me, giving your function. I see that your Covariance matrix has only the (3,3) element significantly different than zero. Since the step size, s, is equals to s = - C * g
where C is the covariance matrix (inverse Hessian) and g is the function gradient, the result is a step only in par[3].
Why this is happening you have to see how your chi2 function is implemented. If it is chi2 it’s minimum value should be around to the number of degrees of freedom of your problem.

Best regards

Lorenzo

christianwos83 · May 26, 2021, 6:59am

Hi @moneta,

and thank you for your reply. What is strange to me is that when I ran the same exact code for a different set of data (and different values for only par[0] and par[3], keeping par[1] and par[2] the same) Minuit converges and varies all four parameters. Could Minuit have trouble because I am trying to fit 4 parameters with only seven data points?

Thanks again.

Christian

moneta · May 26, 2021, 11:16am

In principle 7 datapoint and 4 parameters are still ok (you have 3 degree of freedom). I will try to plot your chi2 as function of the parameters , for example chi2((par[0],par[3]) and chi2( par[0],par[1])

Lorenzo