Roofit: Vavilov and gaussian convolution

florian_Gautier · February 25, 2020, 7:54pm

Hi,
I am quite new to roofit and Root in general. I am trying to fit some energy depostion and compare the best case between models. I require to test the best fit between the convolution of a landau & Gaussian or a Vavilov & Gaussian and the data.

I managed to do the fitting with the Landau (X) Gaussian but I am stuck at the Vavilov (X) Gaussian.

I tried to follow Vavilov - Gaus Convolution with RooFit topic which allowed me to get VavilovGaussian.pdf (54.2 KB)
However, the Math::Vavilov is not adapted i think to my case due to it lacking the ability to be shifted and working on non normalized data.
I had this other Vavilov implementation which allows me to not normalize my data:

struct Vavilov_Func { 
   Vavilov_Func() {}

   double operator() (const double *x, const double *p) { 
      double kappa = p[0]; 
      double beta2 = p[1];
      return p[4]*( pdf.Pdf( (x[0]-p[2])/p[3], kappa,beta2) );
   }

   ROOT::Math::VavilovAccurate pdf; 
};

While I knew how to use it to get a TF1 I have no idea how to use this in Roofit to get a RooAbsPdf to then proceed with convolution.
Here is my codeTest.C (6.2 KB)

I would appreciate any help on this task!

Thanks in advance,
Florian

StephanH · February 26, 2020, 12:36pm

Hello Florian,

I don’t understand all your problems, yet, but shifting should be easy. Instead of passing x as location parameter, pass a RooLinearVar, which shifts x by an amount that you can define.

Making this a PDF that RooFit understands was discussed in the other topic, but it looks like you are missing the “ROOT math interface” for your struct. You need to inherit from ROOT::Math::IBaseFunctionMultiDim, and implement its interface. That should look something like this.

class Vavilov_Func : public ROOT::Math::IBaseFunctionMultiDim{ 
   Vavilov_Func() {}
   
   ... implement the functions required by the interface ...
   double operator() (const double *x) { 
      return pdf.Pdf(x[0], x[1], x[2]);
   }

   ROOT::Math::VavilovAccurate pdf; 
};

Vavilov_Func vaviFunc;

RooFunctorPdfBinding vaviPdf("vavi", "Bound vavilov", vaviFunc,
    RooArgList(x,kappa,beta2));

florian_Gautier · February 26, 2020, 5:48pm

Hi @StephanH,

thanks for this help,
I am not too familiar with IBaseFunctionMutliDim so I tried this:

class Vavilov_Func : public ROOT::Math::IBaseFunctionMultiDim{ 
   

public:

  Vavilov_Func() {}
   double DoEval(const double* x) const
   {
    return x[3]*pdf.Pdf(x[0], x[1], x[2]);
  }
    ROOT::Math::IBaseFunctionMultiDim* Clone() const
   {
    return new Vavilov_Func();
   }
   unsigned int NDim()  const
   {
    return 4;
   }
   double operator() (const double *x) { 
      return x[3]*pdf.Pdf(x[0], x[1], x[2]);
   }

   
   ROOT::Math::VavilovAccurate pdf; 
};

But obviously it is not adequate and returns me errors. Could you explain me a little bit more on the requirements of this class heritage.

StephanH · February 27, 2020, 8:45am

Hi @florian_Gautier,

it looks ok apart for the fact that when cloning, you have to invoke the copy constructor instead of a new empty Vavilov_func. What are the errors?

florian_Gautier · February 27, 2020, 3:44pm

Hi @StephanH,

Here is the updated code that is running, the main issue from previous code was the fact that:

note: candidate function not viable: 'this' argument has type 'const
      ROOT::Math::VavilovAccurate', but method is not marked const
   double Pdf (double x, double kappa, double beta2);

so I changed the code to:

   double operator() (const double *x) { 
      return x[3]*pdf.Pdf(x[0], x[1], x[2]);
   }

      unsigned int NDim()  const
   {
    return 4;
   }
    double DoEval(const double* x)  const
   {
    return x[3]*pdf.Pdf(*x);
    
   }

Which do not produce error. However this seems not to be the solution as the fitting:

// Vavilov X Gauss

  RooRealVar mg2("mean","mean",mean,0.1*mean,5*mean) ;
  RooRealVar sg2("sigma","sigma",sigma,0.1*sigma,5*sigma) ;
  RooGaussian gauss2("gauss","gauss",x,mg2,sg2) ;

  RooRealVar kappa("kappa","kappa vavilov",0.1, 0.01,10);
  RooRealVar beta2("beta2","beta vavilov",0.01,0,0.5);
  RooRealVar amp("Amp", "Amplitude Vavilov",100);


  Vavilov_Func vaviFunc;

  RooFunctorPdfBinding vaviPdf("vavi", "Bound vavilov", vaviFunc,
    RooArgList(x,kappa,beta2,amp));

  RooFFTConvPdf vxg("vxg", "Vavilov (X) gauss", x, vaviPdf, gauss2);
  vxg.fitTo(dh);

The result seems to not take into account the parameters range as kappa fitted is 0 so out of my range…

Is there something still wrong?

StephanH · February 27, 2020, 4:00pm

This is probably what’s wrong. It ignores kappa and beta. you need to Implement it like operator().

florian_Gautier · February 27, 2020, 4:04pm

I guess indeed bud do you see how I can solve this issue then?

error: no matching member function for call to 'Pdf'
    return x[3]*pdf.Pdf(x[0], x[1], x[2]);
                ~~~~^~~
/usr/local/etc/../include/Math/VavilovAccurate.h:172:11: note: candidate function not viable: 'this' argument has type 'const
      ROOT::Math::VavilovAccurate', but method is not marked const
   double Pdf (double x, double kappa, double beta2);
          ^
/usr/local/etc/../include/Math/VavilovAccurate.h:162:11: note: candidate function not viable: requires single argument 'x', but 3 arguments
      were provided
   double Pdf (double x) const;
          ^

while working in the operator() part is does not work in DoEval

StephanH · February 27, 2020, 4:13pm

Oh, I see.

Don’t implement operator(). The base class already does that for you. Just delete it entirely.
For DoEval(): The VavilovAccurate::Pdf is unfortunately not marked const, since it sets members. You can circumvent the constness checks by making it mutable, by holding a pointer to the object instead of the instance itself, or by applying a const_cast. The fastest and hackiest is probably mutable.

florian_Gautier · February 27, 2020, 7:54pm

Indeed, this seems to allow bypass the const check: mutable ROOT::Math::VavilovAccurate pdf;

My current code for fitting is thus:

  RooRealVar mg2("mean","mean",mean,0.1*mean,5*mean) ;
  RooRealVar sg2("sigma","sigma",sigma,0.1*sigma,5*sigma) ;
  RooGaussian gauss2("gauss","gauss",x,mg2,sg2) ;

  RooRealVar slope("slope","slope",sigma, 0.1*sigma,10*sigma);
  RooRealVar offset("offset","offset",-mean,-5*mean,0.);
  RooLinearVar y("xbis", "x shifted", x,slope,offset);

  RooRealVar kappa("kappa","kappa vavilov",0.1, 0.01,10);
  RooRealVar beta2("beta2","beta2 vavilov",0.01,0,0.4);
  RooRealVar amp("Amp", "Amplitude Vavilov",100,50,h1bis->GetEntries());

  Vavilov_Func vaviFunc;

  RooFunctorPdfBinding vaviPdf("vavi", "Bound vavilov", vaviFunc,
    RooArgList(y,kappa,beta2,amp));

  RooFFTConvPdf vxg("vxg", "Vavilov (X) gauss", x, vaviPdf, gauss2);

Did I do something wrong?
Because my fit seems awful. The fitting should be clause to the one from a Landau x Gaussian fit id est a kappa close to 0.01 but I obtain this VavilovGaussian.pdf (33.6 KB)
The Landau Gaussian fit looks like this: LandauGaussian.pdf (33.5 KB)

StephanH · February 28, 2020, 10:19am

I don’t see an error, but evidently, the fit doesn’t converge. Here are some ideas:

Check all parameters. Check that they are not at their limits, and check there’s no typos or unreasonable limits.
Maybe you just have to find proper starting values for some parameters.
Plot both vavilov and gauss before/after the fit to make sure the convolution runs on well-defined functions.
Also assure that the convolution has enough bins to be accurate, and make sure that the functions fall to zero at the edges. Otherwise, the FFT convolution will not work. See RooFFTConvPdf for details.

florian_Gautier · February 28, 2020, 11:13pm

  x.setBins(10000,"cache") ; 
  cout<<x.getBinning()<<endl;

Is this the right command? And should I see printed 10000?

Because it stay at the initial 100 bins in my case.

StephanH · March 2, 2020, 8:07am

You should see it only if you ask for the correct binning.

x.getBinning("cache")

florian_Gautier · March 4, 2020, 8:24pm

I think my issue comes from the RooLinearVar to get my function in the range of definition of vavilov function. I tried to fit a vavilov to a gaussian but could not manage to. So lets take this example of a gaussian

  Double_t mean   = 10;
  Double_t sigma  = 4 ;
  Double_t minX= 0;
  Double_t maxX = 40;
  RooRealVar x("E","E", minX, maxX);

  RooRealVar mg2("mean2","mean2",mean) ;
  RooRealVar sg2("sigma2","sigma2",sigma) ;
  RooGaussian gauss2("gauss2","gauss2",x,mg2,sg2) ;
  RooDataSet* data = gauss2.generate(x,10000) ;

  RooRealVar kappa("kappa","kappa vavilov",1,0.01,10);
  RooRealVar beta2("beta2","beta2 vavilov",0.2,0,1);

  Vavilov_Func vaviFunc;
  RooFunctorPdfBinding vaviPdf("vavi", "Bound vavilov", vaviFunc,RooArgList(y,kappa,beta2));
  vaviPdf.fitTo(*data);

From plotting the Tmath::Vavilov I found out that the function is defined on -5,50 (see attached VavilovTmath.pdf (13.8 KB) ) which comes from precision requirement if i am not wrong. Actually, those limits vary depending on kappa and beta2 -5,50 is the biggest range, and outstide the pdf becomes 0. For gaussian limits it would be worse as vavilov function in root is defined only on a limited range.

As vavilov’s limit is gaussian for kappa>10, fitting a gaussian by another close to gaussian should work well… However, vavilov gaussian limit fits only in a certain range thus, as we discussed quite early on, using the RooLinearVar as solution.

  Double_t a,b;
  RooRealVar slope("slope","slope",a);
  RooRealVar offset("offset","offset",b);
  RooLinearVar y("xbis", "x shifted", x,slope,offset);

But I can’t manage to find a & b adequate for Gaussian (would be worse in the case of my own data) as I get the following error:

WARNING:Minization – RooMinimizerFcn: Minimized function has error status.
Returning maximum FCN so far (-1e+30) to force MIGRAD to back out of this region. Error log follows
Parameter values: beta2=0.260287, kappa=9
RooNLLVar::nll_vavi_gauss2Data[ paramSet=(beta2,kappa) ]
getLogVal() top-level p.d.f evaluates to zero

and a failed to fit. By cheating I would need to make the gaussian parameters through the RooLinearVar, to equals the expected Vavilov gaussian one:

σ2=(1−β2/2)κ
mean=gamma-1-ln(κ)-β2

But even then those varies depending of the parameters of the vavilov function trying to fit.

Is there a way to get passed this issue ? Am I doing it the wrong way? please correct me

Thanks

StephanH · March 10, 2020, 12:07pm

Hi @florian_Gautier,

I cannot follow yet. Do I understand correctly that you want to replace the Vavilov by a Gaussian for certain values of the parameters?
If so, the documentation says:

For values κ>10, the Gauss approximation should be used
with μ and σ given by Vavilov::Mean(kappa, beta2)
and sqrt(Vavilov::Variance(kappa, beta2).

You can program this into your Vavilov_Func class, by just returning the value of the Gaussian distribution with those parameters.

And yes, the problem that you are seeing at the moment is that vaviPdf evaluates to zero in a region where there’s still entries in the dataset. Maybe using the Gaussians starting from kappa >~ 9 will fix this. Apparently, the uncharted territory in your case starts before kappa = 10.

florian_Gautier · March 14, 2020, 3:57pm

Hi @StephanH ,

No, it is not exactly what I intend to do. In my final code, I would like to be able to fit data that follow Gaussian, Vavilov or Landau distributions using a Vavilov function. I will just observe bad fits in extrema cases.
Thus, if i try fitting Gaussian data by the Vavilov function, I expect to have the parameters converging to the limit of the Vavilov parameter (k->10). Same, If I try fitting Landau data by a Vavilov function, I expect the parameters to tend to the other limits (k->0.01). This is the fit I would like to see.

But currently, those fits seem to simply fails because of my inaptitude to determine the RooLinearVar for the data to ones that accommodate the Vavilov function’s range for fit. As explained, rather than using my own data that are more complicated, I tried to work on generated distributions. First, the Gaussian (see above). But I failed to obtain the fit from what I think is due to the RooLinearVar and Vavilov range of definition that are not good.

(I will also include separate fits of data to Gaussian or Landau functions and compare the chi2 to determine if Vavilov fit is more relevant or not to use but that would be the next step when the Vavilov fitting works ^^)

StephanH · March 16, 2020, 10:47am

Hi Florian,

did you try to set reasonable starting values and ranges for the linear var?
From this

I read that shift and slope receive an undefined value (since a and b are not initialised), and they are forced to be constant, as they have no range defined.
Try maybe

//Don't give a range if you want it to be constant forever
RooRealVar slope("slope", "slope", 1);
// Start at zero, allow to vary from -10 to 10
RooRealVar offset("offset","offset", 0, -10, 10);
RooLinearVar y("xbis", "x shifted", x,slope,offset);

You might have to find the proper values for the amount of the shift.

florian_Gautier · March 16, 2020, 3:59pm

Hi Stephan

Sorry in my message from 12 days ago I explained the structure I use for the RooLinearVar and my difficulty to find the right ranges, but the actual line for a test looks like this:

 RooRealVar slope("slope","slope",1.,0.,5.);
  RooRealVar offset("offset","offset",0.1,-5.,5.);
  RooLinearVar y("xbis", "x shifted", x,slope,offset);

However, I still get:

[#0] WARNING:Minization -- RooMinimizerFcn: Minimized function has error status.
Returning maximum FCN so far (3383.69) to force MIGRAD to back out of this region. Error log follows
Parameter values: beta2=0.483538, kappa=8.09847, offset=0.0770637, slope=1.0246
RooFunctorPdfBinding::vavi[ function=0x7ffe8c36c558 vars=(xbis,kappa,beta2) ]
     getLogVal() top-level p.d.f evaluates to zero @ vars=(xbis = -4.06362,kappa = 8.09847 +/- 2.54793,beta2 = 0.483538 +/- 0.324798)
RooNLLVar::nll_vavi_gauss2Data[ paramSet=(beta2,kappa,offset,slope) ]
     function value is NAN @ paramSet=(beta2 = 0.483538 +/- 0.324798,kappa = 8.09847 +/- 2.54793,offset = 0.0770637 +/- 3.24783,slope = 1.0246 +/- 1.31111)

So I think providing you the code I run might be easier Test2.C (2.3 KB)

StephanH · March 17, 2020, 11:53am

Hi Florian,

the problem is likely that for small or large E, the vavi is zero. You cannot compute the log-likelihood of a data point for which the PDF is zero.
You likely have to invent some non-zero values for the tails of the Vavilov distribution, e.g. hard-code them to 1.E-300 or something.
I hacked it like this:

  double DoEval(const double* x)  const {
    double val = pdf.Pdf(x[0], x[1], x[2]);
    return std::max(val, 1.E-300);
   }

florian_Gautier · March 20, 2020, 4:16pm

Thanks for this fix.

By replacing in the Test2.C, I still get a NOT POSDEF as final status. How can I solve this?

StephanH · March 23, 2020, 11:18am

This happens e.g. when parameters are strongly correlated. Did you have a look at the correlation matrix? You can e.g. decorrelate parameters by transforming {a, b} to {a+b, a-b} or to {a, a + deltaA}, or you can set one parameter constant.