Basic usage of fits and PDFs

gwatts · April 1, 2011, 5:48pm

Hi,
I’m just now teaching myself RooFit. If I generate data from a gaussian, I don’t seem to be able to fit it with another gaussian, but I can with the first gaussian. The fact that I can’t do this means I have some basic misunderstanding of how RooFit is designed (I’ve read a bunch of the manual and other tutorials and it hasn’t helped yet).

I started with the very basic example from the tutorials and modified it a bit:

[code]///
/// Generate a gaussian and fit it. This example is motivated by the basic intro
/// RooFit tutorial (http://root.cern.ch/root/html/tutorials/roofit/rf101_basics.C.html)
///

#include <RooRealVar.h>
#include <RooGaussian.h>
#include <RooDataSet.h>

#include <iostream>

using namespace std;

int main()
{
///
/// Create a gaussian
///

RooRealVar x ("x", "x", -10.0, 10.0);
RooRealVar mean ("mean", "mean of gaussian", 1, -10.0, 10.0);
RooRealVar sigma ("sigma", "width of gaussian", 1, 0.1, 10.0);

RooGaussian gauss("gauss", "gaussian PDF", x, mean, sigma);

///
/// Create the data
///

auto data = gauss.generate(x, 10000);

///
/// Now do the fitting
///

gauss.fitTo(*data);

///
/// Create a second gaussian and see if we can't fit it to the gaussian we have here
///

RooRealVar y ("y", "y", -10.0, 10.0);
RooRealVar mean1 ("mean1", "mean of gaussian fit", 1, -10.0, 10.0);
RooRealVar sigma1 ("sigma1", "width of gaussian fit", 1, 0.1, 10.0);

RooGaussian gauss1("gauss", "gaussian PDF", y, mean1, sigma1);

gauss1.fitTo(*data);

///
/// And get the results
///

cout << "From the fit where we fit the origianl guassian" << endl;
mean.Print();
sigma.Print();
cout << endl;

cout << "From the fit where we fit the second unrelated guassian" << endl;
mean1.Print();
sigma1.Print();

}
[/code]

First, as above, when I run that second fit it allows y, sigma1, and mean1 to float. If I then set y to be constant (using the setConstant method) it fits only mean1 and sigma1, but the values are incorrect. With setConstant applied to y, here is the bottom part of the output:

[code] EXT PARAMETER APPROXIMATE INTERNAL INTERNAL
NO. NAME VALUE ERROR STEP SIZE VALUE
1 mean1 1.83261e-007 2.21298e-001 5.91453e-007 1.83261e-008
2 sigma1 6.05661e+000 5.95286e+000 5.00000e-001 1.75079e+004
ERR DEF= 0.5
EXTERNAL ERROR MATRIX. NDIM= 25 NPAR= 2 ERR DEF=0.5
4.898e-002 -7.586e+005
-7.586e+005 1.177e+013
ERR MATRIX NOT POS-DEF
PARAMETER CORRELATION COEFFICIENTS
NO. GLOBAL 1 2
1 0.99897 1.000 -0.999
2 0.99897 -0.999 1.000
ERR MATRIX NOT POS-DEF
[#1] INFO:Minization -- RooMinuit::optimizeConst: deactivating const optimization
From the fit where we fit the origianl guassian
RooRealVar::mean = 1.00665 +/- 0.0099736 L(-10 - 10)
RooRealVar::sigma = 0.99736 +/- 0.00705248 L(0.1 - 10)

From the fit where we fit the second unrelated guassian
RooRealVar::mean1 = 1.83261e-007 +/- 0.221298 L(-10 - 10)
RooRealVar::sigma1 = 6.05661 +/- 5.95286 L(0.1 - 10)[/code]

As you can see, the first mean/sigma pair is as expected, but the second pair, from fitting the second gaussian to the same dataset, is not.

What have I missed here? Many thanks!

Cheers, Gordon.

gwatts · April 1, 2011, 6:45pm

Ok, if in that second gaussian definition I replace “y” with “x” then the fit works correctly. So, why is “x” so special after I’ve generated the dataset above in terms of “x”?

gwatts · April 1, 2011, 10:45pm

Ok, I’m starting to get it. You have to have an observable. I was thinking that the observable was implicit - no matter what you called it.

However, RooFit is much more general than that, and you can have multiple observables if you wish - so to remove ambiguity RooFit just requires all the observables to be named. And if you have a large model you may well have many observables.

So when you call the generate method with the variable “x” you are creating a dataset that is a function of that observable. RooFit (or RooDataSet in this case) remembers that its data is in terms of x. Thus, if you ask for something about “y”, the RooDataSet, not having any dependence on y, would claim your distribution was flat.
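
The idea can be illustrated with a toy model in plain C++. This is not RooFit code and not how RooFit is implemented; `ToyDataSet`, `logGauss`, and `toyNll` are hypothetical names, and using a normalised gaussian in both branches is a simplification. It only shows that a likelihood built on a variable name the dataset does not contain cannot depend on the data.

```cpp
// Toy model (NOT RooFit): a dataset that remembers the name of its observable.
#include <cmath>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a dataset: column name -> one value per event.
struct ToyDataSet {
    std::size_t nEvents;
    std::map<std::string, std::vector<double>> columns;
};

// Log of a normalised gaussian density.
double logGauss(double v, double mean, double sigma)
{
    const double pi = 3.14159265358979323846;
    const double z = (v - mean) / sigma;
    return -0.5 * z * z - std::log(sigma * std::sqrt(2.0 * pi));
}

// Negative log-likelihood of a gaussian in the variable called varName.
// If the dataset has a column with that name, each event supplies the value.
// Otherwise the variable is not an observable of this dataset: every event is
// evaluated at the variable's own current value, and the data drop out.
double toyNll(const ToyDataSet& data, const std::string& varName,
              double currentValue, double mean, double sigma)
{
    const auto column = data.columns.find(varName);
    double nll = 0.0;
    for (std::size_t i = 0; i < data.nEvents; ++i) {
        const double v = (column != data.columns.end()) ? column->second[i]
                                                        : currentValue;
        nll -= logGauss(v, mean, sigma);
    }
    return nll;
}
```

Calling `toyNll` with "x" gives different values for different datasets, while calling it with "y" gives the same value for any two datasets of equal size, so a fit in terms of y learns nothing from the events.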