Toy MC / Fitting Problem?

Stefanos21 · May 17, 2011, 5:43pm

Hello all…

I face a very tricky problem… I suspect a problem with the fitting of ROOT…

I create histograms from a defined function. Then I fit those histograms with the same function(!!) and I fill a histogram with one specific parameter from the fit… And it looks like this…

I also attach the code in case someone wants to check…

I would really appreciate your help…

Cheers,
Stefanos
toymcquestion.C (2.1 KB)

cplager · May 17, 2011, 7:08pm

Hi Stefanos,

It isn’t clear to me what you’re asking (in other words, I’m not sure what you’re expecting to happen and what actually happens). In addition, your code plots hgall on your canvas, but you show htzero below (so as far as I can tell, you aren’t using this script to make the output you are showing).

Cheers,
Charles

Stefanos21 · May 17, 2011, 7:21pm

Hello Charles!

I’m getting tzero from the fit and I’m filling the htzero hist. I was expecting a nice and normal distribution… But I’m getting these very strange peaks…

And I’m very sure that the fitting is the problem… Because if you change the binning of the histogram, the peaks move along with it…

Many thanks for your reply,
Stefanos

moneta · May 18, 2011, 8:44pm

Hi,

Looking at your fit function, I see something suspicious:

double fitFunction(double *x, double *par)
{
   if (x[0] <= par[1]) { // for four/second fit function .79, .791
      TF1::RejectPoint();
      return 0;
   }
   return (par[0]*(x[0]-par[1])*TMath::Exp(-(x[0]-par[1])*(x[0]-par[1])/(2*par[2]*par[2])));
}

You are rejecting a point based on a condition that depends on a fit parameter. This means the chi2 (or likelihood) is probably no longer a smooth function of the parameters, and this could cause the effect you are seeing.
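
To illustrate the point about smoothness, here is a minimal standalone C++ sketch (no ROOT needed). It is not the code from toymcquestion.C: `chi2WithRejection` is a hypothetical helper that only mimics what happens when points with x <= par[1] are dropped from the sum, and the unit error given to empty bins is an assumption made for this illustration.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Model from the post: amp*(x - t0)*exp(-(x - t0)^2 / (2*sigma^2)).
double model(double x, double amp, double t0, double sigma) {
    const double d = x - t0;
    return amp * d * std::exp(-d * d / (2.0 * sigma * sigma));
}

// Chi-square over bin centres, skipping every bin whose centre is <= t0.
// This mimics the effect of rejecting points inside the fit function.
double chi2WithRejection(const std::vector<double>& centres,
                         const std::vector<double>& contents,
                         double amp, double t0, double sigma) {
    assert(centres.size() == contents.size());
    double chi2 = 0.0;
    for (std::size_t i = 0; i < centres.size(); ++i) {
        if (centres[i] <= t0) continue;  // rejected point: contributes nothing
        // Assumption: Poisson variance, with a unit variance for empty bins.
        const double err2 = contents[i] > 0.0 ? contents[i] : 1.0;
        const double r = contents[i] - model(centres[i], amp, t0, sigma);
        chi2 += r * r / err2;
    }
    return chi2;
}
```

With bin centres {0.5, 1.5, 2.5}, contents {10, 20, 15}, amp = 10 and sigma = 1, moving t0 from 0.4999 to 0.5 removes the first bin from the sum, so the chi2 drops by roughly 10 in a single step. A jump like that at every bin centre is the kind of behaviour a gradient-based minimiser may handle badly.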

Best Regards

Lorenzo

Stefanos21 · May 19, 2011, 11:50am

Hello Lorenzo!

Interesting comment! So I removed the condition and I get the same effect…

Please ignore the comments that are inside the plots…

moneta · May 19, 2011, 9:09pm

Hi,

I will look into your macro, but I cannot run it. The definition of the “Uniform” function is missing (replacing it with gRandom->Uniform did not work).
Please post a macro that can be run using ACLiC.

Best Regards

Lorenzo

Stefanos21 · May 19, 2011, 9:19pm

It is working for me with
.L toymcquestion.C+
run()
Although it does give an error, it is producing good numbers…

I will try to post a macro that will work with ACLiC…

Many thanks again,
Stefanos

moneta · May 19, 2011, 9:45pm

Doing exactly this, it does not work for me:

root [0] .L ~/Downloads/toymcquestion_original.C+
Info in <TUnixSystem::ACLiC>: creating shared library /Users/moneta/Downloads/toymcquestion_original_C.so
In file included from /Users/moneta/Downloads/toymcquestion_original_C_ACLiC_dict.h:34,
                 from /Users/moneta/Downloads/toymcquestion_original_C_ACLiC_dict.cxx:17:
/Users/moneta/Downloads/toymcquestion_original.C: In function ‘void run()’:
/Users/moneta/Downloads/toymcquestion_original.C:55: warning: unused variable ‘c6’
/Users/moneta/Downloads/toymcquestion_original.C:81: warning: unused variable ‘c5’
root [1] run()
Error in <TFormula::Compile>:  Bad numerical expression : "Uniform"
Error in <TF1::TF1>: function: uniform/Uniform has 0 parameters instead of 1
Warning in <TF1::GetRandom>: function:f1 has 42 negative values: abs assumed

 *** Break *** segmentation violation

There is no “Uniform” function pre-defined in ROOT. You must have defined or run something beforehand.

Lorenzo

Stefanos21 · May 21, 2011, 5:07pm

I just fixed it…

Please let me know if it still doesn’t work…

Many thanks again,
Stefanos
toymcquestion.C (2.1 KB)

moneta · May 22, 2011, 10:18am

Hi,
Thanks for the macro, now I can run it.
I don’t understand why you are generating the toys in this way. You first generate a random value of your parameter of interest from a uniform distribution, and then you generate the data you are fitting based on that parameter value. Furthermore, your function (which is your pdf) becomes negative for x < p.
If you want to check your fitting, you should fix the parameter and then generate the toy.
I am not sure what it is you want to do with the toys…
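
As a rough sketch of generating toys at a fixed parameter value, the standalone C++ below draws from a pdf proportional to (x - t0)*exp(-(x - t0)^2/(2*sigma^2)) for x > t0 and fills one histogram. The names `sampleShiftedRayleigh` and `makeToyHistogram` are hypothetical, and the sampling uses the inverse CDF of a Rayleigh distribution rather than TF1::GetRandom, so it is an illustration and not the macro’s own method.

```cpp
#include <cassert>
#include <cmath>
#include <random>
#include <vector>

// Draw one x from the pdf proportional to (x - t0)*exp(-(x - t0)^2/(2*sigma^2)), x > t0.
// The distance x - t0 follows a Rayleigh distribution, so the inverse CDF gives
// x = t0 + sigma*sqrt(-2*ln(1 - u)) with u uniform in [0, 1).
double sampleShiftedRayleigh(std::mt19937& rng, double t0, double sigma) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    const double u = uniform(rng);
    return t0 + sigma * std::sqrt(-2.0 * std::log(1.0 - u));
}

// Fill one toy histogram with up to nEvents entries, all generated at the same FIXED t0.
std::vector<int> makeToyHistogram(std::mt19937& rng, double t0, double sigma,
                                  int nEvents, int nBins, double xMin, double xMax) {
    assert(nBins > 0 && xMax > xMin);
    std::vector<int> bins(nBins, 0);
    const double width = (xMax - xMin) / nBins;
    for (int i = 0; i < nEvents; ++i) {
        const double x = sampleShiftedRayleigh(rng, t0, sigma);
        if (x < xMin || x >= xMax) continue;  // values outside the range are dropped
        int idx = static_cast<int>((x - xMin) / width);
        if (idx >= nBins) idx = nBins - 1;    // guard against rounding at the upper edge
        bins[idx] += 1;
    }
    return bins;
}
```

Calling `makeToyHistogram` many times with the same t0 and fitting each result gives a distribution of fitted values that can be compared directly with the known input, which is what a fit-validation toy study needs.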

Anyway, the fitting looks fine to me. Maybe in a few % of the cases the fit failed and the parameters do not make much sense, but that is not the cause of your observed distribution.

Best Regards

Lorenzo

giakov · May 23, 2011, 3:04pm

Dear all

If you also check this thread: Root And Roofit Fit Issue, it is exactly the same problem. The issue is that, for an unclear reason, the fitted “cross-zero” point doesn’t “want” to vary inside the bin; it always takes the first point, which is not right.

Regards, George

moneta · May 24, 2011, 7:19am

Hi,

Are you sure that this is the right function for your model? The fact that the function has discontinuities in its derivatives can also cause a problem. You are also depending a lot on the origin of your histogram. I would also try to use the integral of the function over the bin, instead of its value at the bin centre, for fitting (option “I”).
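
To show why the bin integral can behave differently from the value at the bin centre, here is a small standalone C++ sketch. It assumes a clipped straight line, a*(x - t0) above the crossing point t0 and zero below it, with a > 0. The function names are hypothetical, and this is only an illustration of the idea behind the suggestion, not of what ROOT does internally.

```cpp
#include <algorithm>
#include <cassert>

// Clipped straight line: a*(x - t0) above the crossing point t0, zero below it.
double clippedLine(double x, double a, double t0) {
    return std::max(0.0, a * (x - t0));
}

// Prediction for the bin [lo, hi] taken as the function value at the bin centre.
double centreValue(double lo, double hi, double a, double t0) {
    assert(hi > lo);
    return clippedLine(0.5 * (lo + hi), a, t0);
}

// Prediction for the bin [lo, hi] taken as the analytic average of the function over the bin.
double binAverage(double lo, double hi, double a, double t0) {
    assert(hi > lo);
    if (t0 >= hi) return 0.0;  // the whole bin lies below the crossing point
    const double start = std::max(lo, t0);
    // Integral of a*(x - t0) from start to hi, divided by the bin width.
    const double integral =
        0.5 * a * ((hi - t0) * (hi - t0) - (start - t0) * (start - t0));
    return integral / (hi - lo);
}
```

For the bin [0, 1] with a = 2 and t0 = 0.75, `centreValue` returns 0 while `binAverage` returns 0.0625. Once t0 passes the bin centre, the centre value stays at zero however t0 moves, whereas the bin average keeps changing until t0 reaches the upper edge.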
Lorenzo

giakov · May 24, 2011, 8:07am

Hi Lorenzo

Let’s suppose that’s not the right model for fitting. I would still expect the linear function I used to give a better fit on the rising edge of the histogram, like the least square method does. Also, since this appears in the Toy MC that Stefanos and I did, I guess the origin of the data in the histogram would not affect that.

I tried your trick with the option “l” but no changes.

Thanks
George

moneta · May 24, 2011, 12:39pm

Hi George,

Let me understand your problem better. If you are fitting a straight line with a least square method (in ROOT this is the default when fitting binned data), you do not see this effect, while when fitting with a Poisson likelihood fit (option “L”) you do see it? Or by least square do you mean a custom linear fit done by yourself? What do you do for the empty bins on the left, where the function becomes negative in that case?

Lorenzo

giakov · May 24, 2011, 3:32pm

Hello Lorenzo

First of all thanks for following this and helping here.

So what I am doing is: for the same number of bins I am fitting with a linear function, and the result is the red dashed line in the histogram in my thread. This fit is done using RooFit, also with the option “l” you mention, without big changes. Stefanos is doing the same using ROOT to fit. So the effect is there in both.

After that I am calculating the parameters of the same function (a*x+b) with the least square method and plotting this with the blue line. I also tried a smearing in x for the least square method. No bumps are observed with this method.

Empty bins on the left are excluded as I fit in a specific range for both methods, so only positive bins are taken into account.

So I was really wondering why you see this kind of effect with a fit, but not when calculating the theoretical line directly.

Thanks in advance

George

moneta · May 25, 2011, 7:43am

Hi,

I still do not understand why the blue and red lines are different. If you don’t use option “L” (likelihood), the method used by default in TH1::Fit is a least square method, and in the case of a linear function it is solved directly using a matrix inversion. So it should give the same result as your direct computation.
Can you please post your code, so I can understand better where these differences are coming from?
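
For reference, the direct solution of a weighted least square straight-line fit can be sketched in standalone C++ as below. `LineFit` and `fitLine` are hypothetical names, and this is the textbook 2x2 normal-equation solution, not a copy of ROOT’s linear fitter or of George’s own computation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct LineFit {
    double slope;
    double intercept;
};

// Weighted least-squares fit of y = slope*x + intercept.
// Minimises sum_i (y_i - slope*x_i - intercept)^2 / err_i^2 by solving
// the 2x2 normal equations in closed form (no iterative minimisation).
LineFit fitLine(const std::vector<double>& x, const std::vector<double>& y,
                const std::vector<double>& err) {
    assert(x.size() == y.size() && x.size() == err.size() && x.size() >= 2);
    double s = 0.0, sx = 0.0, sy = 0.0, sxx = 0.0, sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        assert(err[i] > 0.0);
        const double w = 1.0 / (err[i] * err[i]);  // weight = 1 / variance
        s += w;
        sx += w * x[i];
        sy += w * y[i];
        sxx += w * x[i] * x[i];
        sxy += w * x[i] * y[i];
    }
    const double det = s * sxx - sx * sx;  // determinant of the normal matrix
    assert(det != 0.0);                    // requires at least two distinct x values
    LineFit fit;
    fit.slope = (s * sxy - sx * sy) / det;
    fit.intercept = (sxx * sy - sx * sxy) / det;
    return fit;
}
```

Given the fitted line, the zero-crossing point is -intercept/slope. If a hand computation uses different weights, or a different set of bins, than the fit it is compared with, the two lines will differ, so those are the first things to check.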

Cheers, Lorenzo

giakov · May 25, 2011, 8:35am

Hi Lorenzo

I sent you the scripts and a piece of a data file to run them on by email. I couldn’t attach them here because the maximum allowed size is 2 MB.

Thanks, George