Weights in Fit

karineklund · April 14, 2009, 8:53am

Hello!

I wish to fit a straight line to some data-points. I have filled my points into a TH1F and I perform:

TF1 *f_fit = new TF1("f_fit","[0]+[1]*x");
h_0->Fit("f_fit");

I wish to give my points less importance with increasing x, which also means a large deviation from y(x)=1. How can I achieve this?

I also wonder if I can extract some information about how well the fit suits the data!

Thanx in advance!

Axel · April 14, 2009, 1:34pm

Hi,

what you are looking for is either hand-crafted errors or a hand-crafted minimization function. Either way it’s impossible to help you without knowing how exactly the points at larger x should have less importance. We’ll probably need the full story to be able to give a hint what to do: what is measured, where do the uncertainties come from, why don’t you like some of the points.

Cheers, Axel.

karineklund · April 15, 2009, 7:03am

Hi Axel (and others)!

My points are ratios of electron fluence, so my x is energy (MeV) and y has dimension 1. The histograms of the fluence are shown in the first attached file. I wish to look at the ratio of the red and the black histograms. The ratio is shown in the second attached file. When the energy approaches the maximum electron energy the fluence is very low, so the points will be almost “0/0”. Very small differences in fluence give large differences in ratio at these energies (x-values), but these points are not very relevant for the general trend in fluence ratio.

My greatest concern is that I don’t fully understand how the Fit function works. What do the default settings give me? And how can I change the defaults? And how can I extract information about the results (success) of the fit?

/Karin

brun · April 15, 2009, 7:30am

Where are the errors on your data points? Without errors the fit is meaningless. By default the fit will assume a statistical error equal to sqrt(bin contents). Of course in your case this does not make any sense.

Rene

karineklund · April 15, 2009, 12:42pm

If I set the error on the original histograms, will it propagate correctly to the ratio of them?

Axel · April 15, 2009, 12:47pm

Hi,

yes. And the fit will “trust” the points depending on their uncertainties. So if you have a way to determine or at least model the original histograms’ uncertainties then that’s the way to go. If your histograms contain counts then the errors might be simply Poisson errors, and calling hist->Sumw2() before filling them would be sufficient.

Cheers, Axel.
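
To make the propagation concrete, here is a minimal sketch in plain C++ (no ROOT) of the usual first-order error formula for a ratio of two bins. It assumes the numerator and denominator are statistically independent and that the bins hold raw counts, so Poisson errors apply; neither assumption is confirmed for the fluence histograms in this thread. The names `BinValue`, `poissonBin` and `ratioWithError` are made up for this illustration and are not ROOT API.

```cpp
#include <cassert>
#include <cmath>

// One histogram bin: its content and the uncertainty on that content.
struct BinValue {
    double content;
    double error;
};

// Poisson uncertainty for a bin that holds raw counts: sqrt(counts).
// Assumption of this sketch; weighted fills need a different error model.
BinValue poissonBin(double counts) {
    assert(counts >= 0.0);
    return {counts, std::sqrt(counts)};
}

// Ratio num/den with first-order error propagation, assuming the two
// inputs are uncorrelated:
//   sigma_r^2 = (sigma_num / den)^2 + (num * sigma_den / den^2)^2
BinValue ratioWithError(BinValue num, BinValue den) {
    assert(den.content != 0.0);
    const double r = num.content / den.content;
    const double fromNum = num.error / den.content;
    const double fromDen = num.content * den.error / (den.content * den.content);
    return {r, std::sqrt(fromNum * fromNum + fromDen * fromDen)};
}
```

With 10000 counts in both bins the ratio 1 comes with an uncertainty of about 0.014, while with only 4 counts in both bins the same ratio has an uncertainty of about 0.71. The nearly empty bins close to the maximum energy therefore carry large errors and pull much less on a fit.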

karineklund · April 15, 2009, 2:37pm

How does this “trust” work? 1/error?

Axel · April 15, 2009, 3:26pm

Hi Karin,

what’s minimized is the sum of the (difference[i]/error[i])^2, where the difference is the difference of the point and the fit function’s value, summed over all points i.

Cheers, Axel.
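
The quantity described above can be written out in a few lines of plain C++ (no ROOT). This is only a sketch: `LineFit`, `chiSquare` and `fitLine` are hypothetical names, and the closed-form solution used here works for a straight line only, whereas ROOT minimizes numerically. It does show that each point enters with weight 1/error^2 rather than 1/error.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct LineFit {
    double p0;    // intercept
    double p1;    // slope
    double chi2;  // sum of ((y - f(x)) / error)^2 at the minimum
};

// Chi-square of the line p0 + p1*x for points (x, y) with uncertainties err.
double chiSquare(const std::vector<double>& x, const std::vector<double>& y,
                 const std::vector<double>& err, double p0, double p1) {
    assert(x.size() == y.size() && x.size() == err.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double pull = (y[i] - (p0 + p1 * x[i])) / err[i];
        sum += pull * pull;
    }
    return sum;
}

// Closed-form minimum of chiSquare for a straight line
// (weighted least squares, weight = 1/error^2).
LineFit fitLine(const std::vector<double>& x, const std::vector<double>& y,
                const std::vector<double>& err) {
    assert(x.size() == y.size() && x.size() == err.size() && x.size() >= 2);
    double S = 0.0, Sx = 0.0, Sy = 0.0, Sxx = 0.0, Sxy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double w = 1.0 / (err[i] * err[i]);
        S += w;
        Sx += w * x[i];
        Sy += w * y[i];
        Sxx += w * x[i] * x[i];
        Sxy += w * x[i] * y[i];
    }
    const double det = S * Sxx - Sx * Sx;
    assert(det != 0.0);
    const double p1 = (S * Sxy - Sx * Sy) / det;
    const double p0 = (Sxx * Sy - Sx * Sxy) / det;
    return {p0, p1, chiSquare(x, y, err, p0, p1)};
}
```

For the points (0,1), (1,1), (2,1), (3,2) with equal errors of 1 the fitted slope is 0.3. If the last point gets an error of 100 and the others 0.01, the slope is practically 0, because the last point then hardly contributes to the sum. As a rule of thumb, a chi2 close to the number of degrees of freedom (number of points minus number of fitted parameters) suggests that the line and the assigned errors are compatible with the data.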

karineklund · April 16, 2009, 5:00pm

Thanks for all the help! I still have a few questions on the subject. What happens to the errors if I rebin my histograms? I also still wonder how I can know how well the fit suits the data. Can I extract some information about that?

/Karin
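
On the rebinning question, a hedged sketch of the usual rule: when adjacent bins with independent uncertainties are merged, the contents add and the errors add in quadrature. The plain C++ below (no ROOT) illustrates that rule; `Histogram` and `rebin` are hypothetical names for this example, and the independence of the bins is an assumption.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal stand-in for a histogram: per-bin contents and uncertainties.
struct Histogram {
    std::vector<double> content;
    std::vector<double> error;
};

// Merge every `group` adjacent bins. Contents are summed; uncertainties are
// combined in quadrature, which assumes the merged bins are independent.
Histogram rebin(const Histogram& h, std::size_t group) {
    assert(group > 0 && h.content.size() == h.error.size());
    assert(h.content.size() % group == 0);
    Histogram out;
    for (std::size_t start = 0; start < h.content.size(); start += group) {
        double sum = 0.0;
        double sumErr2 = 0.0;
        for (std::size_t i = start; i < start + group; ++i) {
            sum += h.content[i];
            sumErr2 += h.error[i] * h.error[i];
        }
        out.content.push_back(sum);
        out.error.push_back(std::sqrt(sumErr2));
    }
    return out;
}
```

Summing bins like this is meaningful for the original fluence histograms but not for a histogram of ratios, so any rebinning would have to be done before the division.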

karineklund · April 17, 2009, 12:16pm

Hello again!

I have succeeded quite well with the fit I asked about at first, thank you for all the help with that!
Now I have another fit that I am confused about as well. The figure with data points and fitted line is attached at the bottom. I do:

TF1 *f_fit = new TF1("f_fit","[0]+[1]*x");
TGraph *g_RFvsSF = new TGraph("RFvsSF_Varian4MV_20cm2.txt");
g_RFvsSF->Draw("AP");
g_RFvsSF->Fit("f_fit","WW","");
f_fit->Draw("L same");

And in the statistics-window I get the information (see also attached figure):

[quote]X2/ndf 0.02624 / 325
Prob 1
p0 0.9683±0.001208
p1 0.1345±0.003135[/quote]

However in the terminal I read:

[quote] **********
** 5 **MIGRAD 5000 0.000327

MIGRAD MINIMIZATION HAS CONVERGED.
MIGRAD WILL VERIFY CONVERGENCE AND ERROR MATRIX.
FCN=0.0262358 FROM MIGRAD STATUS=CONVERGED 31 CALLS 32 TOTAL
EDM=1.31347e-023 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 p0 9.68291e-001 1.34283e-001
2 p1 1.34482e-001 3.48373e-001
FCN=0.0262358 FROM MIGRAD STATUS=CONVERGED 31 CALLS 32 TOTAL
EDM=1.31347e-023 STRATEGY= 1 ERROR MATRIX ACCURATE
EXT PARAMETER STEP FIRST
NO. NAME VALUE ERROR SIZE DERIVATIVE
1 p0 9.68291e-001 1.34283e-001
2 p1 1.34482e-001 3.48373e-001[/quote]

What is the difference between the “error” in the output in the terminal window and the ± information in the statistics window? How should I interpret the information about the Chi-square?

NB, I have not set any errors on these points, I use the “WW” option in my fit.

Regards,
Karin

brun · April 17, 2009, 1:52pm

If you read the documentation of TGraph::Fit at root.cern.ch/root/html/TGraph.html#TGraph:Fit you will find the following paragraph:

3) When fitting a TGraph (ie no errors associated to each point), a correction is applied to the errors on the parameters with the following formula: errorp *= sqrt(chisquare/(ndf-1))

Rene
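
The formula quoted above can be checked against the numbers posted earlier in the thread. The sketch below is plain C++ (no ROOT), and `rescaledError` is a hypothetical helper name; it only applies the quoted formula errorp *= sqrt(chisquare/(ndf-1)) to the errors printed in the terminal.

```cpp
#include <cassert>
#include <cmath>

// Apply the correction quoted from the TGraph::Fit documentation:
//   errorp *= sqrt(chisquare / (ndf - 1))
double rescaledError(double minimizerError, double chisquare, int ndf) {
    assert(ndf > 1 && chisquare >= 0.0);
    return minimizerError * std::sqrt(chisquare / (ndf - 1));
}
```

With chisquare = 0.0262358 and ndf = 325, the terminal error 1.34283e-001 on p0 becomes about 0.001208 and the error 3.48373e-001 on p1 becomes about 0.003135, which are the values shown in the statistics window. One common reading of such a rescaling is that, when the points carry no errors of their own, a common point uncertainty is estimated from the scatter of the points around the fitted line.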

karineklund · April 21, 2009, 9:03am

Ah, Ok, I see. Thanks for pointing this out! Would you happen to have a link to a page explaining why this correction has to be applied? Once more, thanks for your help!