I am fitting a histogram with a Landau function, but I am now wondering how the choice of binning affects the fit result and performance. Naively, I would hope that it doesn't, and that I could build a histogram with so many bins that each bin contains at most one event, thereby getting rid of the binning dependence. But I'm not sure that is true, and when I try it, the fit fails.
My question is then: is there a way to fit "unbinned data"? I see there is something on this topic in the manual, but I'm not sure it refers to the same thing.
If that’s not possible, is there a recommendation, or a good habit to apply when binning a histogram for fitting?
I have made this small script to test the UnBinData method, but the fit fails for some reason. Could someone take a look at it? It tries to fit a Landau to a set of dE/dx (stopping power) values corresponding to the ionization losses of a charged particle in silicon.
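For context, the kind of unbinned maximum-likelihood fit I am after can be sketched outside ROOT with SciPy. This is only an illustration: SciPy has no exact Landau, so the Moyal distribution is used as a common closed-form stand-in for the Landau shape, and all parameter values below are made up.

```python
import numpy as np
from scipy import stats

# Toy "dE/dx" sample: the Moyal distribution is a closed-form
# approximation of the Landau shape (SciPy has no exact Landau).
rng = np.random.default_rng(42)
true_loc, true_scale = 1.2, 0.15   # hypothetical MPV and width, for illustration
data = stats.moyal.rvs(loc=true_loc, scale=true_scale, size=5000,
                       random_state=rng)

# Unbinned maximum-likelihood fit: every event enters the likelihood
# directly, so no histogram (and hence no binning choice) is involved.
loc_hat, scale_hat = stats.moyal.fit(data)
print(f"MPV ~ {loc_hat:.3f}, width ~ {scale_hat:.3f}")
```

The point is that the likelihood is built from the individual events, not from bin contents, so the arbitrary binning choice never enters.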
I don't think fitting a TGraph would make any difference. It's not binned as such, but each element of the y vector corresponds to the content of a bin of the associated histogram, so it's effectively binned: the choice of the x vector elements defines the binning.
I do wonder whether you would be better off using RooFit itself and passing in an unbinned dataset (a RooDataSet).
As far as I can tell, RooLandau exists in RooFit, but it doesn't support analytical integration.
Attached is code written by @Da_Yu_Tou which extends the Landau PDF to be analytically integrable, in case it is of interest. Perhaps a RooFit maintainer could check whether the implementation I am providing here could be useful for the next release.
The problem is that the default unbinned likelihood fit is not extended, while your Landau function contains a constant (overall normalization) parameter. A non-extended fit cannot determine such a parameter; you should perform an extended fit instead.
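To sketch why the extended term matters (again outside ROOT, in Python, with the Moyal distribution standing in for the Landau; all names and values below are illustrative): the extended negative log-likelihood adds a Poisson term in the expected yield, which is exactly what makes an overall normalization parameter fittable.

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(0)
data = stats.moyal.rvs(loc=1.2, scale=0.15, size=2000, random_state=rng)
n_obs = len(data)

def extended_nll(params):
    # params: expected yield, location (MPV), scale (width)
    n_exp, loc, scale = params
    if n_exp <= 0 or scale <= 0:
        return np.inf
    # Extended likelihood: Poisson yield term plus the per-event shape
    # term. A plain (non-extended) NLL would drop n_exp entirely, so an
    # amplitude parameter would be left undetermined by the fit.
    shape = np.sum(stats.moyal.logpdf(data, loc=loc, scale=scale))
    return n_exp - n_obs * np.log(n_exp) - shape

res = optimize.minimize(extended_nll, x0=[n_obs, 1.0, 0.2],
                        method="Nelder-Mead",
                        options={"maxiter": 5000})
n_hat, loc_hat, scale_hat = res.x
print(f"yield ~ {n_hat:.0f}, MPV ~ {loc_hat:.3f}")
```

With no further constraints, the fitted yield comes out equal to the observed number of events, while the shape parameters are determined as usual.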
Also, I notice that my fits now "work" (converge and make sense) almost every time (I'm fitting many histograms), whereas they failed more often with the binned approach (described here: Chapter: FittingHistograms before 7.7). I'm wondering: is there a reason NOT to use this UnBinData method? It seems to me that it would always be the best you can do, given that you are not introducing the arbitrary choice of binning.
Yes, the unbinned one is the best approach for parameter estimation.
The drawbacks are the need to normalize the fitting function, which can in some cases be more computationally expensive, and the need to evaluate the function at every data point.
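To make the normalization point concrete, here is a sketch (again SciPy with the Moyal stand-in for the Landau, hypothetical values) of an unbinned fit restricted to a window, where the pdf normalization over the fit range has to be recomputed at every likelihood evaluation, and the model is evaluated at every event rather than at a handful of bin centres:

```python
import numpy as np
from scipy import stats, optimize

# Unbinned fit restricted to a window [a, b]: the pdf must be
# re-normalized over the fit range at EVERY likelihood evaluation,
# and the shape term runs over all events, not over bins.
rng = np.random.default_rng(1)
raw = stats.moyal.rvs(loc=1.2, scale=0.15, size=5000, random_state=rng)
a, b = 0.9, 3.0
data = raw[(raw > a) & (raw < b)]

def nll(params):
    loc, scale = params
    if scale <= 0:
        return np.inf
    # Normalization over the fit window, recomputed on each call:
    norm = (stats.moyal.cdf(b, loc=loc, scale=scale)
            - stats.moyal.cdf(a, loc=loc, scale=scale))
    if norm <= 0:
        return np.inf
    return -np.sum(stats.moyal.logpdf(data, loc=loc, scale=scale)
                   - np.log(norm))

res = optimize.minimize(nll, x0=[1.0, 0.2], method="Nelder-Mead",
                        options={"maxiter": 5000})
loc_hat, scale_hat = res.x
```

In a binned fit, by contrast, the function only needs to be evaluated per bin, which is why the unbinned approach can be noticeably slower on large samples.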
Cheers