Roofit problem

nijusan · June 5, 2009, 10:57pm

Hello, I’m having a strange problem with RooFit. I’m trying to do a simultaneous fit to 3 different variables. The PDF is rather complex, being built up from 6 individual components. The PDF is constructed in the following manner:

comp1 = comp1_var1 * comp1_var2 * comp1_var3;
comp2 = comp2_var1 * comp2_var2 * comp2_var3;
comp3 = comp3_var1 * comp3_var2 * comp3_var3;
comp4 = comp4_var1 * comp4_var2 * comp4_var3;
comp5 = comp5_var1 * comp5_var2 * comp5_var3;
comp6 = comp6_var1 * comp6_var2 * comp6_var3;

where each component stands for the signal shape, background shape 1, background shape 2, etc.

I then add each component together to form the final extended PDF:

sigbkg = N1*comp1 + N2*comp2 + N3*comp3 + N4*comp4 + N5*comp5 + N6*comp6;

Everything seemed fine and the fits looked very reasonable, but when I went to do a quick sanity check things started to fall apart.
I have truth-matched MC for each of the 6 components outlined above, so I know the exact number of events in each distribution. When I perform an extended maximum likelihood fit to the signal MC, I would expect the numbers returned by the fit to be reasonably comparable to the known numbers from the truth-matched sets. However, I’m off by several sigma, as high as 7 in some cases. I know the individual components aren’t to blame. I do describe a couple of the shapes with my own functions rather than the built-in RooFit functions; could this have a strange impact on the normalization? Do I need to provide a mechanism to force my functions to return 0 outside of their respective ranges? I’ve been up and down this problem and I can’t seem to figure out what is going on.
One other thing: when I turn off the extended option I get much more reasonable values, but the errors go rampant, coming out at the same order of magnitude as the fit values themselves.

Wouter_Verkerke · June 10, 2009, 10:10am

Hi,

With the information provided it is difficult to make a statement on where the problem might be.

What I understand from your posting is that the fit to the summed distribution works OK in extended mode, but that the breakdown by components doesn’t follow that from your truth MC information. There are various reasons this can happen.

[
That your fit does not converge without the extended mode option is not surprising:
you should formulate your RooAddPdf in terms of fractions instead of numbers of events, otherwise there is an unconstrained degree of freedom in the fit, hence the lack of convergence.
]
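
To make the unconstrained degree of freedom concrete, here is a minimal sketch in plain Python (not RooFit). It assumes a made-up two-component model, a unit Gaussian plus a flat background on [-5, 5], and the names nll, n_sig and n_bkg are hypothetical. Without the extended term only the ratio of the yields enters the likelihood, so scaling both yields by a common factor leaves it unchanged; the Poisson term of the extended likelihood is what fixes the overall scale.

```python
import math
import random

LO, HI = -5.0, 5.0  # assumed observable range


def gauss_pdf(x):
    # Unit Gaussian; the tiny tail outside [LO, HI] is neglected in this sketch
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def flat_pdf(x):
    return 1.0 / (HI - LO)


def nll(data, n_sig, n_bkg, extended):
    """Negative log-likelihood of a yield-weighted sum of two fixed shapes."""
    n_tot = n_sig + n_bkg
    total = 0.0
    for x in data:
        # Normalised density: only the fractions n_sig/n_tot and n_bkg/n_tot enter
        density = (n_sig * gauss_pdf(x) + n_bkg * flat_pdf(x)) / n_tot
        total -= math.log(density)
    if extended:
        # Poisson term on the total yield (constant log(N!) dropped)
        total += n_tot - len(data) * math.log(n_tot)
    return total


rng = random.Random(1)
data = [rng.gauss(0.0, 1.0) for _ in range(300)]
data += [rng.uniform(LO, HI) for _ in range(700)]

# Doubling both yields: identical value without the extended term ...
plain_a = nll(data, 300.0, 700.0, extended=False)
plain_b = nll(data, 600.0, 1400.0, extended=False)
# ... but a clearly different value once the extended term is included
ext_a = nll(data, 300.0, 700.0, extended=True)
ext_b = nll(data, 600.0, 1400.0, extended=True)

print("non-extended difference:", plain_b - plain_a)
print("extended difference:", ext_b - ext_a)
```

In a non-extended fit with yield coefficients, a minimizer can slide along that flat direction, which is consistent with errors as large as the fitted values themselves.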

Did you try to run a toy MC study (i.e. generate data from the p.d.f. and see if the fit returns what you put in)? This will test to first order whether the model is ‘sane’. If this test passes, your model is ‘technically’ OK. It might still not be able to fit your data according to the breakdown of your truth MC information, but that might be a more fundamental issue (e.g. the provided shapes cannot describe your MC).
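
As an illustration of what such a toy study checks, here is a rough sketch in plain Python rather than RooFit. Everything in it is an assumption made for the example: a two-component model (truncated unit Gaussian plus flat background), true yields of 100 and 200, and hypothetical helper names such as fit_yields and toy_study. The yields are fitted with a simple fixed-point (EM-style) iteration that solves the extended maximum likelihood conditions for fixed shapes, which is not what RooFit's minimizer does internally. For a sane model the pull of the fitted signal yield should have a mean near 0 and a width near 1.

```python
import math
import random

LO, HI = -5.0, 5.0  # assumed observable range
SIG_NORM = math.erf(HI / math.sqrt(2.0))  # unit-Gaussian integral over [LO, HI]


def sig_pdf(x):
    # Unit Gaussian normalised on [LO, HI]
    return math.exp(-0.5 * x * x) / (math.sqrt(2.0 * math.pi) * SIG_NORM)


def bkg_pdf(x):
    return 1.0 / (HI - LO)


def poisson(rng, mean):
    # Count unit-rate exponential arrivals falling inside [0, mean]
    n, t = 0, rng.expovariate(1.0)
    while t < mean:
        n += 1
        t += rng.expovariate(1.0)
    return n


def draw_sig(rng):
    # Rejection sampling keeps the signal inside the fit range
    while True:
        x = rng.gauss(0.0, 1.0)
        if LO <= x <= HI:
            return x


def generate(rng, n_sig_true, n_bkg_true):
    # Extended toy: each component's event count is Poisson-fluctuated
    data = [draw_sig(rng) for _ in range(poisson(rng, n_sig_true))]
    data += [rng.uniform(LO, HI) for _ in range(poisson(rng, n_bkg_true))]
    return data


def fit_yields(data, max_iter=1000, tol=1e-6):
    """Extended ML estimate of the signal yield and its error (shapes fixed)."""
    fs = [sig_pdf(x) for x in data]
    fb = [bkg_pdf(x) for x in data]
    n_s = n_b = 0.5 * len(data)
    for _ in range(max_iter):
        dens = [n_s * s + n_b * b for s, b in zip(fs, fb)]
        new_s = n_s * sum(s / d for s, d in zip(fs, dens))
        new_b = n_b * sum(b / d for b, d in zip(fb, dens))
        converged = abs(new_s - n_s) + abs(new_b - n_b) < tol
        n_s, n_b = new_s, new_b
        if converged:
            break
    # Error from the inverse of the 2x2 second-derivative matrix of -log L
    dens = [n_s * s + n_b * b for s, b in zip(fs, fb)]
    h_ss = sum(s * s / d ** 2 for s, d in zip(fs, dens))
    h_bb = sum(b * b / d ** 2 for b, d in zip(fb, dens))
    h_sb = sum(s * b / d ** 2 for s, b, d in zip(fs, fb, dens))
    err_s = math.sqrt(h_bb / (h_ss * h_bb - h_sb ** 2))
    return n_s, err_s


def toy_study(n_toys=100, n_sig_true=100.0, n_bkg_true=200.0, seed=42):
    rng = random.Random(seed)
    pulls = []
    for _ in range(n_toys):
        n_s, err_s = fit_yields(generate(rng, n_sig_true, n_bkg_true))
        pulls.append((n_s - n_sig_true) / err_s)
    mean = sum(pulls) / len(pulls)
    width = math.sqrt(sum((p - mean) ** 2 for p in pulls) / (len(pulls) - 1))
    return mean, width


pull_mean, pull_width = toy_study()
print(f"signal-yield pull: mean = {pull_mean:.2f}, width = {pull_width:.2f}")
```

A pull mean far from 0 or a width far from 1 in such a study would point to a technical problem in the model (for instance a normalization issue) rather than to a mismatch between the shapes and the MC.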

Wouter