Hi rooters!
I’m working with RooFit on an extended unbinned simultaneous maximum likelihood fit in three bins, and I’m running Monte Carlo toys for some statistical studies.
I have two signal pdfs with fixed parameters; the only free signal parameters are the numbers of events in each bin. A Gaussian constraint is applied to these parameters.
There are also several background pdfs, with constraints applied to them.
I see some unexpected behaviour when I generate and fit toys with few (or zero) signal events.
The fitted yields seem to fluctuate more towards the unphysical region (negative number of fitted signal events), and sometimes I get a negative fitted number of signal events with a large absolute value (e.g. -100), even though the fit appears to converge perfectly and the covariance matrix quality is 3.
I noticed that in these cases the number of invalid NLL evaluations (retrieved with the numInvalidNLL method) tends to be quite high (> 50, while it is usually below 5); it seems that the fit ends up in an unphysical region and can’t get out of it.
Sometimes the covariance matrix quality is lower than 3, and in this case too the fitted number of signal events ends up far in the unphysical region.
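To picture how invalid NLL evaluations can pile up at negative yields, here is a minimal plain-Python sketch. It is not RooFit and not my actual model: the one-dimensional Gaussian signal, the flat background on [0, 10], the evenly spaced "data" and all names (`gauss_pdf`, `extended_nll`) are assumptions made up for illustration. The only point is that once the signal yield is negative enough, the total density becomes non-positive at some event and the logarithm in the extended likelihood is undefined.

```python
import math

def gauss_pdf(x, mean, sigma):
    """Gaussian density (truncation to the fit range is neglected here)."""
    z = (x - mean) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def extended_nll(data, n_sig, n_bkg, lo=0.0, hi=10.0, mean=5.0, sigma=0.5):
    """Extended NLL for signal + flat background.

    Returns None when the likelihood cannot be evaluated, i.e. when the
    expected total yield or the density at any event is non-positive.
    """
    total = n_sig + n_bkg
    if total <= 0.0:
        return None
    nll = total
    for x in data:
        density = n_sig * gauss_pdf(x, mean, sigma) + n_bkg / (hi - lo)
        if density <= 0.0:
            return None  # log of a non-positive number: invalid evaluation
        nll -= math.log(density)
    return nll

# Hypothetical background-only "dataset": 50 evenly spaced points in [0, 10].
data = [0.1 + 0.2 * i for i in range(50)]

# Scan the signal yield from clearly negative to positive values.
scan = list(range(-40, 21, 5))
results = {n_sig: extended_nll(data, float(n_sig), 50.0) for n_sig in scan}
n_invalid = sum(1 for value in results.values() if value is None)

for n_sig in scan:
    state = "invalid" if results[n_sig] is None else "ok"
    print(f"n_sig = {n_sig:4d}: {state}")
print("invalid evaluations in scan:", n_invalid)
```

In this toy every point of the scan below a certain negative yield is invalid, so a minimiser that wanders into that region would keep hitting evaluations it cannot use.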
I have attached a plot showing the distribution of the total fitted number of signal events for the two signals (obtained by summing the output from the three bins) when I generate 10000 toys with 0 signal events. As you can see, there is a bulk centred at (0,0), as expected, but also many toys that fluctuate far into the unphysical region.
I was wondering how reliable the fit results are when I get apparently perfect convergence but a high numInvalidNLL, and when I get a covariance matrix quality different from 3.
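For concreteness, the bookkeeping behind this question could look like the sketch below. It is plain Python with invented placeholder values, not real toy results: the fields of the hypothetical `ToyFit` record are meant to hold what `status()`, `covQual()` and `numInvalidNLL()` of a `RooFitResult` return, and the cut of 5 invalid evaluations is just an assumption taken from the "usually below 5" observation above, not a recommended threshold.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class ToyFit:
    """Hypothetical per-toy summary copied from a RooFitResult."""
    status: int           # minimiser status, 0 meaning no error reported
    cov_qual: int         # covariance matrix quality, 3 meaning full and accurate
    num_invalid_nll: int  # number of invalid NLL evaluations during the fit
    n_sig: float          # total fitted signal yield

def classify(fit, max_invalid=5):
    """Put a toy into one category; max_invalid is an assumed cut."""
    if fit.status != 0:
        return "not converged"
    if fit.cov_qual != 3:
        return "covQual != 3"
    if fit.num_invalid_nll > max_invalid:
        return "many invalid NLL"
    return "clean"

# Placeholder entries for illustration only, not results from my toys.
toys = [
    ToyFit(status=0, cov_qual=3, num_invalid_nll=2, n_sig=1.4),
    ToyFit(status=0, cov_qual=3, num_invalid_nll=63, n_sig=-100.0),
    ToyFit(status=0, cov_qual=2, num_invalid_nll=12, n_sig=-85.0),
    ToyFit(status=4, cov_qual=1, num_invalid_nll=70, n_sig=-120.0),
]

counts = Counter(classify(toy) for toy in toys)
for category, n_toys in counts.items():
    print(f"{category}: {n_toys}")
```

Splitting the toys this way would let the yield distribution be plotted separately for each category, which is where I am unsure which categories can be trusted.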
Thank you for your help.
Cheers,
Fabio