TGraph input

radbalint · August 12, 2008, 1:33pm

Hi Wouter,

another (perhaps naive) question: can the input to RooDataHist be a TGraph instead of a TH1?

The reason is that in ROOT there are no asymmetric errors for TH1 (basically RooFit delivers it…:-)) as far as I know. But then one could use a TGraph and calculate (with whatever sophisticated methods) his/her asymmetric errors and then feed this into RooFit (and force RooFit to use the input error bars) via TGraph and then build up the final data+backg model out of this matter.

Is there a plan for TGraph as an input?

Cheers,
Balint

brun · August 12, 2008, 2:05pm

I assume that you mean TGraphAsymmErrors

Rene

radbalint · August 12, 2008, 2:30pm

Hi Rene,

yes, I meant calculating the asymmetric errors with TGraphAsymmErrors, using this as an input to RooFit and asking RooFit to use these errors.

So basically I just want to represent a TH1 with a TGraphAsymmErrors so that I have the possibility to use (if possible) asymmetric errors for ‘histogram bins’ as an input, and in this way also avoid the rounding of non-integer bin contents due to the Poisson error calculation - if I understand correctly.

Thanks,
Balint

Wouter_Verkerke · August 28, 2008, 3:52pm

Hi Balint,

That should not be too difficult. Note though that errors on data points are never used when you perform a likelihood fit, only in chi2 fits. In RooFit chi2 fits, any errors that are stored in a TH1 are automatically imported into a RooDataHist and used. Along those lines it is relatively straightforward to make a ctor that imports a TGraph, as a RooDataHist makes no assumption on uniform binning anyway.
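
As a minimal sketch of that distinction (plain Python, not RooFit code; the function names and numbers are made up for illustration): a binned Poisson likelihood is built from the bin counts and the model prediction only, whereas a chi2 divides each residual by the error stored for that bin, so only the chi2 changes when the stored errors change.

```python
import math

def poisson_nll(counts, expected):
    """Binned Poisson negative log-likelihood: no per-bin error enters."""
    return sum(mu - n * math.log(mu) + math.lgamma(n + 1)
               for n, mu in zip(counts, expected))

def chi2(counts, errors, expected):
    """Chi-square: every residual is weighted by the error stored for its bin."""
    return sum(((n - mu) / e) ** 2
               for n, e, mu in zip(counts, errors, expected))

# Hypothetical two-bin example
counts = [10, 20]
expected = [12.0, 18.0]
sqrt_n_errors = [math.sqrt(n) for n in counts]   # sqrt(N) errors
inflated_errors = [2.0 * e for e in sqrt_n_errors]  # user-supplied, twice as large

nll = poisson_nll(counts, expected)                      # has no error argument at all
chi2_sqrt_n = chi2(counts, sqrt_n_errors, expected)      # about 0.6
chi2_inflated = chi2(counts, inflated_errors, expected)  # about 0.15
```

Doubling the errors reduces the chi2 by a factor of four, while the likelihood value has no way of seeing them.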

NB: The Poisson error bars that RooFit draws by default are merely for visualization (you can request the ‘usual’ SumW2 error bars through the option DataError(RooAbsData::SumW2)); this does not affect a fit in any way.

Wouter

radbalint · September 6, 2008, 7:55pm

Hi Wouter,

Thanks for the info. To be honest I think I need to write down what I am doing: I am trying to do a multivariate analysis using the likelihood ratio method to extract the ratio of signal events from some distributions. Now since I am limited in statistics I have very large statistical errors on the bin contents of my likelihood ratio output histogram from the MVA. In such a situation one can generate toy Monte Carlo distributions (with RooFit) after fitting the LR outputs (for bg and signal test samples, etc.) with some polynomials, and do the fractional fit for the signal and background using these toy Monte Carlos. One may also simulate the fluctuation of the number of events of the toy Monte Carlos, etc.
So far, it is - let’s say - okay. I am interested in the details of how these fractional fits are being done.

If I understand correctly, the fractional fit is similar to any other fit, but there are additional parameters (the event count coefficients for sig and bg) and/or additional likelihood terms in a likelihood fit in order to take into account additional constraints.
As I read in the RooFit manual: a Poisson fluctuation on the total number of events generated can be requested in the extended likelihood formalism.

Now my problem is that since I am very limited in statistics (in the physics event generator), the polynomial fits on the original LR output distributions are quite uncertain.
For each LR output bin there will be an additional error whose square (from Gaussian error propagation, or from Taylor expansion) is: the transpose of the derivative vector of the fit polynomial, times the fit cov. matrix, times the derivative vector of the polynomial (derivative wrt the fit parameters), evaluated at the LR output value of each bin.
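
Written out for a polynomial f(x) = sum_k p_k x^k, the derivative with respect to p_k is simply x^k, so the propagation described above could be sketched as follows. This is an illustrative sketch in plain Python, not RooFit code; the covariance matrix and bin centres are hypothetical numbers, and the product gives a variance, so the per-bin error is its square root.

```python
def poly_gradient(n_params, x):
    """Derivatives of f(x) = sum_k p_k * x**k with respect to each p_k."""
    return [x ** k for k in range(n_params)]

def fit_variance(cov, x):
    """g^T * cov * g, with g the derivative vector evaluated at x."""
    g = poly_gradient(len(cov), x)
    n = len(g)
    return sum(g[i] * cov[i][j] * g[j] for i in range(n) for j in range(n))

# Hypothetical covariance matrix of a first-order polynomial fit (p0, p1)
cov = [[0.04, -0.01],
       [-0.01, 0.09]]

# Hypothetical LR output bin centres
bin_centres = [0.0, 0.5, 1.0]

# Additional 'fit' error for each bin
fit_errors = [fit_variance(cov, x) ** 0.5 for x in bin_centres]
```

At x = 0 only the constant term contributes, so the variance reduces to cov[0][0]; at x = 1 it is the sum of all elements of the matrix.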

So, my question is: how can I propagate these additional ‘systematic fit errors’ into the fractional fit in RooFit (or in TFractionFitter)? This is not equivalent to e.g. a Poisson fluctuation on the total number of events generated (if I understand correctly); it comes from the fact that I have a small number of physics events generated originally (that I use to train my multivariate analysis to obtain the LR output), which leads to a “constant” uncertainty in the LR output shape, and subsequently in the toy Monte Carlo shapes. It is there even if I increase the number of events generated in the toy Monte Carlo to infinity.
Can it be “modelled” with this additional fluctuation feature? Or can a “chi2 fractional fit” be used instead to solve this problem (because it uses the errors on data points)? I think under any circumstances I need to take this error into account in my fractional fit procedure. How can I do that? I thought I could use (in the most general way) the TGraphAsymmErrors to add these additional systematic errors to the statistical bin errors. But they are not used in a likelihood world… and I would still need them to be used :).

Or do I totally misunderstand something?

Thanks for the help,
Balint