Effect of the Conditional() modifier in RooProdPdf

mo1 · August 19, 2009, 2:40pm

Hi,

Can anybody (hi Wouter!) explain the effect of the Conditional() modifier in RooProdPdf? From a mathematical point of view, it shouldn’t be needed: one could just use a “regular” product of PDFs (that is, RooProdPdf without the modifier) to multiply a conditional PDF by the distribution of its conditioning variable, and so “upgrade” it to a full 2D PDF.

I noticed, however, that it makes a difference when generating data from the full PDF. I have a 4D PDF: three regular variables and one entering through the above multiplication. When I use the Conditional() modifier, everything is fine. Without it, RooFit warns me that the generated distributions might not be accurate.

Thanks,

Moritz

Wouter_Verkerke · August 20, 2009, 7:44am

Hi Moritz,

Here is the explanation:

Take two p.d.f.s F(x,y) and G(y). Since RooFit p.d.f.s make no assumption about which
variables are observables, model F can be used in multiple ways:

F(x,y) = f(x,y) / int f(x,y) dx dy
F(x|y) = f(x,y) / int f(x,y) dx
F(y|x) = f(x,y) / int f(x,y) dy

where f(x,y) is the ‘raw’ unnormalized value of F.
If you now construct a product of F and G(y), you can again do this in multiple
ways:

 F(x|y)*G(y) = [ f(x,y) * g(y) ] / [ int f(x,y) dx * int g(y) dy ]
 F(x,y)*G(y) = [ f(x,y) * g(y) ] / [ int f(x,y) * g(y) dx dy ]

The first one treats F(x|y) as the p.d.f. for x given a value of y, and G(y) as the p.d.f. of y.
This form is constructed with the Conditional() modifier and is the most sensible way
to construct the product.

The second one treats the product of the ‘raw’ values f(x,y)*g(y) as the 2-D p.d.f. for x and y. The resulting y distribution is now a combined effect of p.d.f.s F and G. This
form is only rarely useful. But note that if f(x,y) is flat in y, it results in the same
function as the conditional case.
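
As a rough numerical illustration of the two constructions above, here is a minimal pure-Python sketch (no RooFit involved). The shapes raw_f and raw_g, the unit-square range and the helper y_means are hypothetical choices made only for this example; they are not taken from the thread or from RooFit.

```python
# Toy comparison of F(x|y)*G(y) and F(x,y)*G(y) on the unit square.
# raw_f and raw_g are illustrative assumptions, not RooFit objects.

N = 100
H = 1.0 / N
GRID = [(i + 0.5) * H for i in range(N)]  # midpoints of [0, 1]


def raw_f(x, y):
    # Unnormalized f(x,y); its integral over x depends on y.
    return 1.0 + y + x * y


def raw_g(y):
    # Unnormalized g(y).
    return y


def y_means(f, g):
    """Return the mean of y under (conditional product, plain product)."""
    norm_g = sum(g(y) for y in GRID) * H
    norm_fg = sum(f(x, y) * g(y) for x in GRID for y in GRID) * H * H
    mean_cond = 0.0
    mean_plain = 0.0
    for y in GRID:
        int_f_dx = sum(f(x, y) for x in GRID) * H  # int f(x,y) dx at this y
        for x in GRID:
            cond = f(x, y) / int_f_dx * g(y) / norm_g   # F(x|y) * G(y)
            plain = f(x, y) * g(y) / norm_fg            # F(x,y) * G(y)
            mean_cond += y * cond * H * H
            mean_plain += y * plain * H * H
    return mean_cond, mean_plain


if __name__ == "__main__":
    print(y_means(raw_f, raw_g))
    # Same check with an f that is flat in y: both products coincide.
    print(y_means(lambda x, y: 1.0 + x, raw_g))
```

With these toy shapes the conditional product gives a mean y of about 0.667, which is just the mean of G(y), while the plain product gives about 0.708 because f pulls the y distribution as well. With the f that is flat in y, the two means agree.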

There are also some practical differences. In the conditional case, the event generation
can be factorized: first I generate a value for y from g, then a value for x from f, given
that value of y. In the second case this is not possible, and accept-reject sampling
in the 2-D phase space needs to happen, which is less efficient. If you have additional variables
(as you do), this may push it into a 3-D or 4-D phase space, which is progressively
more difficult to sample accurately (as indicated by the warning message).
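
The difference between the two generation strategies can be sketched in plain Python. This is only a toy under stated assumptions, not how RooFit's generator is implemented: the shapes raw_f and raw_g, the unit-square range, the hand-derived inverse CDFs and the function names are all hypothetical choices for this example.

```python
import math
import random


def raw_f(x, y):
    # Unnormalized f(x,y) on [0,1] x [0,1] (illustrative assumption).
    return 1.0 + y + x * y


def raw_g(y):
    # Unnormalized g(y) on [0,1] (illustrative assumption).
    return y


def generate_factorized(n, rng):
    """Sample F(x|y)*G(y): first y from G, then x from F(x|y)."""
    events = []
    for _ in range(n):
        # G(y) = 2y has CDF y^2, so y = sqrt(u).
        y = math.sqrt(rng.random())
        # F(x|y) has CDF ((1+y)x + y x^2/2) / (1 + 1.5y); invert the quadratic
        # in a form that stays stable for small y.
        b = 1.0 + y
        c = rng.random() * (1.0 + 1.5 * y)
        x = 2.0 * c / (b + math.sqrt(b * b + 2.0 * y * c))
        events.append((x, y))
    return events


def generate_accept_reject(n, rng, fg_max=3.0):
    """Sample F(x,y)*G(y) by accept-reject in the full 2-D space.

    fg_max is the maximum of raw_f * raw_g on the unit square (at x = y = 1).
    Returns the events and the fraction of trials that were accepted.
    """
    events = []
    tries = 0
    while len(events) < n:
        tries += 1
        x, y = rng.random(), rng.random()
        if rng.random() * fg_max < raw_f(x, y) * raw_g(y):
            events.append((x, y))
    return events, n / tries


if __name__ == "__main__":
    rng = random.Random(1)
    factorized = generate_factorized(10000, rng)
    rejected, efficiency = generate_accept_reject(10000, rng)
    print(sum(y for _, y in factorized) / len(factorized))
    print(sum(y for _, y in rejected) / len(rejected), efficiency)
```

In this toy the factorized generator uses every random draw, while the accept-reject generator keeps only about one trial in three. That fraction generally gets worse as dimensions are added or the function becomes more peaked, which is the efficiency concern described above.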

Wouter

mo1 · August 20, 2009, 8:05am

Thanks Wouter, this actually helps a lot!