Automatic marginalization

Given a joint distribution $p(\mathrm{d}x, \mathrm{d}y) = p(\mathrm{d}x)\,p(\mathrm{d}y \mid x)$, marginalization is the computation: $p(\mathrm{d}y) = \int_{\mathsf{X}} p(\mathrm{d}x)\,p(\mathrm{d}y \mid x)$. That is, we marginalize out $x$ (as indicated by the integral over its domain $\mathsf{X}$) to obtain the marginal distribution over $y$.

When $x$ is a discrete-valued random variable, we may instead write the integral as a sum: $p(\mathrm{d}y) = \sum_{x \in \mathsf{X}} p(x)\,p(\mathrm{d}y \mid x)$.
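As a purely mathematical illustration of this sum (not a statement about which forms are recognized), suppose $x$ takes values in $\mathsf{X} = \{0, 1\}$ with $p(x = 1) = \rho$. The marginal over $y$ is then a two-term mixture:

$$p(\mathrm{d}y) = (1 - \rho)\,p(\mathrm{d}y \mid x = 0) + \rho\,p(\mathrm{d}y \mid x = 1).$$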

Tip

Automatic marginalization is supported for standard conjugate forms, linear transformations, and sums and differences of discrete random variables.
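For example, a Gaussian random variable appearing through a linear (affine) transformation in the mean of another Gaussian can be handled. A minimal sketch, in the same style as the examples below and assuming this particular way of writing the affine mean:

x ~ Gaussian(0.0, 1.0);
y <~ Gaussian(2.0*x + 1.0, 1.0);

Here, the affine-Gaussian joint distribution is recognized, so x can be marginalized out before y is simulated.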

A random variable is trivially marginalized out whenever it does not yet have a value but has a distribution associated with it via the assume (~) operator. This is trivial because, unless the random variable is subsequently used, it is never simulated, and so is marginalized out. More interesting are situations where a random variable is used but nonetheless remains marginalized out. Consider:

x ~ Beta(2.0, 2.0);
y <~ Bernoulli(x);
Here, x can be marginalized out and y simulated from its marginal distribution, as the beta-Bernoulli form of the joint distribution between x and y is recognized. Specifically, the code defines $p(\mathrm{d}x, \mathrm{d}y) = p(\mathrm{d}x)\,p(\mathrm{d}y \mid x)$, and from this, automatic marginalization computes $p(\mathrm{d}y)$, a beta-Bernoulli distribution, which was not explicitly defined.
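As a worked instance of the integral above, with $x \sim \mathrm{Beta}(\alpha, \beta)$ and $y \mid x \sim \mathrm{Bernoulli}(x)$,

$$p(y = 1) = \int_0^1 \mathrm{Beta}(x; \alpha, \beta)\, x \,\mathrm{d}x = \frac{\alpha}{\alpha + \beta},$$

so with $\alpha = \beta = 2$ as in the code, y is simulated with $p(y = 1) = 1/2$.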

Similarly, we could use:

x ~ Gamma(2.0, 1.0);
y <~ Poisson(x);
because the gamma-Poisson form is recognized, or:

x ~ Gaussian(0.0, 4.0);
y ~ Gaussian(x, 4.0);
z <~ Gaussian(y, 4.0);
because chains of Gaussians are recognized. In this latter case, both x and y can be marginalized out.
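For reference, both marginals have closed forms (assuming that the second argument of Gamma is a scale $\theta$ and the second argument of Gaussian is a variance $\sigma^2$; the parameterizations are not confirmed here). For the gamma-Poisson pair, with $x \sim \mathrm{Gamma}(k, \theta)$ and $y \mid x \sim \mathrm{Poisson}(x)$, the marginal is negative binomial:

$$p(y) = \binom{y + k - 1}{y} \left(\frac{1}{1 + \theta}\right)^{k} \left(\frac{\theta}{1 + \theta}\right)^{y},$$

which for $k = 2$ and $\theta = 1$ gives $p(y) = (y + 1)/2^{y + 2}$. For the Gaussian chain, the variances simply accumulate: marginally $y \sim \mathrm{Gaussian}(0, 4 + 4)$ and $z \sim \mathrm{Gaussian}(0, 4 + 4 + 4)$.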