The approximating posterior distribution needed in ensemble learning is defined over all possible hidden state sequences and all parameter values. The approximation is chosen to be of a factorial form, with one factor for the hidden state sequences and one for the parameters.
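Written out, the factorial form can be sketched as follows. The symbols are notation assumed here rather than taken from the text: $\mathbf{M}$ stands for the hidden state sequence and $\boldsymbol{\theta}$ for the collection of all parameters.

```latex
\[
  q(\mathbf{M}, \boldsymbol{\theta}) = q(\mathbf{M})\, q(\boldsymbol{\theta})
\]
```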

The approximation over the hidden state sequences is a discrete distribution, and it factorises further into a product of simpler discrete factors.
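One plausible way to write this factorisation, assuming it follows the first-order Markov structure of the hidden states, is given below. The notation is assumed here: $M_t$ is the hidden state at time $t$, $\mathbf{M} = (M_1, \ldots, M_T)$ is the whole sequence, and $T$ is its length.

```latex
\[
  q(\mathbf{M}) = q(M_1) \prod_{t=1}^{T-1} q(M_{t+1} \mid M_t)
\]
```

Under this assumption the free quantities are the initial probabilities $q(M_1 = i)$ and the transition probabilities $q(M_{t+1} = i \mid M_t = j)$.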

The parameters of this distribution are the discrete probabilities of the initial state and of the state transitions.
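A minimal Python sketch of how such a discrete approximation could be stored and evaluated, assuming a first-order chain with one initial distribution and one transition table per time step. The names `q_init`, `q_trans`, `sequence_probability` and `state_marginals`, and all the numbers, are hypothetical and chosen only for illustration.

```python
from itertools import product


def sequence_probability(q_init, q_trans, states):
    """Probability of one state sequence under a first-order chain.

    q_init[i]        : probability that the first state is i
    q_trans[t][j][i] : probability of moving from state j at step t
                       to state i at step t + 1 (steps are 0-based)
    """
    prob = q_init[states[0]]
    for t in range(len(states) - 1):
        prob *= q_trans[t][states[t]][states[t + 1]]
    return prob


def state_marginals(q_init, q_trans):
    """Per-step state marginals, obtained by pushing q_init through the chain."""
    marginals = [list(q_init)]
    for table in q_trans:
        prev = marginals[-1]
        n = len(prev)
        marginals.append(
            [sum(prev[j] * table[j][i] for j in range(n)) for i in range(n)]
        )
    return marginals


# Hypothetical example: two states, three time steps.
q_init = [0.6, 0.4]
q_trans = [
    [[0.9, 0.1], [0.2, 0.8]],  # step 0 -> step 1
    [[0.5, 0.5], [0.3, 0.7]],  # step 1 -> step 2
]

# Summing over every possible sequence should give total probability one.
total = sum(
    sequence_probability(q_init, q_trans, seq)
    for seq in product(range(2), repeat=3)
)
marginals = state_marginals(q_init, q_trans)
```

Storing one table per step keeps the number of free probabilities linear in the sequence length, whereas an unrestricted distribution over sequences would need one probability per sequence.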

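The distribution over the parameters is likewise formed as a product of independent distributions for the different parameters. A parameter with a Dirichlet prior has a posterior approximation that is either a single Dirichlet distribution, when the parameter is one probability vector, or a product of Dirichlet distributions, when the parameter collects several probability vectors.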
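As an illustration of the two cases, and assuming the usual hidden Markov model parameters, the approximations could be written as below. All symbols are assumed notation: $\boldsymbol{\pi}$ is the vector of initial state probabilities, $\mathbf{A}$ is the transition matrix with rows $\mathbf{a}_i$, and the hatted quantities are the parameters of the approximating Dirichlet distributions.

```latex
\[
  q(\boldsymbol{\pi}) = \mathrm{Dirichlet}(\boldsymbol{\pi};\, \hat{\boldsymbol{\pi}}),
  \qquad
  q(\mathbf{A}) = \prod_{i} \mathrm{Dirichlet}(\mathbf{a}_i;\, \hat{\mathbf{a}}_i)
\]
```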

These are in fact the optimal choices among all possible distributions, given the assumed factorisation.
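A short sketch of why a Dirichlet form is optimal, worked for a vector of initial state probabilities. The setup is an assumption made for illustration: $\boldsymbol{\pi}$ has the prior $\mathrm{Dirichlet}(\boldsymbol{\pi}; \mathbf{u})$, it enters the model only through $p(M_1 = i \mid \boldsymbol{\pi}) = \pi_i$, and the approximation over the state sequences is held fixed. Minimising the ensemble learning cost over $q(\boldsymbol{\pi})$ alone then gives

```latex
\begin{align*}
  q(\boldsymbol{\pi})
    &\propto p(\boldsymbol{\pi})\,
       \exp\Bigl( \sum_{i} q(M_1 = i) \ln \pi_i \Bigr) \\
    &\propto \prod_{i} \pi_i^{\,u_i - 1 + q(M_1 = i)} ,
\end{align*}
```

which is again a Dirichlet distribution, with parameters $u_i + q(M_1 = i)$. The same argument applies row by row to a matrix of probabilities, which is where a product of Dirichlet distributions comes from.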

The parameters with Gaussian priors have Gaussian posterior approximations, each specified by its own mean and variance.
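For a single scalar parameter, such an approximation can be written as below. The notation is assumed here: $\theta_i$ is the parameter, $\bar{\theta}_i$ is the mean of the approximation and $\tilde{\theta}_i$ is its variance.

```latex
\[
  q(\theta_i) = \mathcal{N}(\theta_i;\, \bar{\theta}_i,\, \tilde{\theta}_i)
\]
```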

All these parameters are assumed to be independent.