Bayesian methods for data analysis

Including the effects of noise in the model of a time series leads to the world of statistics: it is no longer possible to talk about exact events, only about their probabilities.

The *Bayesian framework* offers the mathematically soundest basis
for doing statistical work. In this chapter, a brief review of the
most important results and tools of the field is presented.

Section 3.1 concentrates on the basic ideas
of Bayesian statistics. Unfortunately, exact application of those
methods is usually not possible. Therefore,
Section 3.2 discusses some practical approximation
methods that give reasonably good results with limited
computational resources. The learning algorithms presented in this
work are based on the approximation method called *ensemble
learning*, which is presented in
Section 3.3.

This chapter contains many formulas involving probabilities. The
notation $p(x)$ is used for both the probability of a discrete event $x$
and the value of the *probability density function* (pdf) of a
continuous variable at $x$, depending on what $x$ is. All the
theoretical results presented apply equally to both cases, at least
when integration over a discrete variable is interpreted in the
Lebesgue sense as summation.
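
As a small illustration of this convention, consider the expectation of a function of the variable. Here $f$ is an arbitrary function and $x_i$ are the possible values of a discrete variable; both are introduced only for this example. Writing $p(x)$ for the density or the probability, as appropriate, the same expression covers both cases:

```latex
% Continuous x: p(x) is a probability density function
\mathrm{E}[f(x)] = \int f(x)\, p(x)\, dx

% Discrete x with possible values x_i: the integral reduces to a sum
\mathrm{E}[f(x)] = \sum_i f(x_i)\, p(x_i)
```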

Some authors use subscripts to distinguish different pdfs, but here they are omitted to simplify the notation. Each pdf is identified only by the argument of $p$, so that, for example, $p(x)$ and $p(y)$ are in general different functions.

Two important probability distributions, the Gaussian or normal distribution and the Dirichlet distribution, are presented in Appendix A. The notation $x \sim N(\mu, \sigma^2)$ is used to denote that $x$ is normally distributed with mean $\mu$ and variance $\sigma^2$.
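
For quick reference, the standard density of a scalar Gaussian variable $x$ with mean $\mu$ and variance $\sigma^2$ is sketched below. These symbols are the conventional ones and are assumed here; Appendix A remains the authoritative presentation.

```latex
% Univariate Gaussian density with mean \mu and variance \sigma^2
p(x) = \frac{1}{\sqrt{2 \pi \sigma^2}}
       \exp\left( -\frac{(x - \mu)^2}{2 \sigma^2} \right)
```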