This week introduces the decision-theoretic foundation of Bayesian inference.
We study how posterior distributions lead naturally to optimal decisions when losses or utilities are specified, and apply the theory to point estimation and hypothesis testing.
8.1 Learning Goals
By the end of this week, you should be able to:
Describe the Bayesian decision-theoretic framework.
Define loss functions and posterior expected loss.
Derive Bayes rules for common loss functions.
Apply Bayesian decision principles to estimation and classification.
Distinguish between point estimation, interval estimation, and decision-making contexts.
8.2 Lecture 1 — Principles of Bayesian Decision Theory
8.2.1 1.1 Motivation
Statistical inference often involves making decisions under uncertainty:
select an action \(a\) based on observed data \(y\).
Each action has a loss (or utility) depending on the true parameter value \(\theta\).
8.2.2 1.2 The Decision-Theoretic Setup
Parameter: \(\theta \in \Theta\)
Data: \(y\)
Action space: \(\mathcal{A}\)
Loss function: \(L(a,\theta)\)
After observing \(y\), the Bayesian chooses an action \(a(y)\) minimizing posterior expected loss: \[
\rho(a\mid y) = E[L(a,\theta)\mid y] = \int L(a,\theta)\,p(\theta\mid y)\,d\theta.
\]
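As a quick numeric illustration (a minimal sketch; the \(N(1,1)\) posterior and the squared-error loss are assumptions chosen for demonstration), the integral can be approximated by averaging the loss over posterior draws:
set.seed(1)
theta_draws <- rnorm(10000, mean = 1, sd = 1)   # draws from an assumed N(1, 1) posterior
rho <- function(a) mean((a - theta_draws)^2)    # Monte Carlo estimate of E[L(a, theta) | y]
sapply(c(0, 0.5, 1, 1.5), rho)                  # smallest near a = 1, the posterior mean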
Under squared-error loss \(L(a,\theta) = (a-\theta)^2\), the Bayes rule is the posterior mean. Interpretation: with a prior centered at zero, the Bayes rule shrinks the estimate toward zero (the prior mean), especially for small \(|y|\).
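One concrete instance (a sketch assuming the conjugate model \(y \mid \theta \sim N(\theta, 1)\) with prior \(\theta \sim N(0, \tau^2)\), for which the posterior mean is \(y\,\tau^2/(\tau^2+1)\)):
posterior_mean <- function(y, tau2, sigma2 = 1) {
  (tau2 / (tau2 + sigma2)) * y   # pulls y toward the prior mean 0
}
posterior_mean(y = c(0.2, 1, 3), tau2 = 1)   # each estimate is halved toward 0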
8.2.5 1.5 Decision Rules and Risk
The Bayes risk is the expected loss averaged over data and parameters: \[
r(a) = E[L(a(Y),\theta)] = \int\!\!\int L(a(y),\theta)\,p(y,\theta)\,dy\,d\theta.
\]
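A minimal Monte Carlo sketch of this quantity (assuming, for illustration, \(\theta \sim N(0,1)\) and \(y \mid \theta \sim N(\theta,1)\), for which the posterior-mean rule is \(a(y) = y/2\)):
set.seed(2)
theta <- rnorm(1e5)                 # theta ~ N(0, 1)
y     <- rnorm(1e5, mean = theta)   # y | theta ~ N(theta, 1)
mean((y / 2 - theta)^2)   # Bayes risk of the posterior-mean rule (about 0.5)
mean((y - theta)^2)       # Bayes risk of the naive rule a(y) = y (about 1)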
For a given prior, a rule minimizing the Bayes risk is a Bayes rule; under mild conditions (e.g. when the Bayes rule is unique), a Bayes rule is admissible (cannot be uniformly improved upon).
8.2.6 1.6 Example — Hypothesis Testing with 0–1 Loss
Under 0–1 loss, the Bayes rule selects the hypothesis with the higher posterior probability. Suppose the posterior is \(\theta \mid y \sim N(1,1)\) and \(H_1: \theta > 0\):
set.seed(8)
theta_draws <- rnorm(5000, mean = 1, sd = 1)
mean(theta_draws > 0)  # posterior probability of H1
[1] 0.8406
Since this probability exceeds 0.5, the Bayes rule accepts \(H_1\) over \(H_0: \theta \le 0\).
8.3 Lecture 2 — Applications and Extensions
8.3.1 2.1 Bayesian Credible Intervals as Decision Regions
For a loss that penalizes both the length of the interval and exclusion of the true parameter,
the credible interval minimizing posterior expected loss is the highest posterior density (HPD) interval: the shortest interval containing a fixed posterior probability (e.g. 95%).
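A minimal sketch of computing such an interval directly from posterior draws (the helper shortest_interval is our own illustration, applied to an assumed \(N(1,1)\) posterior):
shortest_interval <- function(draws, prob = 0.95) {
  sorted <- sort(draws)
  n_in   <- ceiling(prob * length(sorted))    # draws each candidate interval must contain
  starts <- seq_len(length(sorted) - n_in + 1)
  widths <- sorted[starts + n_in - 1] - sorted[starts]
  i      <- which.min(widths)                 # narrowest candidate window
  c(lower = sorted[i], upper = sorted[i + n_in - 1])
}
set.seed(3)
shortest_interval(rnorm(5000, mean = 1, sd = 1))  # roughly 1 +/- 1.96 for a N(1, 1) posterior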
8.3.2 2.2 Utility Formulation
Utility \(U(a,\theta)\) is simply the negative of loss.
Maximizing expected utility is equivalent to minimizing expected loss: \[
a^*(y) = \arg\max_a E[U(a,\theta)\mid y].
\]
This framing is often used in economics and decision analysis.
8.3.4 2.4 Connection to Frequentist Estimation
Under certain priors and symmetric losses, Bayes rules coincide with frequentist estimators: under a flat prior the posterior mode equals the MLE, and when the resulting posterior is symmetric the posterior mean does as well.
Bayesian decision theory thus generalizes classical estimation.
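A quick check of this correspondence (a sketch assuming a binomial likelihood with a flat Beta(1, 1) prior, so the posterior is Beta(y + 1, n - y + 1)):
y <- 7; n <- 20                # observed successes and trials
a <- y + 1; b <- n - y + 1     # Beta(a, b) posterior under the flat prior
c(mle = y / n, posterior_mode = (a - 1) / (a + b - 2))   # both equal 0.35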
8.3.5 2.5 Example — Optimal Cutoff for a Diagnostic Test
Let \(\theta\) denote disease presence (1 = disease), and write \(p_1 = P(\theta = 1 \mid y)\), \(p_0 = 1 - p_1\).
If false negatives cost 5× more than false positives, calling "disease" is optimal whenever its expected loss \(p_0\) falls below the expected loss \(5p_1\) of calling "no disease": \[
5p_1 > p_0 \;\Longleftrightarrow\; \frac{p_1}{p_0} > \frac{1}{5} \;\Longleftrightarrow\; p_1 > \tfrac{1}{6} \approx 0.167.
\]
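A numeric check of this threshold (the 5:1 cost ratio comes from the example; the grid of posterior probabilities is our illustration):
p1 <- seq(0, 1, by = 0.01)     # posterior probability of disease
loss_negative <- 5 * p1        # call "no disease": expected loss 5 * P(diseased)
loss_positive <- 1 - p1        # call "disease":    expected loss 1 * P(healthy)
min(p1[loss_positive < loss_negative])   # smallest p1 favoring "disease": 0.17 on this grid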