This week we study the sampling distribution of the ordinary least squares estimator under the normal linear model. This allows us to quantify uncertainty, estimate the error variance, construct confidence intervals, perform hypothesis tests, and distinguish between inference for the mean response and prediction for a future observation.
4.1 Learning Objectives
By the end of this week, students should be able to:
state the normal linear model;
derive the distribution of the OLS estimator;
obtain an unbiased estimator of \(\sigma^2\);
understand the role of chi-square, \(t\), and \(F\) distributions in linear regression;
construct confidence intervals for regression coefficients and mean responses;
perform hypothesis tests for individual coefficients and general linear hypotheses;
distinguish between confidence intervals for the mean response and prediction intervals for a new observation.
4.2 Reading
Recommended reading for this week:
Seber and Lee:
sections on the distribution theory of least squares estimators
estimation of the error variance
inference for regression coefficients
Montgomery, Peck, and Vining:
sections on confidence intervals, hypothesis testing, and prediction in linear regression
Under the normality assumption \(\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \sigma^2 \mathbf{I}_n)\), exact sampling distributions can be derived for the OLS estimator, the residual sum of squares, and many test statistics.
6.1 Why normality matters
Without normality, OLS is still unbiased under the standard moment assumptions, but exact \(t\) and \(F\) inference generally no longer holds in finite samples.
Normality gives us:
the exact distribution of \(\hat{\boldsymbol{\beta}}\);
an exact chi-square distribution for the residual sum of squares;
independence of \(\hat{\boldsymbol{\beta}}\) and \(\mathrm{SSE}\).
This is special and extremely useful. It is what allows us to replace the unknown \(\sigma\) with \(\hat{\sigma}\) and obtain exact \(t\) and \(F\) distributions.
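In symbols, these facts combine as follows; this is a preview of results derived later in the week:

```latex
\hat{\beta}_j \sim N\!\left(\beta_j,\; \sigma^2\,[(\mathbf{X}^\top\mathbf{X})^{-1}]_{jj}\right),
\qquad
\frac{\mathrm{SSE}}{\sigma^2} \sim \chi^2_{n-p},
\qquad
T = \frac{\hat{\beta}_j - \beta_j}{\hat{\sigma}\sqrt{[(\mathbf{X}^\top\mathbf{X})^{-1}]_{jj}}} \sim t_{n-p}.
```

The \(t_{n-p}\) distribution of \(T\) follows precisely because the normal numerator and the chi-square quantity in the denominator are independent.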
Intuitively:
\(\hat{\boldsymbol{\beta}}\) depends on the projected part of \(\mathbf{Y}\) onto \(\mathcal{C}(\mathbf{X})\);
\(\mathrm{SSE}\) depends on the orthogonal residual part.
Because these parts are orthogonal and jointly normal, they are independent.
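The orthogonality at the heart of this argument is easy to verify numerically. Below is a minimal Python sketch (using an arbitrary simulated design matrix, chosen purely for illustration) checking that the hat matrix \(\mathbf{H}\) and \(\mathbf{I} - \mathbf{H}\) annihilate each other:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative design matrix: intercept plus one simulated covariate.
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# H projects Y onto C(X); I - H projects onto the orthogonal complement.
H = X @ np.linalg.inv(X.T @ X) @ X.T
M = np.eye(n) - H

# H(I - H) = 0: the fitted part HY (which determines beta-hat) and the
# residual part (I - H)Y (which determines SSE) live in orthogonal spaces.
# Under joint normality, zero correlation implies independence.
print(np.max(np.abs(H @ M)))  # essentially zero, up to floating-point error
```
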
In regression output from summary(lm(…)), the key columns are:
Estimate: the estimated coefficient;
Std. Error: the estimated standard deviation of the estimator;
t value: the test statistic for testing whether the coefficient equals zero;
Pr(>|t|): the corresponding \(p\)-value.
The output also reports:
residual standard error;
degrees of freedom;
\(R^2\) and adjusted \(R^2\);
an overall \(F\) test.
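The arithmetic behind the coefficient table can be sketched directly. The following Python snippet (with simulated data and illustrative parameter values, not tied to any example in the notes) reproduces each column for a simple regression:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Simulated data from an assumed model y = 1 + 2x + noise (illustrative only).
n = 40
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(scale=0.5, size=n)

X = np.column_stack([np.ones(n), x])
p = X.shape[1]

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)                 # Estimate
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - p)                         # SSE / (n - p)
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))   # Std. Error
t_vals = beta_hat / se                                       # t value
p_vals = 2 * stats.t.sf(np.abs(t_vals), df=n - p)            # Pr(>|t|)

for name, b, s, t, pv in zip(["(Intercept)", "x"], beta_hat, se, t_vals, p_vals):
    print(f"{name:12s} {b:8.4f} {s:8.4f} {t:8.2f} {pv:10.3g}")
```

Note that the \(p\)-value uses the \(t_{n-p}\) distribution, not the normal, because \(\sigma\) has been replaced by \(\hat{\sigma}\).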
We will discuss the overall ANOVA-style decomposition more formally soon.
17.1 In-Class Discussion Questions
Why does normality lead to exact finite-sample inference?
Why do we divide SSE by \(n-p\) rather than \(n\)?
Why are prediction intervals wider than confidence intervals for the mean response?
Why is independence between \(\hat{\boldsymbol{\beta}}\) and SSE so important?
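For the question about dividing by \(n-p\), a quick Monte Carlo check makes the answer concrete. This is an illustrative Python sketch (the dimensions, design, and true parameters are invented for the demonstration):

```python
import numpy as np

rng = np.random.default_rng(2)

# Check that SSE/(n - p) is unbiased for sigma^2, while SSE/n is biased low.
# All values below are illustrative choices, not from the notes.
n, p, sigma2 = 20, 3, 4.0
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
H = X @ np.linalg.inv(X.T @ X) @ X.T
beta = np.array([1.0, -0.5, 2.0])

reps = 20_000
sse = np.empty(reps)
for r in range(reps):
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    resid = y - H @ y          # residuals are (I - H)Y
    sse[r] = resid @ resid

print(sse.mean() / (n - p))    # close to the true sigma^2 = 4
print(sse.mean() / n)          # systematically below 4
```

Fitting \(p\) parameters consumes \(p\) degrees of freedom, so \(E[\mathrm{SSE}] = \sigma^2 (n-p)\); dividing by \(n\) would understate the error variance.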
17.2 Practice Problems
Conceptual
1. Explain the difference between the sampling distribution of \(\hat{\beta}_j\) and the distribution of \(Y_i\).
2. Explain why \(\hat{\sigma}^2 = \mathrm{SSE}/(n-p)\) is unbiased.
3. Explain why the \(t\) distribution appears instead of the normal distribution.
\(\hat{\boldsymbol{\beta}} = (2, -1)^\top\), and \(\hat{\sigma}^2 = 4\).
1. Find the standard error of \(\hat{\beta}_2\).
2. Construct a confidence interval for \(\beta_2\) using a generic critical value \(t^\star\).
3. For \(x_0 = (1,3)^\top\), compute the estimated variance of the fitted mean.
4. Write down the form of the prediction interval at \(x_0\).
Complete the following tasks:
derive the distribution of \(\hat{\boldsymbol{\beta}}\) under the normal linear model;
prove that \(\hat{\sigma}^2 = \mathrm{SSE}/(n-p)\) is unbiased;
derive the \(t\) statistic for one coefficient;
construct a confidence interval for the mean response at a chosen covariate value;
construct a prediction interval for a future observation at the same covariate value;
fit a regression model in R and interpret all coefficient-level inferential output.
Summary
This week we moved from estimation to inference under the normal linear model.