In this first week, we introduce the mathematical language and statistical framework that will be used throughout the course. Our focus is on the notation of vectors and matrices, vectors of random variables, expectation and covariance operators, and the basic form of the linear regression model.
2.1 Learning Objectives
By the end of this week, students should be able to:
use standard matrix notation for linear statistical models;
distinguish between scalars, vectors, matrices, random variables, and random vectors;
compute expectations and covariance matrices for random vectors;
interpret the linear regression model in matrix form;
understand why projection and least squares will play a central role in this course.
2.2 Reading
Recommended reading for this week:
Seber and Lee, Chapter 1:
1.1 Notation
1.2 Statistical Models
1.3 Linear Regression Models
1.4 Expectation and Covariance Operators
Optional preview:
Chapter 2: Multivariate Normal Distribution
2.3 Why Linear Statistical Analysis?
Linear statistical analysis is one of the central foundations of graduate statistics. Many methods that at first look different are built on the same underlying structure:
regression,
analysis of variance,
analysis of covariance,
prediction,
model comparison,
and parts of generalized linear modelling.
A major goal of this course is to treat these topics within a single unified framework.
2.4 A unifying point of view
A large part of the course can be summarized by the model
\[
Y = X\beta + \varepsilon,
\]
where
\(Y\) is a response vector,
\(X\) is a design matrix,
\(\beta\) is an unknown parameter vector,
\(\varepsilon\) is a random error vector.
This compact expression contains a great deal of statistical structure. Over the semester, we will study how to estimate \(\beta\), quantify uncertainty, test hypotheses, diagnose model failures, and make predictions.
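As a small preview, the matrix form translates directly into computation. The sketch below (Python/NumPy for illustration; the homework uses R, and the coefficient values and sample size here are hypothetical) builds a design matrix, simulates \(Y = X\beta + \varepsilon\), and recovers an estimate of \(\beta\) by least squares:

```python
import numpy as np

rng = np.random.default_rng(0)

n = 100
x = rng.uniform(0, 10, size=n)
X = np.column_stack([np.ones(n), x])   # design matrix: intercept column plus x
beta = np.array([2.0, 0.5])            # hypothetical "true" parameter vector
eps = rng.normal(0.0, 1.0, size=n)     # random error vector

y = X @ beta + eps                     # the model Y = X beta + epsilon

# Least squares estimate of beta (studied in detail later in the course)
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)                        # close to the true values (2.0, 0.5)
```

The same estimate could be obtained in R with `lm(y ~ x)`; the point here is only that every piece of the compact expression has a concrete computational counterpart.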
2.5 Basic Notation
2.6 Scalars, vectors, and matrices
We use the following conventions throughout the course:
scalars are written in lowercase italic letters, such as \(a\), \(b\), \(n\);
vectors are written in bold lowercase letters, such as \(\mathbf{x}\), \(\mathbf{y}\);
matrices are written in bold uppercase letters, such as \(\mathbf{X}\), \(\mathbf{A}\);
random variables are often written in uppercase letters, such as \(Y\);
realizations of random variables are written in lowercase letters, such as \(y\).
These notational conventions are standard in classical linear regression and will be used throughout the course.
2.12 Statistical Models
A statistical model is a set of probability distributions that may plausibly describe the data-generating mechanism.
2.12.1 General idea
Suppose we observe data \(y\) from a random quantity \(Y\). A model introduces assumptions about the distribution of \(Y\), often indexed by an unknown parameter \(\theta\).
For example:
\[
Y \sim N(\mu, \sigma^2)
\]
with unknown parameters \(\mu\) and \(\sigma^2\).
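To make the idea of an indexed family of distributions concrete, the following sketch (Python/NumPy; the parameter values are fixed only for simulation and would be unknown in practice) draws data from this model and computes the natural estimates of \(\mu\) and \(\sigma^2\):

```python
import numpy as np

rng = np.random.default_rng(1)

mu, sigma = 5.0, 2.0                 # unknown in practice; fixed here to simulate
y = rng.normal(mu, sigma, size=500)  # observed data y from the model

# Natural estimates of the unknown parameters
mu_hat = y.mean()                    # sample mean estimates mu
sigma2_hat = y.var(ddof=1)           # sample variance (unbiased) estimates sigma^2
print(mu_hat, sigma2_hat)
```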
In regression, the model is not only about the marginal distribution of the response but also about how the mean changes with explanatory variables.
2.12.2 Deterministic part and random part
A useful way to think about a statistical model is as a decomposition of the response into a deterministic part, which describes the mean structure, and a random part, which captures unexplained variation. In the linear model \(Y = X\beta + \varepsilon\), the deterministic part is \(X\beta\) and the random part is \(\varepsilon\).
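A minimal simulation sketch of the deterministic-plus-random view (Python/NumPy; the mean function and noise level are hypothetical): averaging many replicates of the response at a fixed value of the explanatory variable recovers the deterministic part.

```python
import numpy as np

rng = np.random.default_rng(3)

def f(x):
    """Hypothetical deterministic part (mean function)."""
    return 1.0 + 2.0 * x

x0 = 3.0
# Each observation = deterministic part + random part
y = f(x0) + rng.normal(0.0, 1.5, size=100_000)

# Averaging replicates at fixed x0 recovers the deterministic part f(x0)
print(y.mean())   # close to f(3.0) = 7.0
```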
5 Discussion Questions
1. Why is it helpful to write regression models in matrix form rather than only scalar notation?
2. What does the covariance matrix tell us that separate variances do not?
3. What is the interpretation of the column space of \(\mathbf{X}\)?
4. In what sense is regression a projection problem?
6 Practice Problems
Conceptual
1. Explain the difference between a random variable and a random vector.
2. Explain why \(\mathrm{Var}(\mathbf{Y})\) must be a symmetric matrix.
3. Give an example of a statistical model outside regression.
For the linear regression model \(Y = X\beta + \varepsilon\), write out the matrices \(\mathbf{Y}\), \(\mathbf{X}\), \(\boldsymbol{\beta}\), and \(\boldsymbol{\varepsilon}\) for \(n=5\) observations.
6.1 Suggested Homework
Complete the following:
review matrix multiplication and transpose rules;
derive \(\mathbb{E}[\mathbf{A}\mathbf{Y}+\mathbf{b}]\) from first principles;
derive \(\mathrm{Var}(\mathbf{A}\mathbf{Y})\) using the definition of covariance;
write the simple linear regression model in matrix form for a dataset of your choice;
fit a simple regression in R and report:
the estimated coefficients,
fitted values,
residuals,
and a scatterplot with the fitted line.
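The two moment derivations above can also be checked numerically. This sketch (Python/NumPy; the particular \(\mathbf{A}\), \(\mathbf{b}\), mean vector, and covariance matrix are illustrative choices) simulates a random vector \(\mathbf{Y}\) and compares empirical moments of \(\mathbf{A}\mathbf{Y}+\mathbf{b}\) against the identities \(\mathbb{E}[\mathbf{A}\mathbf{Y}+\mathbf{b}] = \mathbf{A}\mathbb{E}[\mathbf{Y}]+\mathbf{b}\) and \(\mathrm{Var}(\mathbf{A}\mathbf{Y}) = \mathbf{A}\,\mathrm{Var}(\mathbf{Y})\,\mathbf{A}^\top\):

```python
import numpy as np

rng = np.random.default_rng(2)

mu = np.array([1.0, -1.0])              # E[Y]
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])          # Var(Y): symmetric positive definite
A = np.array([[1.0, 2.0],
              [0.0, 3.0]])
b = np.array([4.0, -2.0])

Y = rng.multivariate_normal(mu, Sigma, size=200_000)   # rows are draws of Y
Z = Y @ A.T + b                                        # draws of A Y + b

mean_theory = A @ mu + b                # E[AY + b] = A E[Y] + b
var_theory = A @ Sigma @ A.T            # Var(AY) = A Var(Y) A^T

# Both differences should be small for a large simulation
print(np.abs(Z.mean(axis=0) - mean_theory).max())
print(np.abs(np.cov(Z.T) - var_theory).max())
```

A simulation cannot replace the algebraic derivation, but agreement between the empirical and theoretical moments is a useful check that the derived formulas are correct.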
6.2 Summary
This week introduced the notation and basic probabilistic tools needed for the rest of the course. We defined random vectors, mean vectors, covariance matrices, and the matrix form of the linear regression model. These ideas will support everything that follows.
Next week, we will study least squares estimation and the geometry of projection in more detail.
For an optional review of matrix algebra, see Chapter 14.