Optimal Regression Design
Introduction to Optimal Regression Design
In statistical modeling, especially in regression analysis, we often face the question:
“Where and how should we collect data to obtain the most accurate parameter estimates?”
This question is answered by Optimal Experimental Design, a field that seeks to identify the best sampling scheme under a specified statistical model.
🎯 General Objective
Consider the regression model:
\[ Y_i = g(x_i;\boldsymbol{\beta}) + \varepsilon_i,\quad \varepsilon_i \sim \mathcal{N}(0, \sigma^2) \] where:
- Note \(\boldsymbol{\beta}\in \mathbb{R}^p\) are the parameters to estimate,
- \(g(\cdot)\) is mean function,
- \(x_i \in \mathcal{X}\) are the design points in a compact design space.
The Fisher Information Matrix for a weighted design \(\xi = \{(x_i, w_i)\}\) is:
\[ M(\xi) = \int_\mathcal{X}\frac{\partial g}{\partial \boldsymbol{\beta}}\frac{\partial g}{\partial \boldsymbol{\beta}^\top}\, \xi(dx), \] which for a discrete design reduces to \(M(\xi) = \sum_i w_i \left.\frac{\partial g}{\partial \boldsymbol{\beta}}\frac{\partial g}{\partial \boldsymbol{\beta}^\top}\right|_{x = x_i}\).
We wish to choose design weights \(w_i \geq 0\), \(\sum w_i = 1\), to maximize information about \(\boldsymbol{\beta}\).
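As a minimal sketch of this discrete-design computation (my own illustration, assuming a quadratic mean function \(g(x;\boldsymbol{\beta}) = \beta_0 + \beta_1 x + \beta_2 x^2\), so that the gradient is \(f(x) = (1, x, x^2)^\top\)):

```python
import numpy as np

# Hypothetical example: quadratic regression, gradient f(x) = (1, x, x^2)^T.
def info_matrix(xs, ws):
    """Fisher information M(xi) = sum_i w_i f(x_i) f(x_i)^T (taking sigma^2 = 1)."""
    F = np.vstack([[1.0, x, x**2] for x in xs])      # rows are f(x_i)^T
    return F.T @ (np.asarray(ws)[:, None] * F)

# A three-point design on [-1, 1] with equal weights
xs = [-1.0, 0.0, 1.0]
ws = [1/3, 1/3, 1/3]
M = info_matrix(xs, ws)
print(np.round(M, 3))
```

The resulting \(3\times 3\) matrix is symmetric and positive definite, which is what the optimality criteria below operate on.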
🧮 Common Optimality Criteria
| Criterion | Objective | Interpretation |
|---|---|---|
| \(D\)-optimality | Maximize \(\det M(\xi)\) | Minimize the volume of the confidence ellipsoid |
| \(A\)-optimality | Minimize \(\operatorname{tr}\{M(\xi)^{-1}\}\) | Minimize the average variance of the estimates |
| \(E\)-optimality | Maximize \(\lambda_{\min}\{M(\xi)\}\) | Minimize the variance in the worst-case direction |
Each criterion reflects a different goal:
- \(D\)-opt aims for overall information efficiency.
- \(A\)-opt targets average estimation precision.
- \(E\)-opt guards against poorly estimated directions.
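These criteria can be compared numerically. The following sketch (my own example, assuming a simple straight-line model with \(f(x) = (1, x)^\top\)) evaluates all three for two candidate designs on \([-1, 1]\):

```python
import numpy as np

# Compare D-, A-, and E-criterion values for two candidate designs
# under a straight-line model, f(x) = (1, x)^T.
def criteria(xs, ws):
    F = np.vstack([[1.0, x] for x in xs])
    M = F.T @ (np.asarray(ws)[:, None] * F)           # information matrix
    return {
        "D (det M)":      np.linalg.det(M),
        "A (tr M^-1)":    np.trace(np.linalg.inv(M)),
        "E (lambda_min)": np.linalg.eigvalsh(M)[0],
    }

# Design 1: equal mass at the endpoints; Design 2: spread over three points
print(criteria([-1.0, 1.0], [0.5, 0.5]))
print(criteria([-1.0, 0.0, 1.0], [1/3, 1/3, 1/3]))
```

Here the endpoint design dominates the three-point design on all three criteria, consistent with the classical result that for straight-line regression on \([-1,1]\) the optimal design puts equal mass at \(\pm 1\).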
There are also more complex criteria, for example minimax criteria. If the model is misspecified, the ordinary least squares estimator (OLSE) is biased, with \(\operatorname{bias}(\hat{\boldsymbol{\beta}}) = E[\hat{\boldsymbol{\beta}}] - \boldsymbol{\beta}\). The minimax \(D\)- and \(A\)-optimality criteria are then \[ \min_{w \in \mathbb{R}^n} \max_{\boldsymbol{\beta}\in \mathbb{R}^p} \log\det\{\operatorname{MSE}(\hat{\boldsymbol{\beta}})\} \] and \[ \min_{w \in \mathbb{R}^n} \max_{\boldsymbol{\beta}\in \mathbb{R}^p} \operatorname{tr}\{\operatorname{MSE}(\hat{\boldsymbol{\beta}})\}. \]
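For reference, the matrix mean squared error appearing in these criteria decomposes into a variance term and a squared-bias term, which is why bias enters the design problem under misspecification:
\[ \operatorname{MSE}(\hat{\boldsymbol{\beta}}) = E\bigl[(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta})(\hat{\boldsymbol{\beta}}-\boldsymbol{\beta})^\top\bigr] = \operatorname{Var}(\hat{\boldsymbol{\beta}}) + \operatorname{bias}(\hat{\boldsymbol{\beta}})\,\operatorname{bias}(\hat{\boldsymbol{\beta}})^\top. \]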
There are further criteria as well, such as \(c\)-optimality (minimizing the variance of a linear combination of parameters), \(G\)-optimality (minimizing the maximum prediction variance over the design space), Bayesian optimal design, \(K\)-optimality (minimizing the condition number of the information matrix), and others.
⚙️ Problem Statement
The optimal design problem is thus:
\[ \begin{aligned} &\max_{w \in \mathbb{R}^n} && \Phi\{M(w)\} \\ &\text{s.t.} && w_i \geq 0,\quad \sum_{i=1}^n w_i = 1 \end{aligned} \]
where \(\Phi\) is one of the optimality criteria (e.g., \(\lambda_{\min}(M)\) for \(E\)-optimality).
This leads to a convex optimization problem, well suited to numerical solution with packages such as CVXR.
In the following sections, we demonstrate how to solve this optimization problem in R, using \(E\)-optimality as a case study.
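Before the R implementation, a rough Python analogue may help fix ideas. The sketch below is my own illustration (assumptions: a quadratic model, a 21-point grid on \([-1,1]\), and scipy's SLSQP solver standing in for CVXR); it maximizes \(\lambda_{\min}\{M(w)\}\) over the weight simplex:

```python
import numpy as np
from scipy.optimize import minimize

# E-optimal design sketch: quadratic model, gradient f(x) = (1, x, x^2)^T,
# weights optimized over a fixed grid of candidate design points.
grid = np.linspace(-1.0, 1.0, 21)
F = np.vstack([[1.0, x, x**2] for x in grid])

def neg_lambda_min(w):
    M = F.T @ (w[:, None] * F)
    return -np.linalg.eigvalsh(M)[0]      # negate: we maximize lambda_min

n = len(grid)
res = minimize(
    neg_lambda_min,
    x0=np.full(n, 1.0 / n),               # start from uniform weights
    bounds=[(0.0, 1.0)] * n,              # w_i >= 0
    constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],  # sum to 1
    method="SLSQP",
)
w_opt = res.x
support = grid[w_opt > 1e-3]              # points carrying non-negligible weight
print("support points:", np.round(support, 2))
print("lambda_min:", -res.fun)
```

Note that \(\lambda_{\min}\) is nondifferentiable where eigenvalues cross, so a proper semidefinite-programming formulation (as CVXR allows) is the more reliable route; the general-purpose solver is used here only to keep the sketch short.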
TODO
- Add examples
- Add SLSE
- Add more optimality criteria
- Equivalence theorem