SLSEdesign: Optimal designs using the second-order least squares estimator
================
Chi-Kuang Yeh, Julie Zhou, Jason Hou-Liu
June 04, 2025
Description
This package computes optimal regression designs under the second-order least squares estimator (SLSE).
Installation
SLSEdesign is available on CRAN and can be installed by typing
install.packages("SLSEdesign")
or you may install the development version by typing
devtools::install_github("chikuang/SLSEdesign") # or pak::pkg_install("chikuang/SLSEdesign")
library(SLSEdesign)
Details
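Consider a general regression model, $`y_i=\eta(\mathbf{x}_i;\mathbf{\theta})+\epsilon_i`$ for $`i=1,\ldots,n`$, where $`y_i`$ is the $`i`$-th observation of a response variable $`y`$ at design point $`\mathbf{x}_i\in S`$, $`S`$ is a design space, $`\mathbf{\theta}`$ is the unknown regression parameter vector, the response function $`\eta(\mathbf{x}_i;\mathbf{\theta})`$ can be a linear or nonlinear function of $`\mathbf{\theta}`$, and the errors $`\epsilon_i`$ are assumed to be uncorrelated with mean zero and finite variance $`\sigma^2`$.
Let $`\hat{\mathbf{\theta}}`$ be an estimator of $`\mathbf{\theta}`$, such as the least squares estimator. Various optimal designs are defined by minimizing $`\phi\{\mathrm{Cov}(\hat{\mathbf{\theta}})\}`$ over the design points $`\mathbf{x}_1,\ldots,\mathbf{x}_n`$, where the function $`\phi(\cdot)`$ can be the determinant, the trace, or another scalar function. The resulting designs are called optimal exact designs (OEDs), which depend on the response function $`\eta(\cdot)`$, the design space $`S`$, the estimator $`\hat{\mathbf{\theta}}`$, the scalar function $`\phi(\cdot)`$, and the number of points $`n`$.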
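To make the role of the scalar function concrete, the sketch below evaluates the determinant and trace criteria for one fixed design. It assumes the simplest setting, a cubic polynomial model under the ordinary least squares estimator with homoscedastic errors, where the covariance of the estimator is proportional to the inverse of the information matrix. The helpers `f_cubic` and `info_matrix` are hypothetical names written for this illustration and are not part of SLSEdesign.

```r
# Regression vector f(x) for the cubic polynomial model (illustrative helper)
f_cubic <- function(x) c(1, x, x^2, x^3)

# Information matrix of a weighted design: M = sum_i w_i f(x_i) f(x_i)^T
info_matrix <- function(location, weight, f) {
  Fmat <- t(sapply(location, f))   # one row per support point
  t(Fmat) %*% (weight * Fmat)      # row i of Fmat is scaled by weight[i]
}

# An equally weighted four-point design on [-1, 1]
design <- data.frame(location = c(-1, -1 / sqrt(5), 1 / sqrt(5), 1),
                     weight = rep(0.25, 4))
M <- info_matrix(design$location, design$weight, f_cubic)

d_value <- det(solve(M))         # determinant criterion
a_value <- sum(diag(solve(M)))   # trace criterion
c(D = d_value, A = a_value)
```

Under either criterion a smaller value corresponds to a more informative design, so competing designs can be compared by recomputing `M` for each of them.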
The second-order least squares estimator (SLSE) is defined as
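In the notation used in the comparison below, and following Wang and Leblanc (2008), the estimator minimizes a weighted quadratic form in the first- and second-order differences. The shorthand $`\rho_i`$ is an assumption introduced here for compactness and does not appear elsewhere in this README:

```math
(\hat{\mathbf{\theta}}^\top, \hat{\sigma}^2)^\top
  = \arg\min_{\mathbf{\theta},\,\sigma^2}
    \sum_{i=1}^{n} \rho_i(\mathbf{\theta},\sigma^2)^\top \, W(\mathbf{x}_i) \, \rho_i(\mathbf{\theta},\sigma^2),
\qquad
\rho_i(\mathbf{\theta},\sigma^2) =
\begin{pmatrix}
  y_i - \eta(\mathbf{x}_i;\mathbf{\theta}) \\
  y_i^2 - \eta^2(\mathbf{x}_i;\mathbf{\theta}) - \sigma^2
\end{pmatrix}
```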
Comparison between ordinary least-squares and second order least-squares estimators
Note that $`W(\mathbf{x}_i)`$ is a $`2\times 2`$ non-negative definite matrix which may or may not depend on $`\mathbf{x}_i`$ (Wang and Leblanc, 2008). SLSE is a natural extension of OLSE, which is defined using only the first-order difference function (i.e. $`y_i-\mathbb{E}[y_i]=y_i-\eta(\mathbf{x}_i;\mathbf{\theta})`$). SLSE uses not only the first-order difference function but also the second-order difference function (i.e. $`y_i^2-\mathbb{E}[y_i^2]=y_i^2-(\eta^2(\mathbf{x}_i;\mathbf{\theta})+\sigma^2)`$).
SLSE also has disadvantages. It is not a linear estimator and has no closed-form solution, so it requires more computation than OLSE. However, it can now be computed numerically without difficulty. As a result, SLSE is a powerful alternative estimator to consider in research studies and real-life applications.
In particular, if the skewness parameter $`t`$ is set to zero, the resulting optimal designs under SLSE and OLSE are the same.
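The skewness parameter is not defined in this section. As an assumption drawn from Gao and Zhou (2014) rather than from this README, it is a function of the second, third, and fourth moments of the error distribution:

```math
t = \frac{\mu_3^2}{\sigma^2\,(\mu_4 - \sigma^4)},
\qquad
\mu_3 = \mathbb{E}[\epsilon_i^3],
\quad
\mu_4 = \mathbb{E}[\epsilon_i^4]
```

Under this definition a symmetric error distribution has $`\mu_3 = 0`$ and therefore $`t = 0`$, which is the case where the SLSE and OLSE designs coincide.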
Examples
D-optimal design of the 3rd order polynomial regression model
A partial derivative of the mean function is required:
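The calls to `Dopt()` below pass `FUN = poly3`. A minimal sketch of that function, assuming it mirrors `poly3_no_intercept` defined later in this README with an intercept term added. For a model that is linear in the parameters the gradient does not depend on `theta`; the argument is kept only so the signature matches.

```r
# Gradient of the cubic mean function
#   eta(x; theta) = theta_1 + theta_2 * x + theta_3 * x^2 + theta_4 * x^3
# with respect to theta, returned as a 4 x 1 column matrix.
# Assumed definition: modelled on poly3_no_intercept plus an intercept.
poly3 <- function(xi, theta) {
  matrix(c(1, xi, xi^2, xi^3), ncol = 1)
}

# Evaluate the gradient at a single design point
poly3(0.5, theta = rep(1, 4))
```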
We first calculate the D-optimal design when the skewness parameter $`t`$ is set to zero. The resulting D-optimal design should be the same as the optimal design under the ordinary least-squares estimator.
my_design <- Dopt(N = 31, u = seq(-1, 1, length.out = 31),
                  tt = 0, FUN = poly3, theta = rep(1, 4), num_iter = 500)
my_design$design
#    location    weight
# 1      -1.0 0.2615264
# 10     -0.4 0.2373288
# 22      0.4 0.2373288
# 31      1.0 0.2615264
my_design$val
# 5.133616
Now we look at the situation when the skewness parameter $`t`$ is in the interval (0, 1], for instance, $`t = 0.7`$.
my_design <- Dopt(N = 31, u = seq(-1, 1, length.out = 31),
                  tt = 0.7, FUN = poly3, theta = rep(1, 4), num_iter = 500)
my_design$design
#    location    weight
# 1      -1.0 0.2714088
# 10     -0.4 0.2287621
# 22      0.4 0.2287621
# 31      1.0 0.2714088
my_design$val
# 6.27293
Draw the equivalence theorem plot for the D-optimal design:
design <- data.frame(location = c(-1, -0.447, 0.447, 1),
                     weight = rep(0.25, 4))
u <- seq(-1, 1, length.out = 201)
plot_dispersion(u, design, tt = 0, FUN = poly3,
                theta = rep(0, 4), criterion = "D")
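The condition behind such a plot can also be checked numerically. The sketch below covers only the classical OLSE case (matching `tt = 0`), where a design is D-optimal if $`f(x)^\top M^{-1} f(x) - q \le 0`$ over the whole design space, with $`q`$ the number of parameters. It uses base R only and is not a reproduction of what `plot_dispersion()` computes internally; `f_cubic` is a hypothetical helper.

```r
# Regression vector for the cubic polynomial model (illustrative helper)
f_cubic <- function(x) c(1, x, x^2, x^3)

# Equally weighted design on -1, -1/sqrt(5), 1/sqrt(5), 1
# (-0.447 and 0.447 in the call above are these values rounded)
location <- c(-1, -1 / sqrt(5), 1 / sqrt(5), 1)
weight <- rep(0.25, 4)

# Information matrix M = sum_i w_i f(x_i) f(x_i)^T
F_design <- t(sapply(location, f_cubic))
M <- t(F_design) %*% (weight * F_design)

# Dispersion function d(x) = f(x)^T M^{-1} f(x) - q on a grid over [-1, 1]
u <- seq(-1, 1, length.out = 201)
F_grid <- t(sapply(u, f_cubic))
q <- ncol(F_grid)
d <- rowSums((F_grid %*% solve(M)) * F_grid) - q

# For a D-optimal design the maximum is zero up to rounding error
max(d)
```

The function touches zero at the support points and stays below zero elsewhere, which is the shape the equivalence theorem plot is expected to show for an optimal design.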
D-optimal design of the 3rd order polynomial regression model without intercept
In the last example, the support points did not change as $`t`$ increased. However, this is not always the case, and the optimal design may depend on $`t`$.
poly3_no_intercept <- function(xi, theta){
  matrix(c(xi, xi^2, xi^3), ncol = 1)
}
my_design <- Dopt(N = 31, u = seq(-1, 1, length.out = 31),
                  tt = 0, FUN = poly3_no_intercept, theta = rep(1, 3), num_iter = 500)
my_design$design
#    location    weight
# 1      -1.0 0.3275005
# 7      -0.6 0.1565560
# 25      0.6 0.1565560
# 31      1.0 0.3275005
my_design$val
# 3.651524
my_design <- Dopt(N = 31, u = seq(-1, 1, length.out = 31),
                  tt = 0.9, FUN = poly3_no_intercept, theta = rep(1, 3), num_iter = 500)
my_design$design
#    location    weight
# 1      -1.0 0.2888423
# 10     -0.4 0.2096781
# 22      0.4 0.2096781
# 31      1.0 0.2888423
my_design$val
# 4.892601
References
- Gao, Lucy L. and Zhou, Julie. (2017). D-optimal designs based on the second-order least squares estimator. Statistical Papers, 58, 77–94.
- Gao, Lucy L. and Zhou, Julie. (2014). New optimal design criteria for regression models with asymmetric errors. Journal of Statistical Planning and Inference, 149, 140–151.
- Wang, Liqun and Leblanc, Alexandre. (2008). Second-order nonlinear least squares estimation. Annals of the Institute of Statistical Mathematics, 60, 883–900.
- Yeh, Chi-Kuang and Zhou, Julie. (2021). Properties of optimal regression designs under the second-order least squares estimator. Statistical Papers, 62, 75–92.
- Yin, Yue and Zhou, Julie. (2017). Optimal designs for regression models using the second-order least squares estimator. Statistica Sinica, 27, 1841–1856.