2 Optimization
The optimization plays an important role in statistical computing, especially in the context of maximum likelihood estimation (MLE) and other statistical inference methods. This chapter will cover various optimization techniques used in statistical computing.
For instance, for a linear regression \[ y = X\boldsymbol{\beta}+ \varepsilon. \] From regression class, we know that the (ordinary) least-squares estimation (OLE) for \(\boldsymbol{\beta}\) is given by \(\hat{\boldsymbol{\beta}}=(X^\top X)^{-1} X^\top y\). It is convenient as the solution is in the closed-form! However, in the most case, the closed-form solutions will not be available.
For GLMs or non-linear regression, we need to do this iterativelly!
2.1 Speed comparison
set.seed(2017-07-13)
<- matrix(rnorm(5000 * 100), 5000, 100)
X <- rnorm(5000)
y library(microbenchmark)
microbenchmark(solve(t(X) %*% X) %*% t(X) %*% y)
Warning in microbenchmark(solve(t(X) %*% X) %*% t(X) %*% y): less accurate
nanosecond times to avoid potential integer overflows
Unit: milliseconds
expr min lq mean median uq
solve(t(X) %*% X) %*% t(X) %*% y 28.57097 29.23575 30.04752 29.87713 30.62448
max neval
32.21641 100
microbenchmark(solve(t(X) %*% X) %*% t(X) %*% y,
solve(crossprod(X), crossprod(X, y)))
Unit: milliseconds
expr min lq mean median
solve(t(X) %*% X) %*% t(X) %*% y 28.39886 29.03231 29.73657 29.41512
solve(crossprod(X), crossprod(X, y)) 24.96937 25.01504 25.25663 25.05457
uq max neval
30.23379 32.94231 100
25.16457 28.50168 100
2.2 Type of Optimization Algorithms
There are in general two types of the optimization algorithms: (1). deterministic and (2). metaheuristic. Deterministic and metaheuristic algorithms represent two distinct paradigms in optimization. Deterministic methods, such as gradient descent, produce the same solution for a given input and follow a predictable path toward an optimum. In contrast, metaheuristic approaches—like genetic algorithms—incorporate randomness and do not guarantee the best possible solution. However, they are often more effective at avoiding local optima and exploring complex search spaces.
2.3 Heuristic Algorithms
Many of the heuristic algorithms are inspired by the nature, such as the genetic algorithm (GA) and particle swarm optimization (PSO). These algorithms are often used for complex optimization problems where traditional methods may struggle to find a solution. Some of the popular heuristic algorithms include:
- Genetic Algorithm (GA)
- Particle Swarm Optimization (PSO)
- Simulated Annealing (SA)
- Ant Colony Optimization (ACO)
2.4 Deterministic Algorithms
Numerical approximation, what you learned in the mathmeatical optimization course. Some of the algorithms include:
- Gradient Descent
- Newton’s Method
- Conjugate Gradient Method
- Quasi-Newton Methods (e.g., BFGS)
- Interior Point Methods
They often reply on the KKT conditions.
Examples are borrowed from the following sources:
- Peng, R.D. Advanced Statistical Computing.