26 Lecture 24, March 11, 2024
26.1 Sampling realizations of random variables
- In practice, we may often want to sample realizations of random variables, and use those realizations to conduct some estimation or inference (\(\Rightarrow\) Monte Carlo simulation)
- For instance, suppose we wish to estimate (approximate) the probability \(P(X>2)\) for \(X\sim Exp(1)\) by simulation
- Of course, we can compute \(P(X>2)\) by hand, but let’s pretend we can’t.
- In this case, we could sample independent realizations \(x_1,\dots,x_n\) from \(Exp(1)\) (numbers that look like realizations \(X(\omega_1),\dots,X(\omega_n)\)), and then estimate \[P(X>2)\approx \frac{|\{i\in\{1,\dots,n\}: x_i>2\}|}{n},\] that is, we estimate \(P(X>2)\) by the relative frequency of observations larger than 2 (a code sketch follows this list).
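Here is a minimal Python sketch of this estimator (the sample size \(n\), the seed, and the use of NumPy's exponential sampler are our choices for illustration, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(seed=42)      # fixed seed for reproducibility

n = 100_000                               # number of independent realizations
x = rng.exponential(scale=1.0, size=n)    # x_1, ..., x_n drawn from Exp(1)

# Relative frequency of observations larger than 2
estimate = np.mean(x > 2)

print(f"Monte Carlo estimate of P(X > 2): {estimate:.4f}")
print(f"Exact value exp(-2):              {np.exp(-2):.4f}")
```

With \(n = 100{,}000\) draws, the estimate typically lands within a few thousandths of the exact value \(e^{-2}\approx 0.1353\).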
26.2 Inversion method
26.2.1 With strictly increasing and continuous assumption
Lemma 26.1 If \(F\) is a continuous and strictly increasing cdf of some random variable \(X\) and if \(U\sim Unif(0,1)\), then the random variable \(Y=F^{-1}(U)\) has cdf \(F\).
Proof. Denote by \(F_Y\) the cdf of the random variable \(Y = F^{-1}(U)\). Since \(F\) is strictly increasing, applying \(F\) to both sides of an inequality preserves it, so \[ F_Y(y) = P(F^{-1}(U)\leq y) = P(F(F^{-1}(U))\leq F(y))=P(U\leq F(y)),\] where we used \(F(F^{-1}(u))=u\), which holds because \(F\) is continuous and strictly increasing. But for any \(u\in[0,1]\), we know that \(P(U\leq u)=u\), since \(U\sim U(0,1)\). Hence, \[ F_Y(y) = P(U\leq F(y))=F(y).\] Hence, the random variable \(Y = F^{-1}(U)\) has the cdf \(F\), as desired.
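To make the lemma concrete: for \(X\sim Exp(1)\) we have \(F(x)=1-e^{-x}\) for \(x\geq 0\), so \(F^{-1}(u)=-\ln(1-u)\). A minimal Python sketch of the resulting sampler (seed and sample size are our choices):

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def exp1_inverse_cdf(u):
    """Inverse of F(x) = 1 - exp(-x), the cdf of Exp(1)."""
    return -np.log(1.0 - u)

u = rng.uniform(0.0, 1.0, size=100_000)   # U ~ Unif(0, 1)
x = exp1_inverse_cdf(u)                   # X = F^{-1}(U) has cdf F

# Sanity check: Exp(1) has mean 1
print(f"sample mean: {x.mean():.4f} (should be close to 1)")
```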
26.2.2 More general case using the quantile
By using the more general definition of the quantile function \[F^{-1}(y) = \inf\{x\in\mathbb{R}: F(x)\geq y\}\] one can show the following generalization:
Theorem 26.1 Let \(F\) be any cumulative distribution function of some random variable \(X\) and \(U\sim U(0,1)\). Then the random variable \(F^{-1}(U)\) has cdf \(F\).
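For example, take \(X\sim Bernoulli(p)\), whose cdf \(F\) is a step function. The quantile function is still well defined: for \(u\in(0,1)\), \[F^{-1}(u) = \inf\{x\in\mathbb{R}: F(x)\geq u\} = \begin{cases} 0, & u \leq 1-p,\\ 1, & u > 1-p,\end{cases}\] so \(P(F^{-1}(U)=1)=P(U>1-p)=p\), and indeed \(F^{-1}(U)\sim Bernoulli(p)\).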
- No matter what the cdf \(F\) is (discrete or continuous), we can sample observations from it as follows:
- Sample \(U\sim U(0,1)\) (e.g., via a standard uniform pseudo-random number generator)
- Return \(X=F^{-1}(U)\).
- Repeating this \(n\) times independently gives \(n\) realizations from \(F\) (see the sketch below).
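A minimal Python sketch of this recipe for a discrete distribution (the support and probabilities below are made up purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(seed=1)

# A toy discrete distribution: P(X=1)=0.2, P(X=2)=0.5, P(X=3)=0.3
values = np.array([1, 2, 3])
probs = np.array([0.2, 0.5, 0.3])
cdf = np.cumsum(probs)                 # F evaluated at the support points

def quantile(u):
    """Generalized inverse F^{-1}(u) = inf{x : F(x) >= u}."""
    return values[np.searchsorted(cdf, u)]

u = rng.uniform(size=100_000)          # Step 1: U ~ Unif(0, 1)
x = quantile(u)                        # Step 2: X = F^{-1}(U)

# Relative frequencies should be close to (0.2, 0.5, 0.3)
print([round(np.mean(x == v), 3) for v in values])
```

Here `np.searchsorted` returns the smallest index \(i\) with \(u \leq F(x_i)\), which is exactly the generalized inverse above.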
26.3 Normal/Gaussian distribution
The Gaussian distribution is named after Carl Friedrich Gauss, and it is perhaps one of the most important distributions, if not the most important one.
Definition 26.1 \(X\) is said to have a normal distribution (or Gaussian distribution) with mean \(\mu\) and variance \(\sigma^2\) if the density of \(X\) is \[ f(x) = \frac{1}{\sqrt{2\pi \sigma^2}} e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \;\;\; x\in {\mathbb R}. \] We denote this by \(X\sim N(\mu,\sigma^2)\).
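As a quick numerical sanity check on this density (a sketch assuming SciPy is available; \(\mu = 1\) and \(\sigma^2 = 4\) are arbitrary choices):

```python
import numpy as np
from scipy.integrate import quad

def normal_pdf(x, mu=1.0, sigma2=4.0):
    """Density of N(mu, sigma2) from Definition 26.1."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma2)) / np.sqrt(2 * np.pi * sigma2)

total, _ = quad(normal_pdf, -np.inf, np.inf)                  # should integrate to 1
mean, _ = quad(lambda t: t * normal_pdf(t), -np.inf, np.inf)  # should equal mu

print(f"integral of f: {total:.6f} (should be 1)")
print(f"E(X):          {mean:.6f} (should be mu = 1)")
```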
26.3.1 Properties of Gaussian distribution
1. Symmetric about its mean: If \(X \sim N(\mu, \sigma^2)\), then for any \(t \geq 0\), \[ P( X \le \mu - t) = P(X \ge \mu + t). \]
2. Density is unimodal: the peak is at \(\mu\).
3. Mean and variance are the parameters: \[E(X)= \mu\] and \[Var(X)=\sigma^2.\]
\(N(\mu, \sigma^2)\) is sometimes (e.g., in STAT 231) parametrised using the standard deviation \(\sigma\) instead of the variance \(\sigma^2\), written \[ X \sim G(\mu, \sigma). \] That is, \(X\sim N(1, 4)\) and \(X\sim G(1, 2)\) mean the same thing.
Median = mean = mode, all equal to \(\mu\) (the first moment).
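As a numerical illustration of these properties (the sample size, seed, and the choice \(N(1,4)\), i.e. \(G(1,2)\), are ours):

```python
import numpy as np

rng = np.random.default_rng(seed=7)

mu, sigma = 1.0, 2.0                          # G(1, 2), i.e. N(1, 4)
x = rng.normal(loc=mu, scale=sigma, size=1_000_000)

print(f"sample mean:     {x.mean():.3f} (should be ~ {mu})")
print(f"sample variance: {x.var():.3f} (should be ~ {sigma**2})")

# Symmetry about the mean: P(X <= mu - t) vs P(X >= mu + t), here with t = 1
t = 1.0
print(f"P(X <= mu - t) ~ {np.mean(x <= mu - t):.4f}")
print(f"P(X >= mu + t) ~ {np.mean(x >= mu + t):.4f}")
```

Note that NumPy's `scale` parameter is the standard deviation \(\sigma\), matching the \(G(\mu,\sigma)\) parametrisation.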