21 Lecture 20, February 28, 2024
21.1 Variance of Poisson, Hypergeometric and Negative Binomial
Last time, we saw that if \(X \sim Bin(n,p)\), then \(\mathbb{V}ar(X) = np(1-p)\). We proved this using the definition of expectation together with a summation trick.
Similarly, one can show that
- If \(X\sim Poi(\lambda)\), then \[ \mathbb{V}ar(X) = \lambda. \]
- If \(Y \sim hyp(N,r,n)\), then \[ \mathbb{V}ar(Y) = n \frac{r}{N} \left(1-\frac{r}{N}\right)\left(\frac{N-n}{N-1}\right). \]
- If \(Z \sim NB(k,p)\), then \[ \mathbb{V}ar(Z) = \frac{k(1-p)}{p^2}. \]
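These formulas can be sanity-checked numerically by summing the pmf directly. A minimal sketch for the Poisson case, assuming \(\lambda = 3\) (the truncation point 100 is an arbitrary choice that makes the tail negligible):

```python
import math

lam = 3.0  # Poisson rate (an arbitrary choice for illustration)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poi(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Truncate the infinite sum; for lam = 3 the tail past k = 100 is negligible.
ks = range(101)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
second_moment = sum(k**2 * poisson_pmf(k, lam) for k in ks)
variance = second_moment - mean**2  # Var(X) = E[X^2] - E[X]^2

print(variance)  # ≈ lam = 3
```

The same pattern (truncated pmf sums for the first and second moments) verifies the hypergeometric and negative binomial formulas as well.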
21.2 Standard Deviation
Note that \(\mathbb{V}ar(X)\) is in squared units (e.g., if \(X\) is in \(meters\), then \(\mathbb{V}ar(X)\) is in \(meters^2\)). To recover the original unit, we take the square root of the variance.
Definition 21.1 (Standard Deviation) The standard deviation of a random variable \(X\) is denoted \(SD(X)\), and defined by \[ SD(X) = \sqrt{\mathbb{V}ar(X)}. \]
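As a small illustration of the definition, take \(X\sim Bin(10, 0.5)\) (parameters chosen arbitrarily), so \(\mathbb{V}ar(X) = np(1-p) = 2.5\):

```python
import math

# X ~ Bin(10, 0.5): if X were measured in meters, Var(X) would be in meters^2,
# and SD(X) brings us back to the original scale.
n, p = 10, 0.5
variance = n * p * (1 - p)   # np(1-p) = 2.5
sd = math.sqrt(variance)     # SD(X) = sqrt(Var(X))

print(sd)  # ≈ 1.5811
```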
21.3 Last note of the chapter
The expectation and the variance give a simple summary of the center and the variability of a distribution.
We call \(E[X]\) and \(E[X^2]\) the first and second moments of \(X\).
In general, \(E[X^k]\) is the \(k\)th moment of the distribution of \(X\), while \(E[(X-E(X))^k]\) is the \(k\)th central moment of the distribution of \(X\).
You’ll see other statistics later in STAT 231 and onwards, such as
Skewness (measures asymmetry) \[ E\left[\left( \frac{(X - E(X))}{\sqrt{\mathbb{V}ar(X)}} \right)^3\right]. \]
Kurtosis (measures heavy tailedness) \[ E\left[\left( \frac{(X - E(X))}{\sqrt{\mathbb{V}ar(X)}} \right)^4\right]. \]
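Both statistics can be computed from truncated pmf sums in the same way as the variance. A sketch for \(X\sim Poi(3)\) (an arbitrary choice; a Poisson has skewness \(1/\sqrt{\lambda}\) and kurtosis \(3 + 1/\lambda\)):

```python
import math

lam = 3.0

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Truncated sums; the tail past k = 150 is negligible for lam = 3.
ks = range(151)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean)**2 * poisson_pmf(k, lam) for k in ks)
sd = math.sqrt(var)

# Third and fourth moments of the standardized variable (X - E[X]) / SD(X).
skew = sum(((k - mean) / sd)**3 * poisson_pmf(k, lam) for k in ks)
kurt = sum(((k - mean) / sd)**4 * poisson_pmf(k, lam) for k in ks)

print(skew)  # 1/sqrt(lam) ≈ 0.5774: the Poisson is right-skewed
print(kurt)  # 3 + 1/lam ≈ 3.3333
```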
21.4 Chapter 8 Continuous Random Variables
21.4.1 Continuous random variable
Let \(X\) be a random variable and \(F_X(x) = P(X\leq x) = P(\{\omega\in S : X(\omega)\leq x\})\) for \(x\in\mathbb{R}\) be its cumulative distribution function (cdf).
We say that the random variable \(X\) is
discrete if \(F_X\) is piecewise constant.
- The jumps of \(F\) are exactly the range of \(X\), \(X(S)\). For \(x\in X(S)\) (at the jumps of \(F\)),
- the probability function is \(f(x)=P(X=x)=\lim_{h\downarrow 0} F(x+h)-F(x)=\text{size of jump at $x$}\).
continuous if \(F_X\) is a continuous function.
absolutely continuous if \[ F_X(x) = \int_{-\infty}^x f(t) dt\]
In this course, when talking about continuous random variables, we mean absolutely continuous.
21.4.2 Probability density function
Definition 21.2 (Probability Density Function) We say that a continuous random variable \(X\) with distribution function \(F\) admits a probability density function (pdf) \(f(x)\) if
- \(f(x)\geq 0\) for all \(x\in\mathbb{R}\);
- \(\int_{-\infty}^\infty f(x)dx = 1\);
- \(F(x)=P(X\leq x)= \int_{-\infty}^x f(t)\;d t\).
In other words, \(F\) is an antiderivative of \(f\), or equivalently \(f\) is the derivative of \(F\): \[ f(x) = F'(x) = \frac{d}{dx} F(x).\]
Definition 21.3 (Support) The support of a r.v. \(X\) with density \(f\) is the set \[ supp(f) = \{x \in \mathbb{R}: f(x) \neq 0\}. \]
If \(X\) were a discrete random variable, the four probabilities \(P(a<X\leq b)\), \(P(a\leq X\leq b)\), \(P(a<X<b)\), and \(P(a\leq X<b)\) could all be different. If \(X\) is a continuous rv with probability density function (pdf) \(f_X(x)\), then
- \(P(X=x)=0\)
- \(F(x) = \int_{-\infty}^x f_X(t)dt\)
- \(P(a < X \leq b) = \int_a^b f_X(t)dt\)
We highlight: For a continuous random variable \(X\), \(f(x)\) is not \(P(X=x)\), which is always zero.
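These properties can be checked numerically. A sketch using \(X\sim Exp(1)\) (an arbitrary choice of continuous distribution, with \(f(x)=e^{-x}\) and \(F(x)=1-e^{-x}\) for \(x\geq 0\)); the grid size is an arbitrary accuracy choice:

```python
import math

# X ~ Exp(1): pdf f(x) = e^{-x}, cdf F(x) = 1 - e^{-x} for x >= 0.
f = lambda x: math.exp(-x)
F = lambda x: 1 - math.exp(-x)

a, b = 0.5, 2.0
# Midpoint-rule approximation of the integral of f over (a, b].
n = 100_000
h = (b - a) / n
integral = sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(integral)      # ≈ F(b) - F(a): integrating the pdf recovers probabilities
print(F(b) - F(a))
```

Note that \(f(0) = 1\) here, yet \(P(X=0)=0\): the density value at a point is not a probability.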
21.4.3 Equality does not matter in the continuous case
If \(X\) is a continuous random variable, then \[ P(a<X\leq b) = F(b)-F(a)\] \[ P(a\leq X \leq b) = P(a<X\leq b) +P(X=a)=[F(b)-F(a)]+0\] \[ P(a<X<b)=P(a<X\leq b) -P(X=b)=[F(b)-F(a)]-0\] \[ P(a\leq X<b) = P(a<X\leq b) +P(X=a)-P(X=b)=[F(b)-F(a)]\] so if \(X\) is continuous, all these probabilities coincide!
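The contrast with the discrete case can be made concrete. A sketch comparing \(X\sim Exp(1)\) with a hypothetical discrete \(Y\) uniform on \(\{1,2,3,4\}\) (both distributions chosen arbitrarily for illustration):

```python
import math

# Continuous case: X ~ Exp(1), cdf F(x) = 1 - e^{-x}.
# P(X = a) = P(X = b) = 0, so all four interval probabilities equal F(b) - F(a).
F = lambda x: 1 - math.exp(-x)
a, b = 0.5, 2.0
p_cont = F(b) - F(a)

# Discrete case: Y uniform on {1, 2, 3, 4}; the endpoints carry mass.
pmf = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
a_d, b_d = 1, 3
p = {
    "a<Y<=b":  sum(v for k, v in pmf.items() if a_d <  k <= b_d),  # {2,3}
    "a<=Y<=b": sum(v for k, v in pmf.items() if a_d <= k <= b_d),  # {1,2,3}
    "a<Y<b":   sum(v for k, v in pmf.items() if a_d <  k <  b_d),  # {2}
    "a<=Y<b":  sum(v for k, v in pmf.items() if a_d <= k <  b_d),  # {1,2}
}
print(p)  # including or excluding an endpoint changes the discrete answer
```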