21 Lecture 20, February 28, 2024
21.1 Variance of Poisson, Hypergeometric and Negative Binomial
Last time, we saw that if \(X \sim Bin(n,p)\), then \(\mathbb{V}ar(X) = np(1-p)\). We proved this using the definition of expectation together with a summation trick.
Similarly, one can show that
- If \(X\sim Poi(\lambda)\), then \[ \mathbb{V}ar(X) = \lambda. \]
- If \(Y \sim hyp(N,r,n)\), then \[ \mathbb{V}ar(Y) = n \frac{r}{N} \left(1-\frac{r}{N}\right)\left(\frac{N-n}{N-1}\right). \]
- If \(Z \sim NB(k,p)\), then \[ \mathbb{V}ar(Z) = \frac{k(1-p)}{p^2}. \]
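These formulas can be sanity-checked numerically by summing the pmf directly. A minimal sketch for the Poisson case, assuming \(\lambda = 3\) (the truncation point 100 is an arbitrary choice that makes the tail negligible):

```python
import math

lam = 3.0  # Poisson rate (an arbitrary choice for illustration)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poi(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Truncate the infinite sum; for lam = 3 the tail past k = 100 is negligible.
ks = range(101)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
second_moment = sum(k**2 * poisson_pmf(k, lam) for k in ks)
variance = second_moment - mean**2  # Var(X) = E[X^2] - E[X]^2

print(variance)  # ≈ lam = 3
```

The same pattern (truncated pmf sums for the first and second moments) verifies the hypergeometric and negative binomial formulas as well.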
21.2 Standard Deviation
Note that \(\mathbb{V}ar(X)\) is in squared units (e.g., if \(X\) is in \(meters\), then \(\mathbb{V}ar(X)\) is in \(meters^2\)). To recover the original unit, we take the square root of the variance.
Definition 21.1 (Standard Deviation) The standard deviation of a random variable \(X\) is denoted \(SD(X)\), and defined by \[ SD(X) = \sqrt{\mathbb{V}ar(X)}. \]
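As a small illustration of the definition, take \(X\sim Bin(10, 0.5)\) (parameters chosen arbitrarily), so \(\mathbb{V}ar(X) = np(1-p) = 2.5\):

```python
import math

# X ~ Bin(10, 0.5): if X were measured in meters, Var(X) would be in meters^2,
# and SD(X) brings us back to the original scale.
n, p = 10, 0.5
variance = n * p * (1 - p)   # np(1-p) = 2.5
sd = math.sqrt(variance)     # SD(X) = sqrt(Var(X))

print(sd)  # ≈ 1.5811
```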
21.3 Last note of the chapter
The expectation and the variance give a simple summary of the center and the variability of a distribution.
We call \(E[X]\) and \(E[X^2]\) the first and second moments of \(X\).
In general, \(E[X^k]\) is the \(k\)th moment of the distribution of \(X\), while \(E[(X-E(X))^k]\) is the \(k\)th central moment of the distribution of \(X\).
You’ll see other statistics later in STAT 231 and onwards, such as
Skewness (measures asymmetry) \[ E\left[\left( \frac{(X - E(X))}{\sqrt{\mathbb{V}ar(X)}} \right)^3\right]. \]
Kurtosis (measures heavy tailedness) \[ E\left[\left( \frac{(X - E(X))}{\sqrt{\mathbb{V}ar(X)}} \right)^4\right]. \]
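Both statistics can be computed from truncated pmf sums in the same way as the variance. A sketch for \(X\sim Poi(3)\) (an arbitrary choice; a Poisson has skewness \(1/\sqrt{\lambda}\) and kurtosis \(3 + 1/\lambda\)):

```python
import math

lam = 3.0

def poisson_pmf(k, lam):
    return math.exp(-lam) * lam**k / math.factorial(k)

# Truncated sums; the tail past k = 150 is negligible for lam = 3.
ks = range(151)
mean = sum(k * poisson_pmf(k, lam) for k in ks)
var = sum((k - mean)**2 * poisson_pmf(k, lam) for k in ks)
sd = math.sqrt(var)

# Third and fourth moments of the standardized variable (X - E[X]) / SD(X).
skew = sum(((k - mean) / sd)**3 * poisson_pmf(k, lam) for k in ks)
kurt = sum(((k - mean) / sd)**4 * poisson_pmf(k, lam) for k in ks)

print(skew)  # 1/sqrt(lam) ≈ 0.5774: the Poisson is right-skewed
print(kurt)  # 3 + 1/lam ≈ 3.3333
```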
21.4 Chapter 8 Continuous Random Variables
21.4.1 Continuous random variable
Let \(X\) be a random variable and \(F_X(x) = P(X\leq x) = P(\{\omega\in S : X(\omega)\leq x\})\) for \(x\in\mathbb{R}\) be its cumulative distribution function (cdf).
We say that the random variable \(X\) is
discrete if \(F_X\) is piecewise constant.
- The jumps of \(F\) are exactly the range of \(X\), \(X(S)\). For \(x\in X(S)\) (at the jumps of \(F\)),
- the probability function is \(f(x)=P(X=x)=\lim_{h\downarrow 0} F(x+h)-F(x)=\text{size of jump at $x$}\).
continuous if \(F_X\) is a continuous function.
absolutely continuous if \[ F_X(x) = \int_{-\infty}^x f(t) dt\]
In this course, when talking about continuous random variables, we mean absolutely continuous.
21.4.2 Probability density function
Definition 21.2 (Probability Density Function) We say that a continuous random variable \(X\) with distribution function \(F\) admits a probability density function (pdf) \(f(x)\) if
- \(f(x)\geq 0\) for all \(x\in\mathbb{R}\);
- \(\int_{-\infty}^\infty f(x)dx = 1\);
- \(F(x)=P(X\leq x)= \int_{-\infty}^x f(t)\;d t\).
In other words, \(F\) is an antiderivative of \(f\), or equivalently \(f\) is the derivative of \(F\): \[ f(x) = F'(x) = \frac{d}{dx} F(x).\]
Definition 21.3 (Support) The support of a r.v. \(X\) with density \(f\) is the set \[ supp(f) = \{x \in \mathbb{R}: f(x) \neq 0\}. \]
If \(X\) were a discrete random variable, the four probabilities \(P(a<X\leq b)\), \(P(a\leq X\leq b)\), \(P(a<X<b)\), and \(P(a\leq X<b)\) could all be different. If \(X\) is a continuous rv with probability density function (pdf) \(f_X(x)\), then
- \(P(X=x)=0\)
- \(F(x) = \int_{-\infty}^x f_X(t)dt\)
- \(P(a < X \leq b) = \int_a^b f_X(t)dt\)
We highlight: For a continuous random variable \(X\), \(f(x)\) is not \(P(X=x)\), which is always zero.
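These properties can be checked numerically. A sketch using \(X\sim Exp(1)\) (an arbitrary choice of continuous distribution, with \(f(x)=e^{-x}\) and \(F(x)=1-e^{-x}\) for \(x\geq 0\)); the grid size is an arbitrary accuracy choice:

```python
import math

# X ~ Exp(1): pdf f(x) = e^{-x}, cdf F(x) = 1 - e^{-x} for x >= 0.
f = lambda x: math.exp(-x)
F = lambda x: 1 - math.exp(-x)

a, b = 0.5, 2.0
# Midpoint-rule approximation of the integral of f over (a, b].
n = 100_000
h = (b - a) / n
integral = sum(f(a + (i + 0.5) * h) for i in range(n)) * h

print(integral)      # ≈ F(b) - F(a): integrating the pdf recovers probabilities
print(F(b) - F(a))
```

Note that \(f(0) = 1\) here, yet \(P(X=0)=0\): the density value at a point is not a probability.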
21.4.3 Equality does not matter in the continuous case
If \(X\) is a continuous random variable, then \[ P(a<X\leq b) = F(b)-F(a)\] \[ P(a\leq X \leq b) = P(a<X\leq b) +P(X=a)=[F(b)-F(a)]+0\] \[ P(a<X<b)=P(a<X\leq b) -P(X=b)=[F(b)-F(a)]-0\] \[ P(a\leq X<b) = P(a<X\leq b) +P(X=a)-P(X=b)=[F(b)-F(a)]\] so if \(X\) is continuous, all these probabilities coincide!
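The contrast with the discrete case can be made concrete. A sketch comparing \(X\sim Exp(1)\) with a hypothetical discrete \(Y\) uniform on \(\{1,2,3,4\}\) (both distributions chosen arbitrarily for illustration):

```python
import math

# Continuous case: X ~ Exp(1), cdf F(x) = 1 - e^{-x}.
# P(X = a) = P(X = b) = 0, so all four interval probabilities equal F(b) - F(a).
F = lambda x: 1 - math.exp(-x)
a, b = 0.5, 2.0
p_cont = F(b) - F(a)

# Discrete case: Y uniform on {1, 2, 3, 4}; the endpoints carry mass.
pmf = {1: 0.25, 2: 0.25, 3: 0.25, 4: 0.25}
a_d, b_d = 1, 3
p = {
    "a<Y<=b":  sum(v for k, v in pmf.items() if a_d <  k <= b_d),  # {2,3}
    "a<=Y<=b": sum(v for k, v in pmf.items() if a_d <= k <= b_d),  # {1,2,3}
    "a<Y<b":   sum(v for k, v in pmf.items() if a_d <  k <  b_d),  # {2}
    "a<=Y<b":  sum(v for k, v in pmf.items() if a_d <= k <  b_d),  # {1,2}
}
print(p)  # including or excluding an endpoint changes the discrete answer
```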