
MTL106 Probability Review Notes

Viraj Agashe
December 2021

Contents

1 Probability Basics
 1.1 Important Terminologies
 1.2 Axiomatic Definition
 1.3 Properties of Probability
 1.4 Classical Definition
 1.5 More Concepts

2 Random Variable
 2.1 Distribution Function
  2.1.1 Cumulative Distribution Function
 2.2 Types of Random Variables
  2.2.1 Discrete RV
  2.2.2 Continuous Type RV
  2.2.3 Mixed Type RV
 2.3 Probability Distribution Relations
 2.4 Function of Random Variable
  2.4.1 Distribution of Y

3 Mean and Variance
 3.1 Mean/Expectation
 3.2 Variance
 3.3 Higher Order Moments
  3.3.1 n-th Order Moment about Mean
  3.3.2 n-th Order Moment about Origin
  3.3.3 Relation between Moments about Mean and Origin
 3.4 Moment Inequalities
  3.4.1 Markov's Inequality
  3.4.2 Chebyshev's Inequality
  3.4.3 Jensen's Inequality

4 Generating Functions
 4.1 Probability Generating Function
 4.2 Moment Generating Function
 4.3 Characteristic Function

5 Random Vectors
 5.1 Joint Distribution
 5.2 Joint PDF

6 Independent Random Variables
 6.1 Functions of Independent RVs
 6.2 i.i.d Random Variables

7 Conditional Distributions
 7.1 Discrete RVs
 7.2 Continuous RVs

8 Functions of Random Variables
 8.1 Distribution of Function of RV
 8.2 Important Results

9 Covariance and Correlation
 9.1 Variance Formula
 9.2 Correlation Coefficient

10 Limiting Distributions
 10.1 Convergence in Distribution
 10.2 Convergence in Probability
 10.3 Convergence in rth Moment
 10.4 Convergence Almost Surely

11 Laws of Large Numbers
 11.1 Weak Law of Large Numbers
 11.2 Strong Law of Large Numbers

12 Central Limit Theorem

13 Common Discrete Distributions
 13.1 Bernoulli Distribution
 13.2 Binomial Distribution
 13.3 Geometric Distribution
 13.4 Negative Binomial Distribution
 13.5 Poisson Distribution

14 Common Continuous Distributions
 14.1 Uniform Distribution
 14.2 Exponential Distribution
 14.3 Gamma Distribution
 14.4 Normal Distribution

1 Probability Basics
1.1 Important Terminologies
1. Sample Space (Ω): The set of all possible outcomes of a random experiment.
2. σ-field: A collection F of subsets of Ω which satisfies
(i) Ω ∈ F
(ii) If A ∈ F then Ā ∈ F
(iii) If A1, A2, ... ∈ F then ⋃_{i=1}^∞ Ai ∈ F (closed under countable unions)

3. Event: Any element of F
4. Sample point: Any element of Ω
5. Borel σ-field (B) on R: The collection of Borel sets on R. Borel sets are those sets which can be
formed from countable unions, countable intersections and relative complements of open intervals.

1.2 Axiomatic Definition


A real valued set function P defined on a σ-field F over Ω which satisfies:
1. P(A) ≥ 0 ∀ A ∈ F
2. P(Ω) = 1
3. If A1, A2, ... are mutually disjoint events in F then

P(⋃_{i=1}^∞ Ai) = Σ_{i=1}^∞ P(Ai)

Note: The triplet (Ω, F, P) is called a Probability Space.

1.3 Properties of Probability


1. P(∅) = 0
2. P (Ac ) = 1 − P (A)
3. For any A, B ∈ F, we have

P (A ∪ B) = P (A) + P (B) − P (A ∩ B)

4. If A, B ∈ F and A ⊆ B then P (A) ≤ P (B) and P (B\A) = P (B) − P (A)


5. Let {An} be an increasing (decreasing) sequence of events in F, i.e. An ⊆ An+1 (An+1 ⊆ An).
Then,

P(lim_{n→∞} An) = lim_{n→∞} P(An)

1.4 Classical Definition


A probability space (Ω, F, P) with finite Ω, F = 2^Ω and P({ω}) = 1/|Ω| ∀ ω ∈ Ω is called a Laplace
probability space. The probability of any event is given by,

P(A) = |A|/|Ω|

1.5 More Concepts


1. Conditional Probability: The probability of the event B under the condition A is defined as

P(B/A) = P(A ∩ B)/P(A)

2. Independent Events: Two events are independent iff

P (A ∩ B) = P (A)P (B)

(i) Pairwise Independent: A sequence of events {Ai} is pairwise independent if P(Ai ∩ Aj) =
P(Ai)P(Aj) ∀ i ≠ j.
(ii) Mutually Independent: A sequence of events {Ai} is mutually independent if for every finite
subcollection Ai1, Ai2, ..., Aik we have P(Ai1 ∩ Ai2 ∩ ... ∩ Aik) = P(Ai1)P(Ai2)...P(Aik).
3. Total Probability Theorem: For mutually disjoint and exhaustive events A1, A2, ..., An
we have, for any event B,

P(B) = Σ_{i=1}^n P(B/Ai)P(Ai)

4. Bayes' Theorem: For any event B ∈ F with P(B) > 0 and mutually disjoint, exhaustive
events A1, A2, ..., An we have,

P(Ai/B) = P(Ai)P(B/Ai) / Σ_{j=1}^n P(B/Aj)P(Aj)
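As a quick numerical illustration (a sketch, not part of the original notes; the partition, priors and defect rates below are all hypothetical numbers), total probability and Bayes' theorem in Python:

    # Hypothetical partition A_1, A_2, A_3: which machine produced an item
    prior = [0.5, 0.3, 0.2]           # P(A_i)
    # Hypothetical defect rates P(B / A_i), where B = "item is defective"
    defect = [0.01, 0.02, 0.05]

    # Total probability: P(B) = sum_i P(B / A_i) P(A_i)
    p_b = sum(d * p for d, p in zip(defect, prior))
    print(p_b)                        # 0.021

    # Bayes: P(A_i / B) = P(A_i) P(B / A_i) / P(B)
    posterior = [d * p / p_b for d, p in zip(defect, prior)]
    print(posterior)                  # [0.238..., 0.285..., 0.476...]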

2 Random Variable
Let (Ω, F, P) be a probability space. A real valued function X : Ω → R is said to be a random
variable iff

X⁻¹{(−∞, x]} ∈ F ∀ x ∈ R.

Figure 1: Collect the set of outcomes ω ∈ Ω which under the mapping X gives values from (−∞, x].
If this lies in F then X is a random variable.

2.1 Distribution Function


A real valued function F is a distribution function if it satisfies:
1. 0 ≤ F(x) ≤ 1 ∀ x ∈ R
2. F is a monotonically increasing function.
3. lim_{x→−∞} F(x) = 0 and lim_{x→∞} F(x) = 1
4. F(x) is right continuous.

2.1.1 Cumulative Distribution Function


We define the distribution function FX for a probability space (Ω, F, P) and random variable X as,
FX(x) = P(X ≤ x), x ∈ R
Alternately,
FX(x) = P({ω ∈ Ω | X(ω) ≤ x})

2.2 Types of Random Variables
2.2.1 Discrete RV
If the CDF of the Random Variable (RV) has a countable number of jump discontinuities and is
constant between them, then it is a discrete RV. For a discrete RV, the CDF is given by:

FX(x) = Σ_{xi ≤ x} P(X = xi)

The probability mass function (PMF) of a discrete RV is defined as P(x) = P(X = xi) when x = xi
for some i, and 0 at other points. Properties:
1. P(x) ≥ 0 ∀ x ∈ R
2. Σ_x P(x) = 1

2.2.2 Continuous Type RV


If the CDF of the RV is continuous in x then it is of continuous type. We represent the CDF as,

F(x) = ∫_{−∞}^x f(t) dt

The function f is called the probability density function (PDF) of the continuous type RV. Note that
f(x) = F′(x) wherever the derivative exists. It satisfies:
1. f(t) ≥ 0 ∀ t ∈ R
2. ∫_{−∞}^∞ f(t) dt = 1

2.2.3 Mixed Type RV


A RV whose CDF is continuous in some intervals but also has a countable number of jump discontinuities is of mixed type.

2.3 Probability Distribution Relations


1. P (a < X ≤ b) = F (b) − F (a)
2. P (a ≤ X ≤ b) = F (b) − F (a) + P (X = a)
3. P (a < X < b) = F (b) − F (a) − P (X = b)

4. For a continuous type RV,

P(a ≤ X ≤ b) = ∫_a^b f(t) dt

5. For a small interval (x, x + ∆x),

P (x ≤ X ≤ x + ∆x) ≈ f (x)∆x
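These relations are easy to check numerically. Below is a small sketch (using scipy.stats; the exponential distribution and the interval endpoints are arbitrary choices for illustration):

    from scipy import stats

    X = stats.expon(scale=1.0)          # continuous RV with rate lambda = 1
    a, b = 0.5, 2.0

    # For a continuous RV, P(a < X <= b) = F(b) - F(a)
    print(X.cdf(b) - X.cdf(a))          # 0.4712...

    # For a small interval, P(x <= X <= x + dx) ~ f(x) dx
    x, dx = 1.0, 1e-4
    print(X.cdf(x + dx) - X.cdf(x))     # ~3.679e-05
    print(X.pdf(x) * dx)                # 3.679e-05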

2.4 Function of Random Variable


A function Y defined as, Y = g(X) is a random variable if g is a Borel measurable function. (Note
that every continuous and piecewise continuous function is Borel measurable.)

2.4.1 Distribution of Y
We can find the distribution of the random variable Y as follows:

FY (y) = P (Y ≤ y) = P (g(X) ≤ y)

From here, we may determine the distribution of Y . Note that if g is strictly monotonic and differentiable then the following holds:

fY(y) = fX(g⁻¹(y)) |d g⁻¹(y)/dy|

for y in the range of g, and fY(y) = 0 otherwise.
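As a sanity check of this formula (a sketch, not from the notes): take X ∼ N(0, 1) and Y = e^X, so g⁻¹(y) = ln y and |d g⁻¹(y)/dy| = 1/y; the resulting density is exactly the lognormal PDF:

    import numpy as np
    from scipy import stats

    y = np.linspace(0.1, 5.0, 50)

    # f_Y(y) = f_X(g^{-1}(y)) |d g^{-1}(y)/dy| with g(x) = e^x
    f_y = stats.norm.pdf(np.log(y)) / y

    # scipy's lognorm with s = 1 is the law of e^X for X ~ N(0, 1)
    print(np.allclose(f_y, stats.lognorm.pdf(y, s=1)))   # True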

3 Mean and Variance


3.1 Mean/Expectation
Let (Ω, F, P) be a probability space. If X is discrete type and Σ_i |xi| P(X = xi) is finite, then X
has an expected value given by,

E(X) = Σ_i xi P(X = xi)

If X is continuous type with PDF f and ∫_{−∞}^∞ |x| f(x) dx is finite, then the expected value is given by:

E(X) = ∫_{−∞}^∞ x f(x) dx

Properties:
1. E(c) = c, where c is a constant.
2. E(aX + b) = aE(X) + b
3. If P (X ≥ 0) = 1, E(X) ≥ 0 if it exists.
4. If X is continuous type with CDF F, then E(X), if it exists, is given by,

E(X) = ∫_0^∞ (1 − F(x)) dx − ∫_{−∞}^0 F(x) dx

5. If X is a non-negative integer-valued discrete RV, then,

E(X) = Σ_{k=0}^∞ (1 − FX(k))
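A quick numerical check of the tail-sum identity in property 5 (a sketch; the Poisson distribution and the truncation point are arbitrary choices):

    from scipy import stats

    X = stats.poisson(3.0)    # non-negative integer-valued RV, E(X) = 3

    # E(X) = sum_{k>=0} (1 - F_X(k)) = sum_{k>=0} P(X > k); truncate the sum
    tail_sum = sum(X.sf(k) for k in range(200))
    print(tail_sum)           # 3.0 (up to numerical error)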

3.2 Variance
Let X be a random variable for which E(X) = µ exists. The second order moment about the mean
is called the variance, defined as,

Var(X) = E((X − µ)²)

On simplifying we also get,

Var(X) = E(X²) − (E(X))²
Properties:
1. V ar(c) = 0 where c is a constant.
2. If V ar(X) exists then V ar(X) ≥ 0.
3. If P (X = α) = 1 then E(X) = α and V ar(X) = 0.
4. V ar(aX + b) = a2 V ar(X)

3.3 Higher Order Moments
3.3.1 n-th Order Moment about Mean
For a RV X for which the n-th order moment about the mean exists, we denote it by

µn = E((X − µ)^n), n = 1, 2, 3, ...

3.3.2 n-th Order Moment about Origin


For a RV X for which the n-th order moment about the origin exists, we denote it by

µ′n = E(X^n), n = 1, 2, 3, ...

Note that if µ′n exists then µ′r also exists for all r < n.

3.3.3 Relation between Moments about Mean and Origin


If X is an RV for which n-th order moments exist, then

µn = Σ_{k=0}^n C(n, k) µ′k (−µ′1)^{n−k}

where C(n, k) denotes the binomial coefficient.

3.4 Moment Inequalities


3.4.1 Markov’s Inequality
Let X be a non-negative RV whose expectation E(X) exists. For all t > 0 we have,

P(X ≥ t) ≤ E(X)/t

3.4.2 Chebyshev’s Inequality


Let X be a RV for which E(X) = µ and Var(X) = σ² exist. Then for any ε > 0 and any constant c we have,

P(|X − c| ≥ ε) ≤ E((X − c)²)/ε²

In particular, when c = µ,

P(|X − µ| ≥ ε) ≤ σ²/ε²

3.4.3 Jensen’s Inequality


Let X be a RV whose expectation E(X) exists, and let g be any convex function. Then we must have,

E(g(X)) ≥ g(E(X))
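A small Monte Carlo sketch comparing all three inequalities with empirical values (illustrative only; the exponential samples and thresholds are arbitrary choices):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.exponential(scale=1.0, size=100_000)   # E(X) = 1, Var(X) = 1

    # Markov: P(X >= t) <= E(X)/t
    t = 3.0
    print((x >= t).mean(), 1.0 / t)                # ~0.050 vs 0.333

    # Chebyshev: P(|X - mu| >= eps) <= sigma^2 / eps^2
    eps = 2.0
    print((np.abs(x - 1.0) >= eps).mean(), 1.0 / eps**2)   # ~0.050 vs 0.25

    # Jensen with convex g(x) = x^2: E(g(X)) >= g(E(X))
    print((x**2).mean(), x.mean()**2)              # ~2.0 >= ~1.0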

4 Generating Functions
4.1 Probability Generating Function
Let X be a non-negative integer valued RV with pk = P(X = k). Then we can define the PGF as,

GX(s) = Σ_{k=0}^∞ pk s^k

Remarks:
1. The series converges at least for |s| ≤ 1.
2. GX(s) = E(s^X)
3. GX^(r)(s) denotes the r-th derivative of GX. Evaluating it at s = 1 gives the r-th order
factorial moment, i.e.

GX^(r)(1) = E(X(X − 1)...(X − r + 1))

4. We can find the probabilities from the k-th derivative of the PGF at s = 0, i.e.

P(X = k) = (1/k!) GX^(k)(0)

5. If X and Y have the same PGF for all s then X and Y have the same distribution.
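A short symbolic sketch of remarks 3 and 4, using the Poisson PGF G(s) = e^{λ(s−1)} (the rate λ = 2 is an arbitrary choice for illustration):

    import sympy as sp

    s = sp.symbols('s')
    lam = 2                            # arbitrary Poisson rate

    G = sp.exp(lam * (s - 1))          # PGF of Poisson(lam)

    # Remark 4: P(X = k) = G^{(k)}(0) / k!
    k = 3
    print(sp.simplify(sp.diff(G, s, k).subs(s, 0) / sp.factorial(k)))
    # 4*exp(-2)/3, which equals e^{-2} 2^3 / 3!

    # Remark 3: G'(1) = E(X) = lam
    print(sp.diff(G, s).subs(s, 1))    # 2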

4.2 Moment Generating Function


Let X be a RV such that E(e^{tX}) exists for t in some interval containing zero. Then,

MX(t) = E(e^{tX})

If X is a discrete integer-valued RV then we have,

MX(t) = Σ_k P(X = k) e^{kt}

Remarks:
1. One can get the n-th order moments about the origin from the n-th derivative of the MGF
at the origin, i.e.

E(X^n) = MX^(n)(0)

2. If X and Y have the same MGF for all t then X and Y have the same distribution.
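For instance (a symbolic sketch; the exponential MGF M(t) = 1/(1 − t/λ) is stated again in section 14.2), the first two moments fall out of derivatives at the origin:

    import sympy as sp

    t, lam = sp.symbols('t lam', positive=True)

    M = 1 / (1 - t / lam)               # MGF of exp(lam), valid for t < lam

    # E(X^n) = M^{(n)}(0)
    print(sp.diff(M, t).subs(t, 0))     # 1/lam       (the mean)
    print(sp.diff(M, t, 2).subs(t, 0))  # 2/lam**2    (the second moment)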

4.3 Characteristic Function


Let X be a random variable. The characteristic function is defined as

ψX(t) = E(e^{itX})

The characteristic function is complex valued and always exists. Remarks:
1. ψX(0) = 1
2. |ψX(t)| ≤ 1 ∀ t
3. If E(X^n) exists then

E(X^n) = (1/i^n) ψX^(n)(0)

4. If X and Y have the same characteristic function for all t then X and Y have the same distribution.

5 Random Vectors
A collection of n random variables (X1 , X2 , ...Xn ) over a probability space (Ω, F, P ) is called a random
vector.

5.1 Joint Distribution


For a 2-D random vector, the joint CDF is given by:

F (x1 , x2 ) = P (X1 ≤ x1 , X2 ≤ x2 ) , x1 , x2 ∈ R

It satisfies:
1. F(x1, x2) is non-decreasing and continuous from the right w.r.t. each of the coordinates x1, x2.
2. When x1 → ∞ and x2 → ∞, F(x1, x2) → 1.
3. When x1 → −∞ or x2 → −∞, F(x1, x2) → 0.
4. For every (a, c), (b, d) s.t. a < b and c < d we have,

F(b, d) − F(b, c) − F(a, d) + F(a, c) ≥ 0

The distribution of one of the random variables constituting a random vector is called a marginal
distribution. From a joint CDF F(x1, x2), we can obtain the marginal distribution of X1 as,

FX1(x1) = lim_{x2→∞} F(x1, x2)

For discrete type random variables, the joint PMF is similarly defined and satisfies,
1. 0 ≤ P(x1, x2) ≤ 1
2. Σ_{x1} Σ_{x2} P(x1, x2) = 1

By summing the PMF over one of the coordinates, we get the marginal PMF of the other random
variable, i.e.

PX1(x1) = Σ_{x2} P(x1, x2)
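A tiny numeric sketch of marginalization (the joint PMF values below are made up):

    import numpy as np

    # Hypothetical joint PMF P(X1 = i, X2 = j); rows index X1, columns X2
    joint = np.array([[0.10, 0.20, 0.10],
                      [0.25, 0.15, 0.20]])
    assert np.isclose(joint.sum(), 1.0)

    print(joint.sum(axis=1))   # marginal of X1: [0.4 0.6]
    print(joint.sum(axis=0))   # marginal of X2: [0.35 0.35 0.3]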

5.2 Joint PDF


Let X, Y be continuous type RVs with joint distribution F. Then the joint PDF is defined as,

f(x, y) = ∂²F/∂x∂y = ∂²F/∂y∂x

Or, in another form,

F(x, y) = ∫_{−∞}^y ∫_{−∞}^x f(t, s) dt ds

We can find the marginal PDF from the joint PDF by integrating out one of the coordinates:

fX(x) = ∫_{−∞}^∞ f(x, s) ds

Properties of joint PDF:
1. f(x, y) ≥ 0 ∀ x, y
2. ∫_{−∞}^∞ ∫_{−∞}^∞ f(t, s) dt ds = 1

6 Independent Random Variables
We say that X and Y are independent random variables if and only if

F(x, y) = FX(x) FY(y) ∀ (x, y) ∈ R²

For discrete type random variables, a necessary and sufficient condition is,

P(X = xi, Y = yj) = P(X = xi) P(Y = yj) ∀ i, j

For continuous type random variables, the equivalent condition in terms of PDFs is:

f(x, y) = fX(x) fY(y)

Note that for independent random variables, E(XY) = E(X)E(Y) whenever the expectations exist.
The converse is not necessarily true.
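A standard counterexample to the converse, checked by simulation (a sketch): X ∼ N(0, 1) and Y = X² are clearly dependent, yet E(XY) = E(X³) = 0 = E(X)E(Y):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.standard_normal(1_000_000)
    y = x**2                       # a function of x, hence dependent on x

    print((x * y).mean())          # ~0, this is E(X^3)
    print(x.mean() * y.mean())     # ~0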

6.1 Functions of Independent RVs


Let X, Y be independent random variables. Let g, h be Borel-measurable functions (continuous func-
tions are Borel-measurable). Then, g(X) and h(Y ) are also independent random variables.

6.2 i.i.d Random Variables


We say {Xn} is a sequence of independent, identically distributed (i.i.d.) RVs if every finite
subcollection of the Xi is mutually independent and the distributions of X1, X2, ... are all the same.

7 Conditional Distributions
7.1 Discrete RVs
For two random variables X, Y of discrete type, the conditional PMF of X given Y = y is given by:

PX/Y(x/y) = P(X = x/Y = y) = P(X = x, Y = y)/P(Y = y)

as long as P(Y = y) > 0.

7.2 Continuous RVs


The conditional PDF of X given Y = y is given by,

fX/Y(x/y) = f(x, y)/fY(y)

for all y with fY(y) > 0.

8 Functions of Random Variables


For any collection of random variables X1, X2, ..., Xn and a Borel-measurable function g,
Y = g(X1, X2, ..., Xn) is a random variable.

8.1 Distribution of Function of RV
Consider any 2D continuous type RV with joint PDF f (x, y). Define Z = H1 (X, Y ) and W =
H2 (X, Y ). Assuming that H1 , H2 are Borel-measurable, we can solve for the distribution of Z, W
under the assumptions:
• It is possible to solve z = H1 (x, y) and w = H2 (x, y) uniquely for x, y in terms of z, w. Let the
solution be x = g1 (z, w) and y = g2 (z, w).
• The partial derivatives of x, y wrt z, w exist and are continuous.
Then the joint PDF of Z, W can be written as,

fZ,W(z, w) = fX,Y(g1(z, w), g2(z, w)) |J(z, w)|

where J(z, w) is the Jacobian determinant of the transformation,

J(z, w) = det | ∂x/∂z  ∂x/∂w |
              | ∂y/∂z  ∂y/∂w |

8.2 Important Results


For some common types of functions of RVs, the distributions can be found as follows:
1. If Z = X + Y then

fZ(z) = ∫_{−∞}^∞ f(x, z − x) dx

2. If U = X − Y then

fU(u) = ∫_{−∞}^∞ f(u + y, y) dy

3. If V = XY then

fV(v) = ∫_{−∞}^∞ f(x, v/x) (1/|x|) dx

4. If W = X/Y then

fW(w) = ∫_{−∞}^∞ f(yw, y) |y| dy
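As a simulation sketch of result 1 (the two independent Exp(1) summands are an arbitrary choice), the convolution integral gives f_Z(z) = z e^{−z}, i.e. the Gamma(2, 1) density, which agrees with section 14.3:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 500_000
    z = rng.exponential(1.0, n) + rng.exponential(1.0, n)

    # Compare an empirical interval frequency with the Gamma(2, 1) CDF
    G = stats.gamma(a=2)
    print(((1.0 < z) & (z <= 3.0)).mean())   # ~0.537
    print(G.cdf(3.0) - G.cdf(1.0))           # 0.5366...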

9 Covariance and Correlation


The covariance of two random variables is defined as

cov(X, Y) = E((X − E(X))(Y − E(Y))) = E(XY) − E(X)E(Y)

Properties of covariance:
1. cov(aX, Y) = a cov(X, Y)
2. cov(X + Y, Z) = cov(X, Z) + cov(Y, Z)
3. cov(Σ_i Xi, Y) = Σ_i cov(Xi, Y)
4. If X, Y are independent, cov(X, Y) = 0.

9.1 Variance Formula
The variance of a sum of random variables can be expressed as:

var(Σ_i Xi) = Σ_i var(Xi) + Σ_i Σ_{j≠i} cov(Xi, Xj)

9.2 Correlation Coefficient


The correlation coefficient ρ(X, Y) is defined as,

ρ(X, Y) = cov(X, Y)/(σX σY)

Here, σX = √(var(X)). Note that |ρ(X, Y)| ≤ 1.

10 Limiting Distributions
10.1 Convergence in Distribution
Let X1, X2, ... be a sequence of random variables with CDFs F1, F2, ... respectively. We say that
{Xn} converges in distribution to X, written Xn →d X, if

lim_{n→∞} Fn(x) = F(x)

at every point x at which F is continuous, where F is the CDF of the random variable X.

10.2 Convergence in Probability


Let X1, X2, ... be a sequence of RVs. We say that Xn →p X if for any ε > 0 we have:

lim_{n→∞} P(|Xn − X| > ε) = 0

Note that here we need the RV X, or at least the distribution of X, beforehand.

10.3 Convergence in rth Moment


Let r > 0 and {Xn} be a sequence of random variables defined on a probability space such that
E(|Xn|^r) exists and is finite ∀ n. We say that {Xn} converges to X in r-th moment if

lim_{n→∞} E(|Xn − X|^r) = 0

10.4 Convergence Almost Surely


Let {Xn} be a sequence of random variables. We say that Xn →a.s. X if

P(lim_{n→∞} Xn = X) = 1

In other words,

P({ω ∈ Ω | Xn(ω) → X(ω)}) = 1

11 Laws of Large Numbers
11.1 Weak Law of Large Numbers
Let X1, X2, ..., Xn be i.i.d. random variables with mean µ and variance σ². Then for any ε > 0 we have,

P(|(X1 + X2 + ... + Xn)/n − µ| > ε) ≤ σ²/(nε²)

The proof follows by Chebyshev's inequality.
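A quick simulation sketch of the law (the U(0, 1) samples are an arbitrary choice):

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(0, 1, 100_000)   # i.i.d. with mu = 0.5, sigma^2 = 1/12

    # Sample means for increasing n concentrate around mu = 0.5
    for n in (10, 1_000, 100_000):
        print(n, x[:n].mean())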

11.2 Strong Law of Large Numbers


Let X1, X2, ..., Xn be i.i.d. random variables with mean µ and variance σ². Then the random
variable X̄ defined as

X̄ = (X1 + X2 + ... + Xn)/n

satisfies X̄ →a.s. µ, i.e. it converges almost surely to the mean.

12 Central Limit Theorem


Let X1, X2, ... be a sequence of i.i.d. random variables with mean µ and variance σ² defined on a
probability space. Define

Zn = (Σ Xi − E(Σ Xi))/√(var(Σ Xi)) = (Σ Xi − nµ)/(σ√n)

Then for large n, Zn is approximately standard normal distributed, i.e.

P(Zn ≤ z) ≈ ∫_{−∞}^z (1/√(2π)) e^{−t²/2} dt
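A simulation sketch (standardized sums of U(0, 1) samples, an arbitrary choice):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n, reps = 100, 20_000
    mu, sigma = 0.5, np.sqrt(1 / 12)          # mean and std of U(0, 1)

    s = rng.uniform(0, 1, (reps, n)).sum(axis=1)
    z = (s - n * mu) / (sigma * np.sqrt(n))

    # Compare an empirical probability with the standard normal CDF
    print((z <= 1.0).mean())                  # ~0.841
    print(stats.norm.cdf(1.0))                # 0.8413...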

13 Common Discrete Distributions


13.1 Bernoulli Distribution
The Bernoulli distribution is observed while tossing a coin or in other simple success/failure
experiments. PMF:

P(0) = 1 − p, P(1) = p

13.2 Binomial Distribution


Binomial distribution PMF:

P(k) = C(n, k) p^k (1 − p)^{n−k}, k = 0, 1, ..., n

Properties:
• Denoted by X ∼ B(n, p); n, p are the parameters of the distribution.
• A binomial RV can be decomposed as a sum of n independent Bernoulli(p) random variables.
• Mean: E(X) = np
• Variance: var(X) = np(1 − p)
• MGF: M(t) = (pe^t + 1 − p)^n

13.3 Geometric Distribution
PMF:

P(k) = (1 − p)^{k−1} p, k = 1, 2, ...

Properties:
• In a sequence of Bernoulli trials, the number of trials up to and including the 1st success is
geometrically distributed.
• Denoted by X ∼ Geo(p).
• Mean: E(X) = 1/p
• Variance: var(X) = (1 − p)/p²
• MGF: M(t) = pe^t/(1 − (1 − p)e^t)

13.4 Negative Binomial Distribution


PMF:

P(k) = C(k − 1, r − 1) p^r (1 − p)^{k−r}, k = r, r + 1, ...

Properties:
• In a sequence of Bernoulli trials, the trial number of the r-th success follows a NB distribution.
• Denoted by X ∼ NB(r, p).
• Mean: E(X) = r/p
• Variance: var(X) = r(1 − p)/p²
• MGF: M(t) = (pe^t/(1 − (1 − p)e^t))^r

13.5 Poisson Distribution


PMF:

P(k) = e^{−λ} λ^k/k!, k = 0, 1, 2, ...

Properties:
• Denoted by X ∼ Poisson(λ).
• Mean: E(X) = λ
• Variance: var(X) = λ
• MGF: M(t) = e^{λ(e^t − 1)}

14 Common Continuous Distributions
14.1 Uniform Distribution
PDF:

f(x) = 1/(b − a) for a < x < b, and 0 otherwise

Properties:
• Denoted by X ∼ U(a, b).
• Mean: E(X) = (a + b)/2
• Variance: var(X) = (b − a)²/12
• MGF: M(t) = (e^{bt} − e^{at})/(t(b − a))

14.2 Exponential Distribution


PDF:

f(x) = λe^{−λx} for x > 0, and 0 otherwise

Properties:
• Denoted by X ∼ exp(λ).
• Mean: E(X) = 1/λ
• Variance: var(X) = 1/λ²
• MGF: M(t) = 1/(1 − t/λ), t < λ
• The exponential distribution satisfies the Markov or memoryless property, i.e.

P(X > t + s / X > s) = P(X > t)

The geometric distribution also has this property.
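A numerical sketch of memorylessness (λ = 1 and the values of t, s are arbitrary choices):

    from scipy import stats

    X = stats.expon(scale=1.0)        # exp(1); sf(x) = P(X > x)
    t, s = 1.5, 2.0

    # P(X > t + s / X > s) = P(X > t + s)/P(X > s) should equal P(X > t)
    print(X.sf(t + s) / X.sf(s))      # 0.2231...
    print(X.sf(t))                    # 0.2231...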

14.3 Gamma Distribution


PDF:

f(x) = λ^r x^{r−1} e^{−λx}/Γ(r) for x > 0, and 0 otherwise

Properties:
• Denoted by X ∼ Gamma(r, λ).
• For integer r, it is the distribution of a sum of r independent exp(λ) RVs.
• Mean: E(X) = r/λ
• Variance: var(X) = r/λ²
• MGF: M(t) = (1/(1 − t/λ))^r, t < λ

14.4 Normal Distribution
PDF:

f(x) = (1/(σ√(2π))) e^{−(x−µ)²/(2σ²)}, x ∈ R

Properties:
• Denoted by X ∼ N(µ, σ²).
• If we consider Z = (X − µ)/σ then Z is standard normal distributed, i.e. Z ∼ N(0, 1) with

f(z) = (1/√(2π)) e^{−z²/2}

• Mean: µ
• Variance: σ²
• MGF: M(t) = e^{µt + σ²t²/2}
