PCS NOTES M1 (1)
1.1 Probability
Laplace also said, "Probability theory is nothing but common sense reduced to calculation."
Probability has grown to be one of the most essential mathematical tools applied in diverse fields like economics,
commerce, physical sciences, biological sciences and engineering. It is particularly important for solving practical
electrical-engineering problems in communication, signal processing and computers.
Probabilistic models are established from observation of a random phenomenon. While probability is concerned with the analysis of a random phenomenon, statistics helps in building such models from data.
Many physical quantities are random in the sense that they cannot be predicted with certainty and can be described only in terms of probabilistic models.
For example,
The outcome of the tossing of a coin cannot be predicted with certainty. Thus the outcome of tossing a coin is
random.
The number of ones and zeros in a packet of binary data arriving through a communication channel cannot be precisely predicted, and is therefore random.
The ubiquitous noise corrupting the signal during acquisition, storage and transmission can be modeled only through statistical analysis.
How to Interpret Probability
Mathematically, the probability that an event will occur is expressed as a number between 0 and 1.
Notationally, the probability of event A is represented by P(A).
If P(A) equals zero, event A will almost definitely not occur.
If P(A) is close to zero, there is only a small chance that event A will occur.
If P(A) equals 0.5, there is a 50-50 chance that event A will occur.
If P(A) is close to one, there is a strong chance that event A will occur.
If P(A) equals one, event A will almost definitely occur.
In a statistical experiment, the sum of probabilities for all possible outcomes is equal to one. This means, for example, that if an experiment can have three possible outcomes (A, B, and C), then P(A) + P(B) + P(C) = 1.
The relative-frequency definition of probability is also known as a posteriori probability, i.e., the probability determined after the event.
Consider two events A and B of the random experiment. Suppose we conduct 'n' independent trials of this experiment and events A and B occur in n(A) and n(B) trials respectively, so that the relative frequencies n(A)/n and n(B)/n approximate P(A) and P(B) for large n.
When A and B are mutually exclusive events,
∴ P(A ∪ B) = P(A + B) = P(A) + P(B)
When they are not mutually exclusive,
P(A ∪ B) = P(A) + P(B) − P(A ∩ B), i.e., P(A + B) = P(A) + P(B) − P(AB)
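As an illustrative check of the addition rule (not part of the original notes), the following Python sketch counts outcomes of one fair die for two assumed events A = "even outcome" and B = "outcome greater than 3" and compares P(A ∪ B) with P(A) + P(B) − P(A ∩ B).

```python
# Minimal sketch (illustrative, not from the notes): checking the addition rule
# P(A ∪ B) = P(A) + P(B) − P(A ∩ B) for one fair die with
# A = "even outcome" and B = "outcome greater than 3".
from fractions import Fraction

sample_space = set(range(1, 7))     # outcomes of one fair die
A = {2, 4, 6}                       # event A: even outcome
B = {4, 5, 6}                       # event B: outcome greater than 3

def prob(event):
    """Classical probability: favourable outcomes / total outcomes."""
    return Fraction(len(event), len(sample_space))

lhs = prob(A | B)                          # P(A ∪ B) counted directly
rhs = prob(A) + prob(B) - prob(A & B)      # addition rule
print(lhs, rhs, lhs == rhs)                # 2/3 2/3 True
```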
Example 1) An experiment is repeated a number of times as shown below. Find the probability of the event "getting a head" in each case.

Number of trials | Number of heads
1                | 1
10               | 6
100              | 50

Solution: Relative frequency = n(Head)/n, giving 1/1 = 1, 6/10 = 0.6 and 50/100 = 0.5 for the three cases; as the number of trials increases, the relative frequency approaches P(Head) = 0.5.
Example 2): Suppose a coin is flipped 3 times. What is the probability of getting two tails and one head?
Solution: For this experiment, the sample space consists of 8 sample points.
S = {TTT, TTH, THT, THH, HTT, HTH, HHT, HHH}
Each sample point is equally likely to occur, so the probability of getting any particular sample point is 1/8. The
event "getting two tails and one head" consists of the following subset of the sample space.
A = {TTH, THT, HTT}
The probability of Event A is the sum of the probabilities of the sample points in A.
Therefore, P(A) = 1/8 + 1/8 + 1/8 = 3/8
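A minimal Python sketch of this example, enumerating the 8 equally likely sample points and counting those with exactly two tails and one head; the enumeration approach is just one convenient way to verify the count.

```python
# Minimal sketch of this example: enumerate the 8 equally likely outcomes of
# three coin flips and count those with exactly two tails and one head.
from itertools import product

sample_space = list(product("HT", repeat=3))             # 8 sample points
event_A = [s for s in sample_space if s.count("T") == 2]

print(event_A)                           # [('H','T','T'), ('T','H','T'), ('T','T','H')]
print(len(event_A) / len(sample_space))  # 0.375 = 3/8
```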
Fig.1 Illustration of the relationship between sample space, events and probability.
1. If A and B are two mutually exclusive events, P(A ∪ B) = P(A) + P(B)
2. If A and B are not two mutually exclusive events, P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
3. If A1, A2, A3, …, Am are mutually exclusive and exhaustive events, then P(A1) + P(A2) + … + P(Am) = 1
Example 3) If two coins are tossed simultaneously, determine the probability of obtaining exactly two heads.
Solution: Number of sample points = 2 × 2 = 4
S = {(T, T), (T, H), (H, T), (H, H)}; P(getting two heads) = 1/4
Venn Diagrams:
Pictorial representations of sets represented by closed figures are called set diagrams or Venn diagrams.
Venn diagrams are used to illustrate various operations like union, intersection and difference.
1.4 CONDITIONAL PROBABILITY
Suppose a random experiment or signal is characterized by two events (random variables) A and B that are not independent. Then knowing the outcome of one of them, say A, would influence the values observed for the other, B.
Conditional probability of B with respect to A: It is the probability of the event B, under the condition that the
event A happens. That means, conditional probability represents the probability of B occurring, given that A has
already occurred. The conditional probability P(B|A) can be written in terms of the joint probability P(AB) and the
probability of the event P(A).
P(B|A) means "Event B, given Event A", i.e., the probability of occurrence of B when A has already occurred.
P(A|B) represents the probability of occurrence of A given B has occurred.
N(A ∩ B) is the number of elements common to both A and B.
N(B) is the number of elements in B, and it cannot be equal to zero.
Let N represent the total number of elements in the sample space.
P(A|B) = P(A ∩ B)/P(B)
Therefore, P(A ∩ B) = P(B) P(A|B) if P(B) ≠ 0
= P(A) P(B|A) if P(A) ≠ 0
Similarly, the probability of occurrence of B when A has already occurred is given by P(B|A) = P(A ∩ B)/P(A).
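A hedged Python sketch of conditional probability as a relative frequency: the events (A = "sum of two dice is 7", B = "first die shows 4") and the trial count are illustrative assumptions, not taken from the notes.

```python
# Hedged sketch: estimating a conditional probability by relative frequency.
# Two dice are rolled; A = "sum is 7" is conditioned on B = "first die shows 4".
# The events and the trial count are illustrative assumptions.
import random

random.seed(0)
n, n_B, n_AB = 100_000, 0, 0
for _ in range(n):
    d1, d2 = random.randint(1, 6), random.randint(1, 6)
    if d1 == 4:                     # event B occurred
        n_B += 1
        if d1 + d2 == 7:            # events A and B occurred together
            n_AB += 1

print(n_AB / n_B)                   # relative-frequency estimate of P(A|B) ≈ 1/6
print((n_AB / n) / (n_B / n))       # same value via P(A ∩ B) / P(B)
```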
Here, S denotes the sample space, X denotes a random variable, and the condition considered is the number of heads.
The cumulative distribution function (CDF) FX(x) may be recovered from the density function fX(x) by integration: FX(x) = ∫_{−∞}^{x} fX(u) du.
Uniform Distribution Function: A random variable X is said to be uniformly distributed over the interval (a, b) if its PDF is fX(x) = 1/(b − a) for a ≤ x ≤ b, and fX(x) = 0 otherwise.
Fig. 2 The uniform distribution (a) The probability density function (b) The distribution function
As the CDF FX(x) is non-decreasing and FX(∞) = 1, the total area under the curve of the density function is unity.
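The following Python sketch illustrates the uniform density numerically; the interval (2, 5) and the grid spacing are illustrative assumptions. It checks that the area under fX is one and that the numerically integrated CDF rises from 0 to 1.

```python
# Illustrative sketch of the uniform density over an assumed interval (a, b) = (2, 5):
# the area under f_X(x) is one and the numerically integrated CDF rises from 0 to 1.
import numpy as np

a, b = 2.0, 5.0
x = np.linspace(0.0, 7.0, 1401)
dx = x[1] - x[0]
f = np.where((x >= a) & (x <= b), 1.0 / (b - a), 0.0)   # PDF: 1/(b − a) on (a, b)
F = np.cumsum(f) * dx                                   # numerical integral -> CDF

print(f.sum() * dx)          # ≈ 1.0 (total area under the density)
print(F[0], F[-1])           # ≈ 0.0 and ≈ 1.0
```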
The joint distribution function FX,Y(x, y) is a monotone non-decreasing function of both x and y. Hence, the joint probability density function fX,Y(x, y) is always nonnegative. Also, the total volume under the graph of a joint probability density function must be unity, as shown by ∫_{−∞}^{∞} ∫_{−∞}^{∞} fX,Y(x, y) dx dy = 1.
If X and Y are independently distributed, then P(Y = y, X = x) = P(Y = y) P(X = x); that is, the joint distribution equals the product of the marginal distributions. The corresponding factoring holds for the CDF: FX,Y(x, y) = FX(x) FY(y).
The expectation of an arbitrary function g(X) of a random variable X is E[g(X)] = ∫_{−∞}^{∞} g(x) fX(x) dx. Indeed, this equation may be viewed as generalizing the concept of expected value to an arbitrary function g(X) of a random variable X.
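A small Python sketch of E[g(X)] as a probability-weighted sum for a discrete random variable; the fair-die PMF and the choice g(X) = X² are illustrative assumptions.

```python
# Illustrative sketch: E[g(X)] as a probability-weighted sum, here for an assumed
# fair die with g(X) = X**2.
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}   # fair die: P(X = x) = 1/6

def expect(g, pmf):
    """E[g(X)] = sum over x of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

print(expect(lambda x: x, pmf))        # E[X]   = 7/2
print(expect(lambda x: x ** 2, pmf))   # E[X^2] = 91/6
```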
1.9 MOMENTS
Moments are parameters (the special case g(X) = X^n) that are important in the characterization of probability distributions and probability density functions.
Two types: (1) Moment about the origin: the nth moment about the origin is E[X^n]. (2) Moment about the mean (central moment): the nth central moment is E[(X − μX)^n].
The correlation between random variables X and Y, measured by the covariance, is given by cov[X, Y] = E[(X − μX)(Y − μY)] = E[XY] − μX μY.
Let σX² and σY² denote the variances of X and Y, respectively. Then the covariance of X and Y, normalized with respect to σX σY, is called the correlation coefficient of X and Y: ρ = cov[X, Y] / (σX σY).
The two random variables X and Y are uncorrelated if and only if their covariance is zero, that is, if and only if
cov[XY] = 0
Also they are orthogonal if and only if their correlation is zero, that is, if and only if
E[XY] = 0
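To make the distinction concrete, the Python sketch below estimates cov[X, Y], the correlation coefficient and E[XY] from simulated data; the model Y = 2X + noise is an illustrative assumption chosen so that X and Y are correlated (hence neither uncorrelated nor orthogonal).

```python
# Illustrative sketch: sample estimates of covariance, correlation coefficient
# and E[XY].  The model Y = 2X + noise is an assumption, chosen so that X and Y
# are correlated (neither uncorrelated nor orthogonal).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, 100_000)
Y = 2.0 * X + rng.normal(0.0, 1.0, 100_000)

cov_XY = np.mean((X - X.mean()) * (Y - Y.mean()))   # cov[X, Y] ≈ 2
rho = cov_XY / (X.std() * Y.std())                  # correlation coefficient ≈ 0.89
print(cov_XY, rho)
print(np.mean(X * Y))                               # E[XY]; zero would mean orthogonal
```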
Moments of a Bernoulli Random Variable
A Bernoulli trial is an experiment that has two possible outcomes, a success and a failure. Consider the coin-
tossing experiment where the probability of a head is p. Let X be a random variable that takes the value 0 if the
result is a tail and 1 if it is a head. We say that X is a Bernoulli random variable.
The probability mass function of a Bernoulli random variable is P(X = 0) = 1 − p and P(X = 1) = p. Its mean is E[X] = p and its variance is E[X²] − (E[X])² = p(1 − p).
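A short Python sketch comparing the Bernoulli formulas E[X] = p and var[X] = p(1 − p) with sample averages; the value p = 0.3 and the trial count are illustrative assumptions.

```python
# Illustrative sketch: Bernoulli moments, comparing E[X] = p and var[X] = p(1 − p)
# with simulated averages.  p = 0.3 and the trial count are assumptions.
import numpy as np

p = 0.3
rng = np.random.default_rng(0)
X = (rng.random(100_000) < p).astype(float)   # Bernoulli trials: 1 = success, 0 = failure

print(p, X.mean())                 # E[X] = p          vs  sample mean
print(p * (1 - p), X.var())        # var[X] = p(1 − p) vs  sample variance
```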
Mean, auto-covariance, and auto-correlation functions are statistical averages used to describe a random process.
Mean function: the mean of a random process X(t) is mX(t) = E[X(t)]; for a process that is stationary to first order, this mean is a constant.
In general, mX(t) is a function of time; it specifies the average behavior of X(t) over time. E denotes the statistical expectation operator. Expectation provides a description of the random variable in terms of a few parameters instead of specifying the entire distribution function or the density function. It is far easier to estimate the expectation E of a random variable from data than to estimate its distribution.
Variance function: The variance of a random variable is a measure of the spread (dispersion) of the probability distribution about the mean. The square root of the variance is known as the standard deviation. If the values tend to be concentrated near the mean, the variance is small; if the values tend to be distributed far from the mean, the variance is large (see fig. (a)). For discrete random variables, the variance is the expectation of the squared distance of each outcome from the mean value of the distribution: σX² = E[(X − μX)²] = Σx (x − μX)² P(X = x).
Fig.(a) Illustration of variance for small and large values. (b) Autocorrelation function of fluctuating random process.
For a random process that is stationary to second order, the autocorrelation function (ACF) depends only on the time difference t2 − t1.
RX(t1, t2) is defined as the correlation between the two time samples X(t1) and X(t2): RX(t1, t2) = E[X(t1) X(t2)].
Fig. Random signals with different frequency content and their autocorrelations
Auto-covariance function:
Auto-covariance is similar to autocorrelation; it is the autocorrelation of the time-varying (zero-mean) part of a signal. CX(t1, t2) is defined as the covariance between the two time samples X(t1) and X(t2): CX(t1, t2) = E[(X(t1) − mX(t1))(X(t2) − mX(t2))] = RX(t1, t2) − mX(t1) mX(t2).
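The Python sketch below estimates RX and CX by averaging over an ensemble of realizations; the particular process (white Gaussian noise smoothed by a 3-point moving average) is an illustrative assumption, chosen because its correlation dies out after a lag of two samples.

```python
# Illustrative sketch: estimating R_X(t1, t2) and C_X(t1, t2) by averaging over
# an ensemble of realizations.  The assumed process is white Gaussian noise
# smoothed by a 3-point moving average, so its correlation vanishes beyond lag 2.
import numpy as np

rng = np.random.default_rng(0)
n_real, n_time = 2000, 200
white = rng.normal(0.0, 1.0, (n_real, n_time + 2))          # one row per realization
X = (white[:, :-2] + white[:, 1:-1] + white[:, 2:]) / 3.0   # 3-point moving average

t1 = 100
for tau in range(4):
    t2 = t1 + tau
    R = np.mean(X[:, t1] * X[:, t2])                          # autocorrelation
    C = np.mean((X[:, t1] - X[:, t1].mean()) *
                (X[:, t2] - X[:, t2].mean()))                 # autocovariance
    print(tau, round(R, 3), round(C, 3))    # both decay to ≈ 0 by tau = 3
```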
Ergodicity: If all of the sample functions of a random process have the same statistical properties, the random process is said to be ergodic. The most important consequence of ergodicity is that ensemble moments can be replaced by time moments.
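A minimal Python sketch of this consequence: for an assumed stationary Gaussian process with mean 2, the time average of a single realization and the ensemble average at a fixed time agree.

```python
# Illustrative sketch of ergodicity: for an assumed stationary Gaussian process
# with mean 2, the time average of one realization matches the ensemble average
# taken across realizations at a fixed time.
import numpy as np

rng = np.random.default_rng(0)
ensemble = rng.normal(2.0, 1.0, (2000, 2000))   # rows: realizations, columns: time

print(ensemble[0].mean())       # time average of a single realization  ≈ 2.0
print(ensemble[:, 0].mean())    # ensemble average at a fixed time      ≈ 2.0
```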
For the special case when the Gaussian random variable Y is normalized to have a mean of zero and a variance of one, its density is fY(y) = (1/√(2π)) exp(−y²/2).
The central limit theorem states that the probability distribution of VN approaches a normalized Gaussian distribution N(0, 1) in the limit as N approaches infinity.
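A hedged Python demonstration of the central limit theorem: here VN is taken as the normalized sum of N independent uniform(0, 1) variables (an illustrative choice), and its sample mean, variance and tail behaviour are compared with N(0, 1).

```python
# Illustrative sketch of the central limit theorem: V_N is taken here as the
# normalized sum of N independent uniform(0, 1) variables (an assumption), and
# its statistics are compared with N(0, 1).
import numpy as np

rng = np.random.default_rng(0)
N, trials = 50, 100_000
U = rng.random((trials, N))                          # i.i.d. uniform(0, 1) samples
V = (U.sum(axis=1) - N * 0.5) / np.sqrt(N / 12.0)    # zero-mean, unit-variance sum

print(V.mean(), V.var())            # ≈ 0 and ≈ 1, as N(0, 1) predicts
print(np.mean(np.abs(V) < 1.96))    # ≈ 0.95, the Gaussian probability of |V| < 1.96
```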
PROPERTY 1: If a Gaussian process X(t) is applied to an LTI filter, then the random process Y(t) developed at the output of the filter is also Gaussian.
We assume that X(t) is a Gaussian process. The random processes Y(t) and X(t) are related by the convolution integral Y(t) = ∫ h(τ) X(t − τ) dτ, where h(τ) is the impulse response of the filter.
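As an illustrative check of Property 1 (not a derivation), the Python sketch below passes white Gaussian noise through an assumed FIR filter via discrete convolution and verifies that the output moments stay Gaussian-like (kurtosis ≈ 3).

```python
# Illustrative check of Property 1: white Gaussian noise is passed through an
# assumed FIR filter by discrete convolution, and the output's kurtosis is
# compared with the Gaussian value of 3.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 200_000)            # input: white Gaussian process
h = np.array([0.5, 0.3, 0.2])                # assumed impulse response of the LTI filter
y = np.convolve(x, h, mode="valid")          # Y(t) = sum_k h[k] X(t − k)

kurtosis = np.mean((y - y.mean()) ** 4) / y.var() ** 2
print(y.mean(), y.var(), kurtosis)           # ≈ 0, ≈ 0.38, ≈ 3 (Gaussian-like)
```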
PROPERTY 2: The mean and autocorrelation functions completely characterize a Gaussian random process.
Consider the set of random variables (samples) X(t1), X(t2), …, X(tn), obtained by observing a random process X(t) at times t1, t2, …, tn. If the process X(t) is Gaussian, then this set of random variables is jointly Gaussian for any n, with their n-fold joint probability density function being completely determined by specifying the set of means and the set of covariance functions of the samples.
PROPERTY 3: Gaussian wide-sense stationary (WSS) processes are stationary in the strict sense.
PROPERTY 4: If the random variables X(t1), X(t2), …, X(tn), obtained by sampling a Gaussian process X(t) at times t1, t2, …, tn, are uncorrelated, that is, their covariances are zero, then these random variables are statistically independent.