Probability

Unit 3 covers fundamental concepts of probability including conditional probability, measures of central tendency (mean, median, mode), and standard deviation. It explains the definitions of sample space, events, and the rules of probability, including independent and dependent events, as well as the addition and multiplication rules. Additionally, it introduces basic statistics and their applications in data analysis.

CSE011

UNIT III
Unit 3: Probability                              COs
A  Probability: Conditional Probability          CO3, CO6
B  Mean, Median, Mode and Standard Deviation     CO3, CO6
C  Random Variables; Distributions               CO3, CO6
Probability
• Probability is defined as the likelihood of the occurrence of any event.
• Probability is expressed as a number between 0 and 1, where:
• 0 is the probability of an impossible event and 1 is the probability of a sure
event.
Sample Space (Ω/S)
• A Sample Space is the set of all possible outcomes of an experiment
or a random phenomenon. Sample Space is denoted by the symbol
“S” and represents all the possible outcomes that can occur.

• E.g. when flipping a coin, the sample space is {heads, tails}, because those are
the only two possible outcomes.
• Similarly, when rolling a six-sided die, the sample space is {1, 2, 3, 4, 5, 6},
because those are the only possible outcomes.
Event
• An event can be defined as any outcome or set of outcomes from a
random experiment.
• An event in probability is the subset of the respective sample space.
Example:
1. If you roll a die, the event could be “getting a 3” or “getting an even
number.”
2. If you toss two coins simultaneously, the event could be “getting at
least one head” or “getting two tails”.
Probability of an Event
• The probability of an event E, denoted by P(E), is a number between 0 and 1
that represents the likelihood of Event E occurring.

• If P(E) = 0, the event E is impossible.


• If P(E) = 1, the event E is certain to occur.
• If 0 < P(E) < 1, the event E is possible but not guaranteed.

• The probabilities of all the individual outcomes in a sample space always sum to 1.
• In a die-rolling experiment:
Possible outcomes: {1, 2, 3, 4, 5, 6}
Then P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 1
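As a quick check of the rule above, the following minimal sketch assigns each face of a die the probability 1/6 and sums them. It assumes a fair die, and the variable names are illustrative only.

```python
from fractions import Fraction

# Sample space of a six-sided die (fairness is an assumption of this sketch)
sample_space = [1, 2, 3, 4, 5, 6]

# Each outcome is equally likely, so P(outcome) = 1 / |S|
probabilities = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

# The probabilities of all outcomes sum to 1
total = sum(probabilities.values())

# Probability of the event "getting an even number" (a subset of S)
p_even = sum(probabilities[o] for o in sample_space if o % 2 == 0)

print(total)   # 1
print(p_even)  # 1/2
```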
Types of Events
Dependent and Independent Events:
Dependent events are those in which the probability of an event
changes based on previous outcomes.
• Example 1: Drawing two cards from a deck without replacement.
If you draw one card and do not replace it, the total number of cards in the deck
changes. The probability of drawing a specific card on the second draw is
affected by the outcome of the first draw, hence they are dependent events.
• Example 2: Picking a marble from a bag, not replacing it, and then picking
another marble.
If the first marble is not replaced, the total number of marbles changes, which
influences the probability of picking the second marble. Hence, the events are
dependent.
• Independent events are those in which the probability of an event
remains the same, regardless of previous outcomes.

Example 1: Flipping a coin twice.


• The outcome of the first flip (heads or tails) does not affect the
outcome of the second flip. The probability of each flip remains 1/2, so
the events are independent.
Example 2: Rolling a die and flipping a coin.
• The result of rolling the die (e.g., getting a 4) has no impact on the
result of flipping the coin (heads or tails). Both events are independent.
Independent Events vs. Dependent Events
Independent Events
• Definition: events that are not affected by the occurrence of other events.
• Formula: P(A∩B) = P(A)⋅P(B)
• Examples: Tossing one coin is not affected by the tossing of other coins. Rain on a given day and rolling a six on a die are independent events.
Dependent Events
• Definition: events that are affected by the occurrence of other events.
• Formula: P(A∩B) = P(A)⋅P(B∣A)
• Example: The probability of drawing a red ball from a box of 4 red balls and 3 green balls changes if we take two balls out of the box.
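The dependent-events formula can be illustrated with the box of 4 red and 3 green balls. The sketch below assumes two balls are drawn without replacement and computes the probability that both are red, first with P(A∩B) = P(A)⋅P(B∣A) and then by listing every ordered pair of draws. The scenario of drawing exactly two balls and all names are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import permutations

balls = ["red"] * 4 + ["green"] * 3

# Multiplication rule for dependent events: P(A∩B) = P(A)·P(B|A)
p_first_red = Fraction(4, 7)
p_second_red_given_first_red = Fraction(3, 6)  # one red ball is already gone
p_both_red = p_first_red * p_second_red_given_first_red

# Cross-check by enumerating every ordered pair of distinct balls
draws = list(permutations(range(len(balls)), 2))
both_red = [d for d in draws if balls[d[0]] == "red" and balls[d[1]] == "red"]
p_both_red_enumerated = Fraction(len(both_red), len(draws))

print(p_both_red)             # 2/7
print(p_both_red_enumerated)  # 2/7
```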
Basic Probability Rules
• Addition Rule: P(A∪B) = P(A) + P(B) – P(A∩B), where A∪B denotes the
union of events A and B.
• Multiplication Rule for Independent Events: P(A∩B) = P(A) × P(B),
where A and B are independent events.
• Complement Rule: P(A′) = 1 – P(A), where A′ denotes the
complement of event A.
• The addition rule for probability is a principle that allows you to calculate the probability that at least one of two events will occur.
• It is defined as the sum of the probabilities of each event, minus the probability that both events occur together. This prevents double-counting the overlap between the events.
• The General Addition Rule for Probability is given by P(A or B) = P(A) + P(B) – P(A and B), where A and B are the two events. For mutually exclusive events, P(A and B) = 0, so P(A or B) = P(A) + P(B).
Non-Mutually Exclusive Events
• Two events, A and B, are said to be non-mutually exclusive if they can
occur simultaneously during a single trial.

• Example: Rolling a die, let A represent rolling an odd number ({1, 3,
5}) and B represent rolling a 3 ({3}). In this case, the number 3 belongs
to both events, meaning A and B overlap.
Venn Diagram: Non-mutually exclusive events are represented
as overlapping circles in a Venn diagram, with the shared outcomes
placed in the intersection.
The two circles representing Event A ({1, 3, 5}) and Event B ({3})
intersect, showing an overlap at 3.
• Explanation: The outcomes 1, 3, and 5 lie in the circle that denotes
event A. The outcome 3 is common to both events, so it lies in the
intersection. The outcomes 2, 4, and 6 belong to neither event, so they
lie outside both circles, within the sample space.

• Addition Rule: P(A ∪ B) = P(A) + P(B) − P(A ∩ B)


• Here:
• P(A): Probability of rolling an odd number = 3/6 = 1/2,
• P(B): Probability of rolling a 3 = 1/6,
• P(A ∩ B): Probability of rolling a number that is both odd and 3 = 1/6.
• Substitute into the formula: P(A ∪ B) = 1/2 + 1/6 − 1/6 = 1/2.
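The substitution above can be checked by treating events as Python sets. This is a minimal sketch assuming a fair die; the helper `prob` is a hypothetical name introduced here.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}
A = {1, 3, 5}  # rolling an odd number
B = {3}        # rolling a 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(sample_space))

# Left side: probability of the union, counted directly
lhs = prob(A | B)
# Right side: addition rule, subtracting the overlap once
rhs = prob(A) + prob(B) - prob(A & B)

print(lhs, rhs)  # 1/2 1/2
```

Because B is contained in A here, the union is just A, which is why the result equals P(A).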
Conditional Probability
• Conditional probability is the probability that an event A occurs
given that another event B, related to A, has already occurred.
It is denoted by P(A|B).

P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0

P(A ∩ B) represents the probability of both events A and B
occurring simultaneously.
P(B) represents the probability of event B occurring.
Steps to find conditional probability
• Step 1: Identify the Events. Let’s call them Event A and Event B.
• Step 2: Determine the Probability of Event A i.e., P(A)
• Step 3: Determine the Probability of Event B i.e., P(B)
• Step 4: Determine the Probability of Event A and B i.e., P(A∩B).
• Step 5: Apply the Conditional Probability Formula and calculate the
required probability.
Q. Two dice are rolled. Find the conditional probabilities relating a 3 on the first die and a sum of 9.
The sample space of this experiment is as follows:
S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6), (3, 1), (3, 2),
(3, 3), (3, 4), (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6), (5, 1), (5, 2), (5, 3), (5, 4), (5,
5), (5, 6), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
Consider the events A = getting 3 on the first die; B = getting a sum of 9.
To find the probability of getting a sum of 9 when the first die already shows 3, i.e., P(B|A):
All the cases with 3 on the first die are (3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6).
Of these 6 cases, only one, (3, 6), has a sum of 9.
Thus, P(B|A) = 1/6. Note that P(A ∩ B) = 1/36 is the joint probability, so P(B|A) = (1/36)/(6/36) = 1/6.

If instead we have to find P(A|B):
All cases where the sum is 9 are (3, 6), (4, 5), (5, 4), and (6, 3).
Of these 4 cases, only one has 3 on the first die, i.e., (3, 6).
Thus, P(A|B) = 1/4.
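The two-dice question can also be worked by enumerating the 36 outcomes and applying P(B|A) = P(A ∩ B) / P(A). Conditioning on A restricts attention to the six outcomes with 3 on the first die, so the enumeration gives P(B|A) = 1/6 and P(A|B) = 1/4. This is a sketch assuming fair dice; the names `S`, `A`, `B`, and `prob` are chosen for illustration.

```python
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of rolling two dice
S = list(product(range(1, 7), repeat=2))

A = [o for o in S if o[0] == 3]      # 3 on the first die
B = [o for o in S if sum(o) == 9]    # sum of 9
A_and_B = [o for o in A if o in B]   # both at once: only (3, 6)

def prob(event):
    """Probability of an event when all 36 outcomes are equally likely."""
    return Fraction(len(event), len(S))

p_B_given_A = prob(A_and_B) / prob(A)
p_A_given_B = prob(A_and_B) / prob(B)

print(p_B_given_A)  # 1/6
print(p_A_given_B)  # 1/4
```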
Conditional Probability and Independent Events
When the probability of one event happening doesn’t influence the
probability of any other event, the events are called independent;
otherwise, they are dependent.
When two events are independent, the conditional probability is the same
as the probability of the event on its own, i.e., P(A|B) is the same as P(A),
because event B has no effect on the probability of event A. For independent
events A and B, the conditional probabilities with respect to each
other are as follows:

P(B|A) = P(B)
P(A|B) = P(A)
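To see these identities on a concrete case, the sketch below uses the earlier die-and-coin example, with A = "the die shows 4" and B = "the coin shows heads". A fair die, a fair coin, and the helper `prob` are assumptions of this illustration.

```python
from fractions import Fraction
from itertools import product

# Joint sample space: every (die face, coin side) pair, 12 outcomes in total
S = list(product(range(1, 7), ["H", "T"]))

A = [o for o in S if o[0] == 4]      # die shows 4
B = [o for o in S if o[1] == "H"]    # coin shows heads
A_and_B = [o for o in A if o in B]

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(S))

p_A_given_B = prob(A_and_B) / prob(B)
p_B_given_A = prob(A_and_B) / prob(A)

print(p_A_given_B, prob(A))  # 1/6 1/6
print(p_B_given_A, prob(B))  # 1/2 1/2
```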
Conditional Probability and Bayes’ Theorem
• Bayes’ Theorem provides a mathematical framework for updating beliefs or
hypotheses in light of new evidence or information. This theorem is
extensively used in various fields, including statistics, machine
learning, and artificial intelligence.
• At its core, Bayes’ Theorem enables us to calculate the probability of a
hypothesis being true given observed evidence. The theorem is
expressed mathematically as follows:
P(A|B) = P(B|A) ⋅ P(A) / P(B), where P(B) > 0
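As an illustration, the sketch below applies the standard form of Bayes' Theorem (stated in the code comment) to the two-dice events used earlier: A = 3 on the first die, B = sum of 9. The input probabilities are taken from that example; the variable names are illustrative.

```python
from fractions import Fraction

# Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)
p_A = Fraction(6, 36)          # 3 on the first die
p_B = Fraction(4, 36)          # sum of 9: (3,6), (4,5), (5,4), (6,3)
p_B_given_A = Fraction(1, 6)   # of the six rolls starting with 3, only (3,6) sums to 9

p_A_given_B = p_B_given_A * p_A / p_B

print(p_A_given_B)  # 1/4
```

The result agrees with counting directly: one of the four outcomes that sum to 9 has 3 on the first die.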
Mean, Median, Mode, Standard Deviation
• Statistics is a branch of mathematics that transforms data into
useful insights for decision makers.
• Statistics is one of the most important disciplines for providing tools and
methods to find structure in data and gain deeper insight into it, and
the most important discipline for analyzing and quantifying uncertainty.
• It is the backbone of hypothesis testing, machine learning, deep
learning, and related concepts.
Basic Statistics
• Measure of Central Tendency
• Measure of Variability/Dispersion
Measure of Central Tendency
• A measure of central tendency is a descriptive statistic that
represents the center point of a dataset (i.e., it describes the data in a
single value by identifying the central position).
Mean
• The mean is one of the measures of central tendency. It gives the
average of the data (i.e., the sum of all the values divided by the total
number of values).
Python Code for Mean

# Importing packages
import statistics
import numpy as np

# Sample data
data = [1, 2, 4, 5, 6, 76, 8, 45]

# Using the simple mean formula
mean = sum(data) / len(data)
print(mean)                    # 18.375

# Using the numpy package
print(np.mean(data))           # 18.375

# Using the statistics package
print(statistics.mean(data))   # 18.375
Median
• The median is the middle value of the sorted data. It is mostly
used in outlier detection/removal and in imputing missing values
during data preprocessing.
import statistics
import numpy as np

data = [1, 2, 4, 5, 6, 76, 8, 45]

# Using the formula without Python packages (0-based indexing):
#   odd n  -> the middle value, at index n // 2
#   even n -> the average of the two middle values,
#             at indices n // 2 - 1 and n // 2
sorted_data_median = sorted(data)
print("Sorted Data:", sorted_data_median)
n = len(sorted_data_median)
m1 = n // 2
if n % 2 == 1:
    median = sorted_data_median[m1]
else:
    m2 = m1 - 1
    print(f"Position of the data: {m1} and {m2}")
    print(f"Values in the Position: {sorted_data_median[m1]} and {sorted_data_median[m2]}")
    median = (sorted_data_median[m1] + sorted_data_median[m2]) / 2
print("Median:", median)           # Median: 5.5

# Using numpy package
print(np.median(data))             # 5.5

# Using statistics package
print(statistics.median(data))     # 5.5
• Finding the median without using packages (n = number of elements, positions counted from 1 in the sorted data):
If n is odd: the median is the value at position (n+1)/2.
If n is even: the median is the average of the values at positions n/2 and (n/2) + 1.
With Python's 0-based indexing, the two middle values for even n are at:
m1 = index n/2
m2 = index (n/2) - 1
Median = (value at m1 + value at m2)/2
For the sample data, the median is 5.5.
Mode
• The mode is the most frequently occurring value in the dataset. It is
mostly used for removing the most frequently occurring words in NLP and
for imputing missing values during data preprocessing.
# Importing packages
import statistics
from scipy import stats

# Sample data
data = [1, 2, 3, 4, 4, 5, 6, 7, 7, 7, 7, 6, 6, 4, 3, 2, 1]

# Using the SciPy package
output = stats.mode(data)
print(f"The Number {output.mode} occurred {output.count} times")

# Using the statistics package
print(statistics.mode(data))   # 7

# SciPy output: The Number 7 occurred 4 times
# (older SciPy versions return arrays, so the same line prints [7] and [4])
• Using the mean, median, and mode, we can see whether a distribution is
skewed or not (left-skewed or right-skewed).
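A rough way to apply this idea is to compare the mean with the median: a mean well above the median suggests a right skew, and a mean well below it suggests a left skew. The sketch below uses the sample data from the mean and median examples. The comparison is a rule of thumb, not a formal skewness test, and the labels are illustrative.

```python
import statistics

data = [1, 2, 4, 5, 6, 76, 8, 45]

mean = statistics.mean(data)
median = statistics.median(data)

# Rule of thumb: a few very large values pull the mean above the median
if mean > median:
    shape = "right-skewed"
elif mean < median:
    shape = "left-skewed"
else:
    shape = "roughly symmetric"

print(mean, median, shape)  # 18.375 5.5 right-skewed
```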
Measure of Variance/Dispersion
Population & Sample
• Population: The population is the entire group that you are taking for
analysis or prediction.
• Sample: A sample is a subset of the population (i.e., random samples
taken from the population). The size of the sample is always less than
the total size of the population.
Measure of Variance/Dispersion
• A measure of variability is a descriptive statistic that
represents the amount of dispersion in a dataset.
• While a measure of central tendency describes the typical value, a measure
of variability describes how far away the data points tend to fall from
the center.
Range
• Range is the difference between the largest and smallest values in a
dataset. It is one of the measures of dispersion/variability.
# Sample data
data = [4, 6, 9, 3, 7]

# Named data_range to avoid shadowing Python's built-in range()
data_range = max(data) - min(data)
print("Maximum Value : ", max(data))
print("Minimum Value : ", min(data))
print("Range : ", data_range)

# Output:
# Maximum Value :  9
# Minimum Value :  3
# Range :  6

The range can sometimes be misleading when there are extremely high or low values.
Example: In {8, 11, 5, 9, 7, 6, 2500}:
• the lowest value is 5,
• and the highest is 2500,
so the range is 2500 − 5 = 2495.
In such cases, we may be better off using the Interquartile Range or the Standard Deviation.
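To illustrate the difference, the sketch below computes both the range and the interquartile range (IQR = Q3 − Q1) for the example data. It uses `statistics.quantiles` with the "inclusive" method, which is one of several quartile conventions; other conventions give slightly different quartiles, so treat the exact IQR as dependent on that choice.

```python
import statistics

data = [8, 11, 5, 9, 7, 6, 2500]

# Range: dominated by the single extreme value 2500
value_range = max(data) - min(data)

# Quartiles via the "inclusive" method (an assumption of this sketch)
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(value_range)  # 2495
print(q1, q3, iqr)  # 6.5 10.0 3.5
```

The IQR ignores the top and bottom quarters of the data, so the outlier 2500 has no effect on it.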
Variance
• Variance is one of the measures of dispersion/variability. It describes how far
the data points vary from the measure of central tendency (the mean).

Population Variance
The variance computed over the entire population data is known as the
population variance:
σ² = Σ(xi − μ)² / N, where μ is the population mean and N is the population size.

Sample Variance
The variance computed over sample data is known as the sample variance:
s² = Σ(xi − x̄)² / (n − 1), where x̄ is the sample mean and n is the sample size.
Why is the numerator squared in the variance?
If the terms were not squared, the positive and negative deviations
would cancel each other, and their sum would always be zero.
To avoid this, we square the deviations so that every term is
non-negative.
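The sketch below shows the cancellation and both variance formulas on the small dataset used in the range example. It assumes the standard definitions (divide by N for a population, by n − 1 for a sample) and cross-checks against Python's `statistics` module.

```python
import statistics

data = [4, 6, 9, 3, 7]
n = len(data)
mean = sum(data) / n

# Without squaring, positive and negative deviations cancel out
sum_of_deviations = sum(x - mean for x in data)

# With squaring, every term is non-negative
squared_deviations = [(x - mean) ** 2 for x in data]
population_variance = sum(squared_deviations) / n        # divide by N
sample_variance = sum(squared_deviations) / (n - 1)      # divide by n - 1

print(round(sum_of_deviations, 10))    # 0.0 (up to floating-point error)
print(round(population_variance, 4))   # 4.56
print(round(sample_variance, 4))       # 5.7

# Cross-check with the standard library
print(statistics.pvariance(data), statistics.variance(data))
```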
Standard Deviation
Standard Deviation denotes how much the data points deviate
from the measure of central tendency. The square root of the
variance is the standard deviation.
Population Standard Deviation
The standard deviation computed over population data is known
as the population standard deviation.
data = {4, 6, 9, 3, 7}
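A minimal sketch of the population standard deviation for this data follows, assuming the usual definition (the square root of the population variance). The data is stored in a list rather than a set so that repeated values, if any, would not be dropped. `statistics.pstdev` is used only as a cross-check.

```python
import math
import statistics

data = [4, 6, 9, 3, 7]

mean = sum(data) / len(data)
population_variance = sum((x - mean) ** 2 for x in data) / len(data)

# Standard deviation is the square root of the variance
population_std = math.sqrt(population_variance)

print(round(population_std, 4))            # 2.1354
print(round(statistics.pstdev(data), 4))   # 2.1354
```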
Random Variables

• A random variable is a real-valued function whose domain is the
sample space of the random experiment.
• It means that each outcome of a random experiment is associated
with a single real number, and the single real number may vary with
the different outcomes of a random experiment.
• Hence, it is called a random variable, and it is generally represented
by the letter “X”.
For example, let us consider an experiment for tossing a coin two times.
Hence, the sample space for this experiment is S = {HH, HT, TH, TT}.
If X is a random variable and it denotes the number of heads obtained,
then the values are represented as follows:
X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.
Similarly, we can define the number of tails obtained using another
variable, say Y.
(i.e.) Y(HH) = 0, Y(HT) = 1, Y(TH) = 1, Y(TT)= 2.
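The mapping from outcomes to numbers can be written directly as Python functions. The sketch below builds the sample space for two coin tosses, applies X (number of heads) and Y (number of tails), and derives the distribution of X under the assumption of a fair coin. The function and variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for tossing a coin two times: HH, HT, TH, TT
S = ["".join(flips) for flips in product("HT", repeat=2)]

def X(outcome):
    """Random variable X: number of heads in the outcome."""
    return outcome.count("H")

def Y(outcome):
    """Random variable Y: number of tails in the outcome."""
    return outcome.count("T")

x_values = {o: X(o) for o in S}
y_values = {o: Y(o) for o in S}

# Distribution of X, assuming all four outcomes are equally likely
counts = Counter(X(o) for o in S)
distribution = {value: Fraction(count, len(S)) for value, count in sorted(counts.items())}

print(x_values)      # {'HH': 2, 'HT': 1, 'TH': 1, 'TT': 0}
print(y_values)      # {'HH': 0, 'HT': 1, 'TH': 1, 'TT': 2}
print(distribution)  # {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}
```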
• There are two basic types of random variables:
• Discrete Random Variables (which take on specific values).
• Continuous Random Variables (assume any value within a given range).
• We define a random variable as a function that maps from the sample
space of an experiment to the real numbers.
• Mathematically, Random Variable is expressed as:
X: S →R
where,
X is Random Variable (It is usually denoted using capital letter)
S is Sample Space
R is Set of Real Numbers
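Suppose a random variable X takes m different values, x1, x2, ..., xm, with corresponding
probabilities P(X = xi) = pi, where 1 ≤ i ≤ m.
The probabilities must satisfy the following conditions:
0 ≤ pi ≤ 1, where 1 ≤ i ≤ m
p1 + p2 + ... + pm = 1
or, we can say, 0 ≤ pi ≤ 1 and ∑pi = 1.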

Suppose a die is thrown (X = outcome of the die).
Here, the sample space is S = {1, 2, 3, 4, 5, 6}.
The probabilities of the outcomes are:
P(X = 1) = 1/6
P(X = 2) = 1/6
P(X = 3) = 1/6
P(X = 4) = 1/6
P(X = 5) = 1/6
P(X = 6) = 1/6
This also satisfies the condition that the probabilities sum to 1, since
1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1.
Types of Random Variables
• Discrete Random Variable
• Continuous Random Variable
Discrete Random Variable
• A discrete random variable takes on a finite or countably infinite number
of values. The probability function associated with it is called the
Probability Mass Function (PMF).

If X is a discrete random variable and the PMF of X is P(X = xi) = pi, then
0 ≤ pi ≤ 1
∑pi = 1, where the sum is taken over all possible values of x


Example: Let S = {0, 1, 2}
xi           0     1     2
P(X = xi)    P1    0.3   0.5

Find the value of P(X = 0).

We know that the sum of all probabilities is equal to 1. Let P(X = 0) be P1.
P1 + 0.3 + 0.5 = 1
P1 = 0.2
Then, P(X = 0) is 0.2.
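The same calculation can be sketched in code: the missing probability is whatever remains after the known ones are subtracted from 1, and the completed table can then be checked against the two PMF conditions. Exact fractions are used to avoid floating-point rounding; the variable names are illustrative.

```python
from fractions import Fraction

# Known probabilities from the table: P(X = 1) = 0.3, P(X = 2) = 0.5
known = {1: Fraction(3, 10), 2: Fraction(5, 10)}

# All probabilities sum to 1, so P(X = 0) is the remainder
p0 = 1 - sum(known.values())
pmf = {0: p0, **known}

# Check the two PMF conditions
all_in_range = all(0 <= p <= 1 for p in pmf.values())
sums_to_one = sum(pmf.values()) == 1

print(p0)                         # 1/5
print(all_in_range, sums_to_one)  # True True
```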
Continuous Random Variable
• A continuous random variable takes on an uncountably infinite number of values
(any value within a range). The probability function associated with it is called
the PDF (Probability Density Function).

PDF (Probability Density Function)

If X is a continuous random variable with function f(x), then:
• f(x) ≥ 0 for all x
• ∫f(x) dx = 1 over all values of x

Then f(x) is said to be a PDF of the distribution.


Find the value of P(1 < X < 2), such that f(x) = kx^3 for 0 ≤ x ≤ 3 and f(x) = 0
otherwise, where f(x) is a density function.
If a function f is a density function, then the total probability is
equal to 1.
Since X is a continuous random variable, the integral of f over the whole sample
space is 1.
∫f(x) dx = 1; ∫kx^3 dx = 1; k[x^4/4] = 1
Over the given interval, 0 ≤ x ≤ 3:
k[3^4 – 0^4]/4 = 1; k(81/4) = 1; k = 4/81
Thus,
P(1 < X < 2) = k × [x^4/4] evaluated from 1 to 2 = 4/81 × [16 – 1]/4 = 15/81 = 5/27
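The arithmetic can be double-checked with exact fractions, using the antiderivative x^4/4 of x^3. This is a small sketch for this specific density only; `integral_x_cubed` is a hypothetical helper, not a general integrator.

```python
from fractions import Fraction

def integral_x_cubed(a, b):
    """Exact integral of x^3 from a to b, using the antiderivative x^4 / 4."""
    return Fraction(b**4 - a**4, 4)

# Normalization: k * (integral of x^3 over [0, 3]) must equal 1
k = 1 / integral_x_cubed(0, 3)

# P(1 < X < 2) = k * (integral of x^3 over [1, 2])
p = k * integral_x_cubed(1, 2)

print(k)  # 4/81
print(p)  # 5/27
```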
Probability Distributions
• Probability distributions describe what we think the probability of
each outcome is, which is sometimes more interesting to know than
simply which single outcome is most likely.
• The probabilities in a distribution always add up to 1.
Example: flipping a fair coin has two outcomes: it lands heads or tails.
Before the flip, we believe there’s a 1 in 2 chance, or 0.5 probability, of
heads. The same is true for tails. That’s a probability distribution over
the two outcomes of the flip.
This is an example of Bernoulli Distribution.
Types of Distributions
There are two major classes of probability distributions:
a) Discrete; b) Continuous

Discrete:
1. Bernoulli Distribution
2. Uniform Distribution
3. Binomial Distribution
4. Poisson Distribution
Continuous:
5. Normal or Gaussian Distribution
6. Exponential Distribution
Binomial
• The binomial distribution gives the probability of a given number of successes
in a fixed number of independent trials, where each trial has only two mutually
exclusive outcomes (success or failure).

P(x:n,p) = nCx p^x (1-p)^(n-x) or P(x:n,p) = nCx p^x q^(n-x)

Where,
n = the number of experiments (trials)
x = the number of successes = 0, 1, 2, ..., n
p = Probability of success in a single experiment
q = Probability of failure in a single experiment = 1 – p


Example 1: If a coin is tossed 5 times, find the probability of:
(a) Exactly 2 heads; (b) At least 4 heads.

a) Number of trials: n = 5
Probability of head: p = 1/2 and hence the probability of tail, q = 1/2
For exactly two heads: x = 2
P(x = 2) = 5C2 p^2 q^(5-2) = 5!/(2! 3!) × (1/2)^2 × (1/2)^3
P(x = 2) = 10/32 = 5/16

b) For at least four heads, x ≥ 4:
P(x ≥ 4) = P(x = 4) + P(x = 5)
P(x = 4) = 5C4 p^4 q^(5-4) = 5!/(4! 1!) × (1/2)^4 × (1/2)^1 = 5/32
P(x = 5) = 5C5 p^5 q^(5-5) = (1/2)^5 = 1/32
Therefore, P(x ≥ 4) = 5/32 + 1/32 = 6/32 = 3/16
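The coin example can be reproduced with a short function for the binomial formula nCx p^x (1−p)^(n−x). The sketch uses `math.comb` for nCx and exact fractions for p; the function name `binomial_pmf` is chosen here for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 5, Fraction(1, 2)

# (a) Exactly 2 heads
p_exactly_2 = binomial_pmf(2, n, p)

# (b) At least 4 heads: P(x = 4) + P(x = 5)
p_at_least_4 = sum(binomial_pmf(x, n, p) for x in range(4, n + 1))

print(p_exactly_2)   # 5/16
print(p_at_least_4)  # 3/16
```

Summing `binomial_pmf(x, n, p)` over all x from 0 to n gives 1, which is a quick sanity check on the formula.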
