Probability
UNIT III
Unit 3: Probability                               COs
A  Probability; Conditional Probability           CO3, CO6
B  Mean, Median, Mode and Standard Deviation      CO3, CO6
C  Random Variables; Distributions                CO3, CO6
Probability
• Probability is defined as the likelihood of the occurrence of any event.
• Probability is expressed as a number between 0 and 1, where:
• 0 is the probability of an impossible event and 1 is the probability of a sure
event.
Sample Space (Ω/S)
• A sample space is the set of all possible outcomes of an experiment
or a random phenomenon. It is denoted by the symbol “S” (or “Ω”).
• E.g. when flipping a coin, the sample space is {heads, tails}, because those are
the only two possible outcomes.
• Similarly, when rolling a six-sided die, the sample space is {1, 2, 3, 4, 5, 6},
because those are the only possible outcomes.
Event
• An event can be defined as any outcome or set of outcomes from a
random experiment.
• An event in probability is a subset of the corresponding sample space.
Example:
1. If you roll a die, the event could be “getting a 3” or “getting an even
number.”
2. If you toss two coins simultaneously, the event could be
“getting at least 1 head” or “getting two tails”.
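The two-coin example can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the variable names are assumptions, and each outcome is written as an ordered pair of faces.

```python
from itertools import product

# Sample space for tossing two coins: every ordered pair of faces
sample_space = set(product(["H", "T"], repeat=2))

# Events are subsets of the sample space
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}
two_tails = {outcome for outcome in sample_space if outcome == ("T", "T")}

print(len(sample_space))                   # number of possible outcomes
print(sorted(at_least_one_head))           # outcomes in the first event
print(at_least_one_head <= sample_space)   # subset check
```

The `<=` operator on sets tests for a subset, which mirrors the statement that every event is a subset of the sample space.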
Probability of an Event
• The probability of an event E, denoted by P(E), is a number between 0 and 1
that represents the likelihood of Event E occurring.
• The sum of the probabilities of all outcomes (elementary events) in a sample space is always equal to 1.
• In a die-rolling experiment:
Possible outcomes: {1, 2, 3, 4, 5, 6}
Then P(1) + P(2) + P(3) + P(4) + P(5) + P(6) = 1
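As a quick check of the rule that outcome probabilities sum to 1, the sketch below assumes a fair die (each face has probability 1/6, an assumption not stated on the slide) and computes the probability of two example events. The helper name `probability` is illustrative.

```python
from fractions import Fraction

# Assumption: a fair six-sided die, so each outcome has probability 1/6
outcomes = [1, 2, 3, 4, 5, 6]
prob = {o: Fraction(1, len(outcomes)) for o in outcomes}

def probability(event):
    """P(E) is the sum of the probabilities of the outcomes in E."""
    return sum(prob[o] for o in event)

total = sum(prob.values())          # P(1) + P(2) + ... + P(6)
p_three = probability({3})          # event "getting a 3"
p_even = probability({2, 4, 6})     # event "getting an even number"

print(total, p_three, p_even)
```

Using `Fraction` keeps the arithmetic exact, so the total is exactly 1 rather than a floating-point approximation.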
Types of Events
Dependent and Independent Events:
• Dependent events are those in which the probability of an event
changes based on previous outcomes.
• Example 1: Drawing two cards from a deck without replacement.
If you draw one card and do not replace it, the total number of cards in the deck
changes. The probability of drawing a specific card on the second draw is
affected by the outcome of the first draw, hence they are dependent events.
• Example 2: Picking a marble from a bag, not replacing it, and then picking
another marble.
If the first marble is not replaced, the total number of marbles changes, which
influences the probability of picking the second marble. Hence, the events are
dependent.
• Independent events are those in which the probability of an event
remains the same, regardless of previous outcomes. For independent events A and B:
P(B|A) = P(B)
P(A|B) = P(A)
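The difference between the two cases can be seen numerically. The sketch below picks one specific case of the card example, drawing two aces from a standard 52-card deck; the choice of aces is an assumption made for illustration.

```python
from fractions import Fraction

# Dependent: two aces drawn WITHOUT replacement
p_first_ace = Fraction(4, 52)
p_second_ace_given_first = Fraction(3, 51)   # one ace and one card are gone
p_two_aces_dependent = p_first_ace * p_second_ace_given_first

# Independent: the first card is put back, so P(B|A) = P(B)
p_second_ace_with_replacement = Fraction(4, 52)
p_two_aces_independent = p_first_ace * p_second_ace_with_replacement

print(p_two_aces_dependent)     # 1/221
print(p_two_aces_independent)   # 1/169
```

Without replacement the second probability changes from 4/52 to 3/51, which is exactly what makes the two draws dependent.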
Conditional Probability and Bayes’ Theorem
• Bayes’ Theorem provides a mathematical framework for updating beliefs or
hypotheses in light of new evidence or information. It is
extensively used in various fields, including statistics, machine
learning, and artificial intelligence.
• At its core, Bayes’ Theorem enables us to calculate the probability of a
hypothesis H being true given observed evidence E. The theorem is
expressed mathematically as follows:
P(H|E) = P(E|H) × P(H) / P(E)
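A small numeric sketch may help here. It applies Bayes’ Theorem in the form P(H|E) = P(E|H) × P(H) / P(E), with P(E) expanded by the law of total probability. The function name `bayes` and all the numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are hypothetical values chosen only for illustration.

```python
from fractions import Fraction

def bayes(p_e_given_h, p_h, p_e_given_not_h):
    """Return P(H|E) = P(E|H) * P(H) / P(E).

    P(E) is computed as P(E|H) P(H) + P(E|not H) P(not H).
    """
    p_e = p_e_given_h * p_h + p_e_given_not_h * (1 - p_h)
    return p_e_given_h * p_h / p_e

# Hypothetical inputs: prior P(H) = 1%, P(E|H) = 95%, P(E|not H) = 5%
posterior = bayes(Fraction(95, 100), Fraction(1, 100), Fraction(5, 100))

print(posterior)          # exact value as a fraction
print(float(posterior))   # the same value as a decimal
```

With these assumed inputs the posterior stays well below the 95% true-positive rate, because the low prior outweighs the evidence.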
Mean, Median, Mode, Standard Deviation
• Statistics is a branch of mathematics that transforms data into
useful insights for decision makers.
• It provides tools and methods to find structure in data, to gain deeper
insight into it, and to analyze and quantify uncertainty.
• It is the backbone of hypothesis testing, machine learning, deep
learning, and related concepts.
Basic Statistics
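Mean
• Mean is the average of the data: the sum of all values divided by the number of values.
import statistics
import numpy as np
# Assumed sample data (the same list used in the Median example)
data = [1, 2, 4, 5, 6, 76, 8, 45]
# Using the formula without Python packages
mean = sum(data) / len(data)
print(mean)
# output of the numpy package
print(np.mean(data))
# output of the statistics package
print(statistics.mean(data))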
Median
• Median gives the middle value of the sorted data. It is mostly used for
outlier detection/removal and for imputing missing values during data
preprocessing.
import statistics
import numpy as np
data = [1, 2, 4, 5, 6, 76, 8, 45]
# Using the formula without Python packages
"""
With the sorted data indexed from 0 and n = number of elements:
If n is odd  ----> median = value at position n // 2
If n is even ----> median = average of the values at positions
                   m1 = n // 2 and m2 = (n // 2) - 1
Integer division (//) is used because a list position must be an integer;
indexing a list with a float raises an error.
"""
sorted_data_median = sorted(data)
n = len(sorted_data_median)
print("Sorted Data:", sorted_data_median)
m1 = n // 2
if n % 2 == 1:
    # Odd number of elements: the single middle value
    median = sorted_data_median[m1]
else:
    # Even number of elements: average of the two middle values
    m2 = m1 - 1
    print(f"Positions of the data: {m1} and {m2}")
    print(f"Values in the positions: {sorted_data_median[m1]} and {sorted_data_median[m2]}")
    median = (sorted_data_median[m1] + sorted_data_median[m2]) / 2
print("Median:", median)
# output of the numpy package
print(np.median(data))
# output of the statistics package
print(statistics.median(data))
Output:
Median: 5.5
Mode
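The mode is the value that occurs most often in the data. The sketch below shows one way to find it in Python; the data list is hypothetical (chosen so that one value repeats), and counting with `Counter` is just one possible approach next to the built-in `statistics.mode`.

```python
import statistics
from collections import Counter

# Hypothetical data containing repeated values
data = [2, 4, 4, 5, 7, 4, 9, 7]

# Count how often each value occurs and take the most frequent one
counts = Counter(data)
mode_value, frequency = counts.most_common(1)[0]
print(mode_value, frequency)

# output of the statistics package
print(statistics.mode(data))
```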
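Population Variance
The variance computed over the entire population is known as the population
variance:
σ² = Σ (xᵢ − μ)² / N, where μ is the population mean and N is the number of values in the population.
Sample Variance
The variance computed over a sample drawn from the population is known
as the sample variance:
s² = Σ (xᵢ − x̄)² / (n − 1), where x̄ is the sample mean and n is the sample size.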
Why are the terms in the numerator squared in variance?
If the deviations from the mean were not squared, the positive
and negative deviations would cancel each other and
their sum would always be zero.
To avoid this, each deviation is squared, so
every term becomes non-negative.
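The cancellation argument can be checked directly. The sketch below uses the values 4, 6, 9, 3, 7 from the standard deviation example and compares a hand-written calculation with the `statistics` module. The variable names are illustrative, and treating the same five values as both a population and a sample is done only to show the two divisors.

```python
import statistics

data = [4, 6, 9, 3, 7]
n = len(data)
mean = sum(data) / n

# Unsquared deviations cancel out: their sum is zero (up to float rounding)
deviations = [x - mean for x in data]
sum_of_deviations = sum(deviations)

# Squared deviations do not cancel
sum_of_squares = sum(d ** 2 for d in deviations)
population_variance = sum_of_squares / n        # divide by N
sample_variance = sum_of_squares / (n - 1)      # divide by n - 1

print(abs(sum_of_deviations) < 1e-9)
print(population_variance, statistics.pvariance(data))
print(sample_variance, statistics.variance(data))
```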
Standard Deviation
Standard deviation measures how far the data points deviate
from the mean. It is the square root of the
variance.
Population Standard Deviation
The standard deviation computed over the entire population is known
as the population standard deviation:
σ = √( Σ (xᵢ − μ)² / N )
data = {4, 6, 9, 3, 7}
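For the data above, the population standard deviation can be computed as in the sketch below. The data is written as a list rather than with braces, because a Python set would silently drop repeated values; the variable names are illustrative.

```python
import math
import statistics

data = [4, 6, 9, 3, 7]   # list form of the slide data
mu = sum(data) / len(data)

# Population standard deviation: square root of the population variance
population_variance = sum((x - mu) ** 2 for x in data) / len(data)
population_std = math.sqrt(population_variance)

print(round(population_std, 4))
# output of the statistics package
print(round(statistics.pstdev(data), 4))
```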
Random Variables
1. Bernoulli Distribution (discrete)
2. Uniform Distribution (discrete or continuous)
3. Binomial Distribution (discrete)
4. Normal or Gaussian Distribution (continuous)
5. Exponential Distribution (continuous)
6. Poisson Distribution (discrete)
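Binomial
• It gives the probability of x successes in n independent trials when each
trial has only two mutually exclusive outcomes (success or failure):
P(x) = nCx × p^x × q^(n−x)
Where,
p = probability of success, q = 1 − p, and x = 0, 1, 2, 3, 4, …, n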
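With n = 5 and p = q = ½:
For exactly two heads: x = 2
P(x = 2) = 5C2 × p^2 × q^(5−2) = 5!/(2! 3!) × (½)^2 × (½)^3 = 10/32 = 5/16
For exactly four heads: x = 4
P(x = 4) = 5C4 × p^4 × q^(5−4) = 5!/(4! 1!) × (½)^4 × (½)^1 = 5/32
For exactly five heads: x = 5
P(x = 5) = 5C5 × p^5 × q^(5−5) = (½)^5 = 1/32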
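The worked values can be verified with a short sketch. It assumes n = 5 trials with p = ½, as the calculations above imply; the function name `binomial_pmf` is illustrative, and exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(x) = nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p ** x * q ** (n - x)

n, p = 5, Fraction(1, 2)

for x in (2, 4, 5):
    print(f"P(x = {x}) =", binomial_pmf(x, n, p))

# The probabilities over x = 0..n sum to 1
total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
print(total)
```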