Calculating Standard Deviation Step by Step

A frequency distribution describes how often each value of a variable occurs in a dataset by showing the number of observations for each possible value. There are four types of frequency distributions: ungrouped, grouped, relative, and cumulative. Frequency distributions are often displayed using frequency tables or graphs like pie charts, bar charts, and histograms. The type of graph used depends on whether the variable is categorical or quantitative.

Uploaded by Bayissa Bekele
Copyright: © All Rights Reserved

Frequency Distribution | Tables, Types & Examples
Published on June 7, 2022 by Shaun Turney. Revised on June 21, 2023.

A frequency distribution describes the number of observations for each possible value
of a variable. Frequency distributions are depicted using graphs and frequency tables.

Example: Frequency distribution
In the 2022 Winter Olympics, Team USA won 25 medals. This frequency table gives the
medals’ values (gold, silver, and bronze) and frequencies:

What is a frequency distribution?


The frequency of a value is the number of times it occurs in a dataset. A frequency
distribution is the pattern of frequencies of a variable. It’s the number of times each
possible value of a variable occurs in a dataset.

Types of frequency distributions


There are four types of frequency distributions:

• Ungrouped frequency distributions: The number of observations of each value of a
  variable.
    o You can use this type of frequency distribution for categorical variables.
• Grouped frequency distributions: The number of observations of each class
  interval of a variable. Class intervals are ordered groupings of a variable’s
  values.
    o You can use this type of frequency distribution for quantitative variables.
• Relative frequency distributions: The proportion of observations of each value
  or class interval of a variable.
    o You can use this type of frequency distribution for any type of
      variable when you’re more interested in comparing frequencies than the
      actual number of observations.
• Cumulative frequency distributions: The sum of the frequencies less than or
  equal to each value or class interval of a variable.
    o You can use this type of frequency distribution for ordinal or quantitative
      variables when you want to understand how often observations fall
      below certain values.

How to make a frequency table


Frequency distributions are often displayed using frequency tables. A frequency table
is an effective way to summarize or organize a dataset. It’s usually composed of two
columns:

• The values or class intervals
• Their frequencies

The method for making a frequency table differs between the four types of frequency
distributions. You can follow the guides below or use software such as Excel, SPSS, or
R to make a frequency table.

How to make an ungrouped frequency table

1. Create a table with two columns and as many rows as there are values of the
variable. Label the first column using the variable name and label the second
column “Frequency.” Enter the values in the first column.
o For ordinal variables, the values should be ordered from smallest to
largest in the table rows.
o For nominal variables, the values can be in any order in the table. You
may wish to order them alphabetically or in some other logical order.
2. Count the frequencies. The frequencies are the number of times each value
occurs. Enter the frequencies in the second column of the table beside their
corresponding values.
o Especially if your dataset is large, it may help to count the frequencies
by tallying. Add a third column called “Tally.” As you read the
observations, make a tick mark in the appropriate row of the tally column
for each observation. Count the tally marks to determine the frequency.
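
The counting step above can be sketched in a few lines of Python. This is only an illustration: the observations below are hypothetical (they are not the gardener's data from the example), and `collections.Counter` stands in for the tally column.

```python
from collections import Counter

# Hypothetical observations of a nominal variable (illustrative only).
observations = ["finch", "sparrow", "finch", "chickadee", "sparrow", "finch"]

# Counter increments a count per value as it reads each observation,
# which plays the role of the tick marks in a "Tally" column.
frequencies = Counter(observations)

# Nominal values can be listed in any order; alphabetical is used here.
print(f"{'Bird species':<15}{'Frequency':>9}")
for species in sorted(frequencies):
    print(f"{species:<15}{frequencies[species]:>9}")
```

For an ordinal variable, the rows would be sorted by the values' natural order rather than alphabetically.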

Example: Making an ungrouped frequency table
A gardener set up a bird feeder in their backyard. To help them decide how much and what
type of birdseed to buy, they decide to record the bird species that visit their feeder. Over
the course of one morning, the following birds visit their feeder:

How to make a grouped frequency table

1. Divide the variable into class intervals. Below is one method to divide a
variable into class intervals. Different methods will give different answers, but
there’s no agreement on the best method to calculate class intervals.
o Calculate the range. Subtract the lowest value in the dataset from the
highest.
o Decide the class interval width. There are no firm rules on how to
choose the width, but the following formula is a rule of thumb:

You can round this value to a whole number or a number that’s convenient
to add (such as a multiple of 10).

o Calculate the class intervals. Each interval is defined by a lower limit
and upper limit. Observations in a class interval are greater than or equal
to the lower limit and less than the upper limit:
lower limit ≤ x < upper limit
The lower limit of the first interval is the lowest value in the dataset. Add
the class interval width to find the upper limit of the first interval and the
lower limit of the second interval. Keep adding the interval width to
calculate more class intervals until you exceed the highest value.

2. Create a table with two columns and as many rows as there are class intervals.
Label the first column using the variable name and label the second column
“Frequency.” Enter the class intervals in the first column.
3. Count the frequencies. The frequencies are the number of observations in each
class interval. You can count by tallying if you find it helpful. Enter the
frequencies in the second column of the table beside their corresponding class
intervals.

Example: Grouped frequency distribution
A sociologist conducted a survey of 20 adults. She wants to report the frequency
distribution of the ages of the survey respondents. The respondents were the following
ages in years:
52, 34, 32, 29, 63, 40, 46, 54, 36, 36, 24, 19, 45, 20, 28, 29, 38, 33, 49, 37

Round the class interval width to 10.

The class intervals are 19 ≤ a < 29, 29 ≤ a < 39, 39 ≤ a < 49, 49 ≤ a < 59, and 59 ≤ a < 69.
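
The grouping steps can be sketched in Python using the survey ages above. One assumption is labeled in the code: the width rule of thumb is not shown in this text, so the sketch assumes range divided by the square root of the sample size, which is consistent with the rounded width of 10 used in the example.

```python
import math

ages = [52, 34, 32, 29, 63, 40, 46, 54, 36, 36,
        24, 19, 45, 20, 28, 29, 38, 33, 49, 37]

# Range: highest value minus lowest value.
value_range = max(ages) - min(ages)

# ASSUMPTION (formula not shown in the text): width = range / sqrt(sample size).
raw_width = value_range / math.sqrt(len(ages))
width = 10  # rounded to a convenient number, as in the example

# Build intervals [lower, upper), starting at the lowest value and
# adding the width until the highest value is covered.
intervals = []
lower = min(ages)
while lower <= max(ages):
    intervals.append((lower, lower + width))
    lower += width

# Count the observations with lower <= a < upper in each interval.
grouped = {
    (lo, hi): sum(1 for a in ages if lo <= a < hi)
    for lo, hi in intervals
}

for (lo, hi), freq in grouped.items():
    print(f"{lo} <= a < {hi}: {freq}")
```

The half-open intervals ensure that an age on a boundary, such as 29, is counted in exactly one class.
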
How to make a relative frequency table

1. Create an ungrouped or grouped frequency table.


2. Add a third column to the table for the relative frequencies. To calculate the
relative frequencies, divide each frequency by the sample size. The sample size
is the sum of the frequencies.

Example: Relative frequency distribution


From this table, the gardener can make observations, such as that 19% of the bird feeder visits
were from chickadees and 25% were from finches.
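
As a minimal sketch of the same calculation, the Python below divides each frequency by the sample size. It uses the frequencies obtained by counting the sociologist's survey ages in the class intervals from the grouped example, not the gardener's data; the variable names are illustrative.

```python
# Frequencies counted from the survey ages and class intervals
# in the grouped frequency example.
frequencies = {
    "19 <= a < 29": 4,
    "29 <= a < 39": 9,
    "39 <= a < 49": 3,
    "49 <= a < 59": 3,
    "59 <= a < 69": 1,
}

# The sample size is the sum of the frequencies.
sample_size = sum(frequencies.values())

# Relative frequency = frequency / sample size.
relative = {interval: freq / sample_size for interval, freq in frequencies.items()}

for interval, rel in relative.items():
    print(f"{interval}: {frequencies[interval]:>2}  {rel:.2f}")
```

The relative frequencies should sum to 1, which is a quick check that no observation was missed.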

How to make a cumulative frequency table

1. Create an ungrouped or grouped frequency table for an ordinal or quantitative
variable. Cumulative frequencies don’t make sense for nominal variables
because the values have no order: one value isn’t more than or less than
another value.
2. Add a third column to the table for the cumulative frequencies. The
cumulative frequency is the number of observations less than or equal to a
certain value or class interval. To calculate the cumulative frequencies, add each
frequency to the frequencies in the previous rows.
3. Optional: If you want to calculate the cumulative relative frequency, add
another column and divide each cumulative frequency by the sample size.

Example: Cumulative frequency distribution

From this table, the sociologist can make observations such as 13 respondents (65%) were under
39 years old, and 16 respondents (80%) were under 49 years old.
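
The running totals behind these observations can be sketched with `itertools.accumulate`. The frequencies are those obtained by counting the survey ages in the class intervals from the grouped example; the variable names are illustrative.

```python
from itertools import accumulate

# Class intervals in ascending order, with their frequencies.
intervals = ["19 <= a < 29", "29 <= a < 39", "39 <= a < 49",
             "49 <= a < 59", "59 <= a < 69"]
frequencies = [4, 9, 3, 3, 1]

# Each cumulative frequency adds the current frequency to all previous rows.
cumulative = list(accumulate(frequencies))

# Optional: cumulative relative frequency = cumulative frequency / sample size.
sample_size = cumulative[-1]
cumulative_relative = [c / sample_size for c in cumulative]

for interval, c, cr in zip(intervals, cumulative, cumulative_relative):
    print(f"{interval}: {c:>2}  {cr:.0%}")
```

The last cumulative frequency always equals the sample size, so the last cumulative relative frequency is 100%.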

How to graph a frequency distribution


Pie charts, bar charts, and histograms are all ways of graphing frequency distributions.
The best choice depends on the type of variable and what you’re trying to communicate.

Pie chart
A pie chart is a graph that shows the relative frequency distribution of a nominal
variable.

A pie chart is a circle that’s divided into one slice for each value. The size of the slices
shows their relative frequency.

This type of graph can be a good choice when you want to emphasize that one value
is especially frequent or infrequent, or you want to present the overall composition of a
variable.

A disadvantage of pie charts is that it’s difficult to see small differences between
frequencies. As a result, it’s also not a good option if you want to compare the
frequencies of different values.

Bar chart
A bar chart is a graph that shows the frequency or relative frequency distribution of
a categorical variable (nominal or ordinal).

The y-axis shows the frequencies or relative frequencies, and the x-axis
shows the values. Each value is represented by a bar, and the length or height of the
bar shows the frequency of the value.

A bar chart is a good choice when you want to compare the frequencies of different
values. It’s much easier to compare the heights of bars than the angles of pie chart
slices.

Histogram
A histogram is a graph that shows the frequency or relative frequency distribution of
a quantitative variable. It looks similar to a bar chart.

The variable is grouped into interval classes, just like a grouped frequency
table. The y-axis shows the frequencies or relative frequencies, and the x-axis
shows the interval classes. Each interval class is represented by a bar, and the
height of the bar shows the frequency or relative frequency of the interval class.

Although bar charts and histograms are similar, there are important differences:

                   Bar chart                      Histogram
Type of variable   Categorical                    Quantitative
Value grouping     Ungrouped (values)             Grouped (interval classes)
Bar spacing        Can be a space between bars    Never a space between bars
Bar order          Can be in any order            Can only be ordered from lowest to highest

A histogram is an effective visual summary of several important characteristics of a
variable. At a glance, you can see a variable’s central tendency and variability, as well
as what probability distribution it appears to follow, such as a normal, Poisson, or
uniform distribution.
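
To make the idea concrete without a plotting library, here is a rough text-only sketch that draws one bar per interval class, in order from lowest to highest. The helper `text_histogram` is hypothetical, the bars run horizontally rather than vertically, and the frequencies are those counted from the survey ages in the grouped example. A real histogram would normally be drawn with graphing software.

```python
def text_histogram(labels, counts, symbol="#"):
    """Return one text line per interval class; bar length equals the frequency."""
    lines = []
    for label, count in zip(labels, counts):
        lines.append(f"{label:<14}{symbol * count} ({count})")
    return lines

# Interval classes in ascending order with their frequencies.
intervals = ["19 <= a < 29", "29 <= a < 39", "39 <= a < 49",
             "49 <= a < 59", "59 <= a < 69"]
frequencies = [4, 9, 3, 3, 1]

lines = text_histogram(intervals, frequencies)
print("\n".join(lines))
```

Even in this crude form, the longest bar shows where the ages cluster and the short last bar shows the thin upper tail.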

© 2023 Khan Academy
Calculating standard deviation step by step


Introduction
In this article, we'll learn how to calculate standard deviation "by
hand".

Interestingly, in the real world no statistician would ever calculate
standard deviation by hand. The calculations involved are somewhat
complex, and the risk of making a mistake is high. Also, calculating by
hand is slow. Very slow. This is why statisticians rely on spreadsheets
and computer programs to crunch their numbers.

So what's the point of this article? Why are we taking time to learn a
process statisticians don't actually use? The answer is that learning to
do the calculations by hand will give us insight into how standard
deviation really works. This insight is valuable. Instead of viewing
standard deviation as some magical number our spreadsheet or
computer program gives us, we'll be able to explain where that
number comes from.

Overview of how to calculate standard deviation


The formula for standard deviation (SD) is

$$\text{SD} = \sqrt{\frac{\sum |x - \mu|^2}{N}}$$

where $\sum$ means "sum of", $x$ is a value in the data set, $\mu$ is the mean
of the data set, and $N$ is the number of data points in the population.

The standard deviation formula may look confusing, but it will make
sense after we break it down. In the coming sections, we'll walk
through a step-by-step interactive example. Here's a quick preview of
the steps we're about to follow:

Step 1: Find the mean.

Step 2: For each data point, find the square of its distance to the
mean.

Step 3: Sum the values from Step 2.

Step 4: Divide by the number of data points.

Step 5: Take the square root.
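
As a rough sketch, the five steps map directly onto a short Python function. The function name `population_sd` and the data set below are illustrative choices, not part of the article.

```python
import math

def population_sd(data):
    """Population standard deviation, computed in the five steps listed above."""
    n = len(data)
    mean = sum(data) / n                                   # Step 1: find the mean
    squared_distances = [abs(x - mean) ** 2 for x in data]  # Step 2: squared distance to the mean
    total = sum(squared_distances)                         # Step 3: sum the values from Step 2
    variance = total / n                                   # Step 4: divide by the number of data points
    return math.sqrt(variance)                             # Step 5: take the square root

# Illustrative data set: the mean is 5 and the squared distances sum to 32.
example = [2, 4, 4, 4, 5, 5, 7, 9]
print(population_sd(example))
```

Here 32 divided by 8 data points gives 4, and the square root of 4 is 2, so the function returns 2.0.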

An important note
The formula above is for finding the standard deviation of a
population. If you're dealing with a sample, you'll want to use a slightly
different formula (below), which uses $n - 1$ instead of $N$. The point of
this article, however, is to familiarize you with the process of computing
standard deviation, which is basically the same no matter which formula
you use.

$$\text{SD}_\text{sample} = \sqrt{\frac{\sum |x - \bar{x}|^2}{n - 1}}$$

Why are there two formulas?

That's a great question, but it is difficult to answer succinctly. We have
a lot of videos and simulations on this topic; it's fairly complex and
quite interesting.

If you want to learn about the distinction between population and
sample standard deviation, and why they're not calculated the same
way, you should head to the lesson on sample variance and standard
deviation.
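
To see how little the process changes between the two formulas, the sketch below uses one hypothetical helper, `sd`, whose only difference between the two cases is the divisor. The data set is illustrative. The results are cross-checked against Python's standard `statistics.pstdev` (population) and `statistics.stdev` (sample).

```python
import math
import statistics

def sd(values, sample=False):
    """Standard deviation; divides by n - 1 for a sample, by N for a population."""
    center = sum(values) / len(values)
    total = sum(abs(x - center) ** 2 for x in values)
    divisor = len(values) - 1 if sample else len(values)
    return math.sqrt(total / divisor)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative data set

population = sd(data)            # divides by N = 8
sample = sd(data, sample=True)   # divides by n - 1 = 7

print(population, statistics.pstdev(data))
print(sample, statistics.stdev(data))
```

Because the sample formula divides by a smaller number, it always gives a slightly larger result than the population formula for the same data.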

Onward!

Step-by-step interactive example for calculating standard deviation
First, we need a data set to work with. Let's pick something small so
we don't get overwhelmed by the number of data points. Here's a
good one:

6, 2, 3, 1

Step 1: Finding $\mu$ in $\sqrt{\frac{\sum |x - \mu|^2}{N}}$
In this step, we find the mean of the data set, which is represented by
the variable $\mu$.

$$\mu = \frac{6 + 2 + 3 + 1}{4} = \frac{12}{4} = 3$$

Step 2: Finding $|x - \mu|^2$ in $\sqrt{\frac{\sum |x - \mu|^2}{N}}$
In this step, we find the distance from each data point to the mean
(i.e., the deviations) and square each of those distances.
For example, the first data point is 6 and the mean is 3, so the
distance between them is 3. Squaring this distance gives us 9.

Data point $x$    Square of the distance from the mean $|x - \mu|^2$
6                 $|6 - 3|^2 = 3^2 = 9$
2                 $|2 - 3|^2 = 1^2 = 1$
3                 $|3 - 3|^2 = 0^2 = 0$
1                 $|1 - 3|^2 = 2^2 = 4$

Step 3: Finding $\sum |x - \mu|^2$ in $\sqrt{\frac{\sum |x - \mu|^2}{N}}$
The symbol $\sum$ means "sum", so in this step we add up the four
values we found in Step 2.

$$\sum |x - \mu|^2 = 9 + 1 + 0 + 4 = 14$$

Step 4: Finding $\frac{\sum |x - \mu|^2}{N}$ in $\sqrt{\frac{\sum |x - \mu|^2}{N}}$
In this step, we divide our result from Step 3 by the variable $N$,
which is the number of data points ($N = 4$).

$$\frac{\sum |x - \mu|^2}{N} = \frac{14}{4} = 3.5$$

Step 5: Finding the standard deviation $\sqrt{\frac{\sum |x - \mu|^2}{N}}$
We're almost finished! Just take the square root of the answer from
Step 4 and we're done. Rounded to the nearest hundredth:

$$\text{SD} = \sqrt{\frac{\sum |x - \mu|^2}{N}} = \sqrt{3.5} \approx 1.87$$

The standard deviation is approximately 1.87.

Yes! We did it! We successfully calculated the standard deviation of a small
data set.

Summary of what we did


We broke down the formula into five steps:

Step 1: Find the mean $\mu$.

$$\mu = \frac{6 + 2 + 3 + 1}{4} = \frac{12}{4} = 3$$

Step 2: Find the square of the distance from each data point to the
mean, $|x - \mu|^2$.

$x$    $|x - \mu|^2$
6      $|6 - 3|^2 = 3^2 = 9$
2      $|2 - 3|^2 = 1^2 = 1$
3      $|3 - 3|^2 = 0^2 = 0$
1      $|1 - 3|^2 = 2^2 = 4$

Steps 3, 4, and 5:

$$\begin{aligned}
\text{SD} &= \sqrt{\frac{\sum |x - \mu|^2}{N}} \\
&= \sqrt{\frac{9 + 1 + 0 + 4}{4}} \\
&= \sqrt{\frac{14}{4}} && \text{Sum the squares of the distances (Step 3).} \\
&= \sqrt{3.5} && \text{Divide by the number of data points (Step 4).} \\
&\approx 1.87 && \text{Take the square root (Step 5).}
\end{aligned}$$

Try it yourself
Here's a reminder of the formula:

$$\text{SD} = \sqrt{\frac{\sum |x - \mu|^2}{N}}$$

And here's a data set:

1, 4, 7, 2, 6

Find the standard deviation of the data set. Round your answer to the nearest hundredth.

Find the mean

$$\mu = \frac{1 + 4 + 7 + 2 + 6}{5} = \frac{20}{5} = 4$$

Find the square of the distances from each of the data points to the mean

$x$    $|x - \mu|^2$
1      $|1 - 4|^2 = 3^2 = 9$
4      $|4 - 4|^2 = 0^2 = 0$
7      $|7 - 4|^2 = 3^2 = 9$
2      $|2 - 4|^2 = 2^2 = 4$
6      $|6 - 4|^2 = 2^2 = 4$

Apply the formula

$$\text{SD} = \sqrt{\frac{\sum |x - \mu|^2}{N}} = \sqrt{\frac{9 + 0 + 9 + 4 + 4}{5}} = \sqrt{\frac{26}{5}} = \sqrt{5.2} \approx 2.28$$

The answer
The standard deviation is approximately 2.28.
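
Both hand calculations can be double-checked with a computer program, as the introduction suggests statisticians would do. This short sketch uses `statistics.pstdev` from Python's standard library, which computes the population standard deviation; the variable names are illustrative.

```python
import statistics

# The two data sets worked through by hand in this article.
worked_example = [6, 2, 3, 1]
try_it_yourself = [1, 4, 7, 2, 6]

# pstdev divides by N (population formula), matching the formula used above.
sd_worked = statistics.pstdev(worked_example)
sd_try = statistics.pstdev(try_it_yourself)

print(round(sd_worked, 2))
print(round(sd_try, 2))
```

Rounded to the nearest hundredth, the results agree with the hand calculations of 1.87 and 2.28.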
