217 - Chapter 2 - Organization of Data
Class Limits
Class limits are the smallest and largest
observations (data, events, etc.) in each class.
Therefore, each class has two limits: a lower limit
and an upper limit.
Class Frequency
200 – 299 12
300 – 399 19
400 – 499 6
500 – 599 2
600 – 699 11
700 – 799 7
800 – 899 3
Total Frequency 60
Question:
Using the frequency table above, what are the
lower and upper class limits for the first three
classes?
Answer:
● For the first class, 200 – 299
  ● The lower class limit is 200
  ● The upper class limit is 299
● For the second class, 300 – 399
  ● The lower class limit is 300
  ● The upper class limit is 399
● For the third class, 400 – 499
  ● The lower class limit is 400
  ● The upper class limit is 499
Midpoint
● The midpoint of the class interval:
● Let b = the highest number in the class and a = the lowest
number in the class.
● The midpoint is a + 0.5 * (b - a).
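The midpoint formula can be checked against the frequency table from the Class Limits slide. The following is a minimal sketch; the function name `midpoint` and the list `classes` are illustrative names, not part of the original notes.

```python
# Class limits (lowest number, highest number) from the frequency table.
classes = [(200, 299), (300, 399), (400, 499), (500, 599),
           (600, 699), (700, 799), (800, 899)]

def midpoint(a, b):
    """Midpoint of a class whose lowest number is a and highest number is b."""
    return a + 0.5 * (b - a)

midpoints = [midpoint(a, b) for a, b in classes]

# First class: 200 + 0.5 * (299 - 200) = 249.5
print(midpoints[0])
```

Note that a + 0.5 * (b - a) is the same value as (a + b) / 2.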
● Class boundaries are the midpoints between the
upper class limit of a class and the lower class
limit of the next class in the sequence.
● Therefore, each class has an upper and a lower
class boundary.
Answer:
For the first class, 200 – 299
The lower class boundary is the midpoint between 199 and 200, that is 199.5
The upper class boundary is the midpoint between 299 and 300, that is 299.5
● Class interval is the difference between the upper
and lower class boundaries of any class.
Answer:
For the first class, 200 – 299
The class interval = upper class boundary – lower class boundary
Upper class boundary = 299.5
Lower class boundary = 199.5
Therefore, the class interval = 299.5 – 199.5 = 100.
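The boundary and interval calculations above can be sketched in a few lines. This is only an illustration: the helper names `class_boundaries` and `class_interval` are hypothetical, and the code assumes at least two sorted classes with the same gap between every upper limit and the next lower limit (the gap is 1 in the table used here).

```python
# Class limits (lower, upper) from the frequency table.
classes = [(200, 299), (300, 399), (400, 499), (500, 599),
           (600, 699), (700, 799), (800, 899)]

def class_boundaries(classes, i):
    """Return (lower boundary, upper boundary) of classes[i].

    Assumption: the gap between an upper limit and the next lower limit
    is the same for every pair of neighbouring classes.
    """
    lower, upper = classes[i]
    if i + 1 < len(classes):
        gap = classes[i + 1][0] - upper   # e.g. 300 - 299 = 1
    else:
        gap = lower - classes[i - 1][1]   # last class: reuse the previous gap
    return lower - gap / 2, upper + gap / 2

def class_interval(classes, i):
    """Upper class boundary minus lower class boundary."""
    lower_b, upper_b = class_boundaries(classes, i)
    return upper_b - lower_b

print(class_boundaries(classes, 0))  # (199.5, 299.5)
print(class_interval(classes, 0))    # 100.0
```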
1- FREQUENCY DISTRIBUTION
● In statistics, a frequency distribution is a table that
displays the frequency of various outcomes in a
sample.
● Each entry in the table contains the frequency or
count of the occurrences of values within a
particular group or interval, and in this way, the
table summarizes the distribution of values in the
sample.
Example:
● These are the numbers of newspapers sold at a local shop over the last 10 days:
22, 20, 18, 23, 20, 25, 22, 20, 18, 20
● Let us count how many times each number occurs:
Papers sold Frequency
18 2
20 4
22 2
23 1
25 1
Example:
● It is also possible to group the values.
Here they are grouped in 5s:
15 - 19 2
20 - 24 7
25 - 29 1
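Both tallies in this example can be reproduced with a short script. This is a sketch using Python's standard `collections.Counter`; the variable names are illustrative, and the grouping rule (each class starts at a multiple of 5) is an assumption that happens to match the classes 15 - 19, 20 - 24, 25 - 29 shown above.

```python
from collections import Counter

# Newspapers sold over the last 10 days (from the example).
sales = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]

# Ungrouped frequency distribution: value -> number of occurrences.
frequency = Counter(sales)
for value in sorted(frequency):
    print(value, frequency[value])

# Grouped in 5s: assign each value to the class whose lower limit
# is the nearest multiple of 5 at or below it.
width = 5
grouped = Counter((value // width) * width for value in sales)
for lower in sorted(grouped):
    print(f"{lower} - {lower + width - 1}", grouped[lower])
```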
2- STEM-AND-LEAF DISPLAY
● A basic stem-and-leaf display contains two
columns separated by a vertical line.
● The left column contains the stems and the right
column contains the leaves.
● Stem-and-leaf displays are useful for displaying the
relative density and shape of the data, giving the
reader a quick overview of the distribution.
● Examples
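As one possible illustration, the newspaper sales figures from the frequency distribution example can be arranged as a stem-and-leaf display. The sketch below assumes non-negative whole numbers, with the tens digit as the stem and the units digit as the leaf; the function name `stem_and_leaf` is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return the lines of a basic stem-and-leaf display.

    Assumptions: values is a non-empty list of non-negative integers;
    stem = value // 10 (tens), leaf = value % 10 (units).
    """
    stems = defaultdict(list)
    for value in sorted(values):
        stems[value // 10].append(value % 10)
    lowest, highest = min(stems), max(stems)
    lines = []
    # Include empty stems so gaps in the data stay visible.
    for stem in range(lowest, highest + 1):
        leaves = " ".join(str(leaf) for leaf in stems[stem])
        lines.append(f"{stem} | {leaves}".rstrip())
    return lines

sales = [22, 20, 18, 23, 20, 25, 22, 20, 18, 20]
for line in stem_and_leaf(sales):
    print(line)
# 1 | 8 8
# 2 | 0 0 0 0 2 2 3 5
```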
3- SIMPLE BAR CHART
● A simple bar chart is used to represent data
involving only one variable, classified on a spatial,
quantitative, or temporal basis.
● In a simple bar chart, we make bars of equal width
but variable length, i.e. the magnitude of a
quantity is represented by the height or length
of the bars.
● The following steps are undertaken in drawing a
simple bar diagram.
4- MULTIPLE BAR CHART
● A multiple bar diagram represents two or more sets
of inter-related data (it facilitates comparison
between more than one phenomenon).
● The technique of the simple bar chart is used to draw
this diagram, but the difference is that we use
different shades, colors, or dots to distinguish
between different phenomena.
● We draw multiple bar charts when the total of the
different phenomena is meaningless.
5- COMPONENT BAR CHART
● A sub-divided or component bar chart is used to
represent data in which the total magnitude is
divided into different parts or components.
● In this diagram, we first make simple bars for each
class, taking the total magnitude in that class, and then
divide these simple bars into parts in the ratio of
the various components.
● This type of diagram shows the variation in different
components within each class as well as between
different classes.
1991 34 18 27 79
1992 43 14 24 81
1993 43 16 27 86
1994 45 13 34 92
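The division of each bar into parts can be sketched from the rows above, where the last column is the total of the three components. The column labels are not given in the table, so the code treats them simply as the first, second, and third component; the function name `segments` is hypothetical.

```python
# Rows from the table: year -> three component values.
# The component labels are not stated in the source.
data = {
    1991: (34, 18, 27),
    1992: (43, 14, 24),
    1993: (43, 16, 27),
    1994: (45, 13, 34),
}

def segments(components):
    """Return the (start, end) position of each part along one bar."""
    result, start = [], 0
    for value in components:
        result.append((start, start + value))
        start += value
    return result

for year, components in data.items():
    # The end of the last segment equals the total height of the bar.
    print(year, segments(components), "total:", sum(components))
```

Each bar is as tall as the total for its year, and each segment's length is the corresponding component value, so the parts are automatically in the ratio of the components.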
6- PIE CHART
Food 600$
Clothing 100$
House Rent 400$
Fuel and Lighting 100$
Miscellaneous 300$
Total 1500$
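A pie chart is commonly drawn by giving each item a sector whose angle is proportional to its share of the total (360 degrees for the whole). The notes do not spell out this step, so the sketch below assumes that convention; the variable names are illustrative.

```python
# Expenditure figures from the table above (in dollars).
expenditure = {
    "Food": 600,
    "Clothing": 100,
    "House Rent": 400,
    "Fuel and Lighting": 100,
    "Miscellaneous": 300,
}

total = sum(expenditure.values())  # 1500

# Sector angle in degrees = amount / total * 360.
# Multiplying before dividing keeps the arithmetic exact for these values.
angles = {item: amount * 360 / total for item, amount in expenditure.items()}

for item, angle in angles.items():
    print(f"{item}: {angle} degrees")
```

For example, Food takes 600 / 1500 of the total, which corresponds to a sector of 144 degrees.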
7- HISTOGRAM
● In statistics, a histogram is a graphical representation of the
distribution of data.
● It is an estimate of the probability distribution of a
continuous variable.
● A histogram is a representation of tabulated frequencies,
shown as adjacent rectangles or squares.
● The rectangles of a histogram are drawn so that they
touch each other to indicate that the original variable is
continuous.
Bin Count
−3.5 to −2.51 9
−2.5 to −1.51 32
−1.5 to −0.51 109
−0.5 to 0.49 180
0.5 to 1.49 132
1.5 to 2.49 34
2.5 to 3.49 4
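To see how the tabulated counts relate to an estimate of the distribution, each count can be divided by the total number of observations. The sketch below also prints a rough text version of the rectangles; it assumes each bin is one unit wide, starting at the lower edge shown in the table, and the scale of one mark per 10 observations is an arbitrary choice.

```python
# (lower edge of bin, count) from the table above.
# Assumption: every bin is 1 unit wide.
bins = [(-3.5, 9), (-2.5, 32), (-1.5, 109), (-0.5, 180),
        (0.5, 132), (1.5, 34), (2.5, 4)]

total = sum(count for _, count in bins)

# Relative frequency of each bin: the share of all observations it holds.
relative = [count / total for _, count in bins]

for (lower, count), share in zip(bins, relative):
    bar = "#" * round(count / 10)  # one mark per 10 observations
    print(f"{lower:5.1f} to {lower + 1:4.1f}  {count:4d}  {share:.3f}  {bar}")
```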
8- FREQUENCY POLYGON
● The midpoints of the class intervals, marked at the
top of each corresponding rectangle in a histogram,
are joined together by straight lines.
● This gives a polygon (a figure with many angles).
● Examples:
● Sometimes it is beneficial to show the histogram
and frequency polygon together.
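The points of a frequency polygon can be computed directly from a frequency table. The sketch below reuses the table from the Class Limits slide and the midpoint formula a + 0.5 * (b - a); the names `table`, `points`, and `lines` are illustrative.

```python
# ((lower limit, upper limit), frequency) from the Class Limits table.
table = [((200, 299), 12), ((300, 399), 19), ((400, 499), 6),
         ((500, 599), 2), ((600, 699), 11), ((700, 799), 7),
         ((800, 899), 3)]

# One vertex per class: (class midpoint, frequency).
points = [(a + 0.5 * (b - a), freq) for (a, b), freq in table]

# Straight lines join each vertex to the next one.
lines = list(zip(points, points[1:]))

for start, end in lines:
    print(start, "->", end)
```

Some textbooks also add a zero-frequency point before the first class and after the last one so that the polygon closes on the horizontal axis; that step is not part of these notes and is left out here.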
Exercise:
● The numbers of hours that the students of a class
spend studying are 5, 8, 7, 6, 8, 8, 5, 5, 6, 5, 6,
and 5.
● To do: Identify the frequency polygon for the
given data.
Choices:
A) Figure 1
B) Figure 2
C) Figure 3
D) Figure 4
Correct Answer: A
● Step 1: The heights of the bars in the histograms indicate
the number of students who spend a given number of hours studying.
● Step 2: Compare the heights of the bars in each histogram with
the frequencies in the given data (5 hours: 5 students, 6 hours: 3,
7 hours: 1, 8 hours: 3).
● Step 3: Compute the midpoints of the tops of the bars and draw
straight lines joining these midpoints.
● Step 4: Frequency polygon of this data will be as follows.
● Step 5: So, Figure 1 represents the frequency polygon for
the given data.