Stat - Prob-Q4-Module-7
STATISTICS
AND PROBABILITY
Quarter 4 - Module 7
Pearson’s Sample
Correlation Coefficient
NegOr_Q4_Stat_and_Prob11_Module7_v2
Statistics and Probability – Grade 11
Alternative Delivery Mode
Quarter 4 – Module 7: Pearson’s Sample Correlation Coefficient
Republic Act 8293, section 176 states that: No copyright shall subsist in any work of
the Government of the Philippines. However, prior approval of the government agency or office
wherein the work is created shall be necessary for exploitation of such work for profit. Such
agency or office may, among other things, impose as a condition the payment of royalties.
Borrowed materials (i.e., songs, stories, poems, pictures, photos, brand names,
trademarks, etc.) included in this module are owned by their respective copyright holders.
Every effort has been exerted to locate and seek permission to use these materials from their
respective copyright owners. The publisher and authors do not represent nor claim ownership
over them.
Introductory Message
What I Need to Know
This module was designed to provide you with fun and meaningful
opportunities for guided and independent learning at your own pace and time. You
will be enabled to process the contents of the learning resource while being an
active learner.
The module is intended for you to calculate the Pearson’s Sample
Correlation Coefficient, and solve problems involving correlation analysis.
Pre-Assessment
A data set consists of eight (x, y) pairs of numbers:
(0, 12), (2, 15), (4, 16), (5, 14), (8, 22), (13, 24), (15, 28), (20, 30)
What's In
Let us recall the four different relationships between the variables x and y.
What have you observed?
What's New
Definition:
The Pearson’s sample correlation coefficient for a collection of n pairs (x, y) of numbers
in a sample is the number r given by the formula
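$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}$$
Where:
$$SS_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2$$
$$SS_{xy} = \sum xy - \frac{1}{n}\left(\sum x\right)\left(\sum y\right)$$
$$SS_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2$$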
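The definition above translates almost line for line into a short program. The following is a minimal Python sketch, not part of the original module: the function name `pearson_r` and the four sample pairs are made up for illustration only.

```python
import math


def pearson_r(xs, ys):
    """Return the sample correlation coefficient r using the SS formulas."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    # SSxx, SSxy and SSyy exactly as written in the definition
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    if ss_xx == 0 or ss_yy == 0:
        # r is undefined when all x values (or all y values) are equal
        raise ValueError("r is undefined when x or y has no variation")
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical data, used only to show how the function is called
r_example = pearson_r([1, 2, 3, 4], [2, 4, 5, 8])
print(round(r_example, 3))
```

The check for zero variation is an addition of this sketch: the formula divides by the square root of (SSxx)(SSyy), so it cannot be evaluated when either quantity is 0.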
b. If |𝑟| is near 0 (that is, if r is near 0 of either sign) then the linear relationship
between x and y is weak.
The table below shows the verbal description of the strength of the
correlation between two variables.
The closer the value of r is to 1 or -1, the stronger the linear relationship between
the variables. This can be shown in the diagram below.
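To see the extreme values of r in action, the sketch below applies the SS formulas to three small data sets. All three data sets and the helper name `pearson_r` are invented for this illustration and do not come from the module: one set lies exactly on a rising line, one exactly on a falling line, and one follows a curve with no linear trend.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation coefficient computed with the SS formulas."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    ss_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    ss_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    ss_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return ss_xy / math.sqrt(ss_xx * ss_yy)


# Hypothetical data: points exactly on the rising line y = 2x + 1
r_rising = pearson_r([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])

# Hypothetical data: the same y values in reverse, an exactly falling line
r_falling = pearson_r([1, 2, 3, 4, 5], [11, 9, 7, 5, 3])

# Hypothetical data: y = x squared, symmetric about 0, so no linear trend
r_curve = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])

print(r_rising, r_falling, r_curve)
```

The third case is a reminder that r measures only linear association: x and y are clearly related there, yet r comes out as 0.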
[Figure: scatter plots illustrating types of linear correlation. Surviving panel titles: e) Weak to Medium Negative Linear Correlation; f) No Linear Correlation]
Example 1. Compute the linear correlation coefficient for the height and weight as shown in
the table below.
Height x (in) 68 69 70 70 71 72 72 72 73 73 74 75
Weight y (lbs) 151 146 157 164 171 160 163 180 170 175 178 188
Solution:
Step 1. Construct a table with the columns x, y, x², xy, and y² on top and with the corresponding
values as shown.
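Step 2. Compute SSxx, SSxy, and SSyy.
$$SS_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2 = 61537 - \frac{1}{12}(859)^2 = 46.9167$$
$$SS_{xy} = \sum xy - \frac{1}{n}\left(\sum x\right)\left(\sum y\right) = 143626 - \frac{1}{12}(859)(2003) = 244.583$$
$$SS_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2 = 336025 - \frac{1}{12}(2003)^2 = 1690.9167$$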
Step 3. Compute the linear correlation coefficient r.
$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}} = \frac{244.583}{\sqrt{(46.9167)(1690.9167)}}$$
r = 0.868
Interpretation:
Since the value of r is positive (greater than 0), weight y tends to increase as height x
increases. The value 0.868 is near 1, so the linear relationship between height x and weight y
is strong.
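The arithmetic in Example 1 can be double-checked with a few lines of Python. This is only a verification sketch: the variable names are chosen here for readability, while the data are the heights and weights from the example's table.

```python
import math

# Data from Example 1
heights = [68, 69, 70, 70, 71, 72, 72, 72, 73, 73, 74, 75]               # x (in)
weights = [151, 146, 157, 164, 171, 160, 163, 180, 170, 175, 178, 188]   # y (lbs)

n = len(heights)
sum_x = sum(heights)
sum_y = sum(weights)
sum_x2 = sum(x * x for x in heights)
sum_xy = sum(x * y for x, y in zip(heights, weights))
sum_y2 = sum(y * y for y in weights)

# Step 2: the three SS quantities
ss_xx = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n
ss_yy = sum_y2 - sum_y ** 2 / n

# Step 3: the correlation coefficient
r = ss_xy / math.sqrt(ss_xx * ss_yy)

print(f"SSxx = {ss_xx:.4f}, SSxy = {ss_xy:.4f}, SSyy = {ss_yy:.4f}")
print(f"r = {r:.3f}")
```

The printed sums and SS values should agree with the figures in Step 2, and r should round to 0.868.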
Note: Some books use another formula for computing the correlation coefficient. However, for the
purpose of uniformity and to avoid confusion, let us just use the one above.
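The note does not say which other formula it has in mind. One form that appears in many textbooks, shown here only as a likely candidate, is obtained by multiplying the numerator and the denominator of the formula above by n. Both forms give the same value of r.

```latex
\begin{aligned}
n\,SS_{xy} &= n\sum xy - \left(\sum x\right)\left(\sum y\right), \\
n\,SS_{xx} &= n\sum x^2 - \left(\sum x\right)^2, \qquad
n\,SS_{yy} = n\sum y^2 - \left(\sum y\right)^2, \\[4pt]
r &= \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}}
   = \frac{n\,SS_{xy}}{\sqrt{(n\,SS_{xx})(n\,SS_{yy})}}
   = \frac{n\sum xy - \left(\sum x\right)\left(\sum y\right)}
          {\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}
\end{aligned}
```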
Example 2.
Hours spent in studying(x) 1 3 5 2 4 3 2 0
Score(y) 70 79 90 77 85 81 75 64
Solution:
Step 1. Construct a table with the columns x, y, xy, x², and y² on top and with the corresponding
values as shown.
No. | Hours spent in studying (x) | Score (y) | xy | x² | y²
1 | 1 | 70 | 70 | 1 | 4900
2 | 3 | 79 | 237 | 9 | 6241
3 | 5 | 90 | 450 | 25 | 8100
4 | 2 | 77 | 154 | 4 | 5929
5 | 4 | 85 | 340 | 16 | 7225
6 | 3 | 81 | 243 | 9 | 6561
7 | 2 | 75 | 150 | 4 | 5625
8 | 0 | 64 | 0 | 0 | 4096
Ʃ | 20 | 621 | 1644 | 68 | 48677
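Step 2. Compute SSxx, SSxy, and SSyy.
$$SS_{xx} = \sum x^2 - \frac{1}{n}\left(\sum x\right)^2 = 68 - \frac{1}{8}(20)^2 = 68 - 50 = 18$$
$$SS_{xy} = \sum xy - \frac{1}{n}\left(\sum x\right)\left(\sum y\right) = 1644 - \frac{1}{8}(20)(621) = 1644 - 1552.50 = 91.5$$
$$SS_{yy} = \sum y^2 - \frac{1}{n}\left(\sum y\right)^2 = 48677 - \frac{1}{8}(621)^2 = 48677 - 48205.125 = 471.875$$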
Step 3. Compute the linear correlation coefficient r.
$$r = \frac{SS_{xy}}{\sqrt{(SS_{xx})(SS_{yy})}} = \frac{91.5}{\sqrt{(18)(471.875)}} = \frac{91.5}{\sqrt{8493.75}}$$
r = 0.9928
Interpretation:
Since the value of r is positive and close to 1, the variables have a very strong
positive correlation. This means that students who spend more hours studying tend to get
higher Physics scores.
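The table in Step 1 and the totals in its last row can also be generated programmatically. The sketch below is one possible way to do it in Python. The variable names are illustrative, and the data are the hours and scores from Example 2.

```python
import math

# Data from Example 2
hours = [1, 3, 5, 2, 4, 3, 2, 0]           # x
scores = [70, 79, 90, 77, 85, 81, 75, 64]  # y

# Step 1: one row per pair, with columns x, y, xy, x^2, y^2
rows = [(x, y, x * y, x * x, y * y) for x, y in zip(hours, scores)]
# Column totals, i.e. the last row of the table
totals = tuple(sum(column) for column in zip(*rows))

print(f"{'No.':>3} {'x':>4} {'y':>5} {'xy':>6} {'x^2':>5} {'y^2':>7}")
for number, (x, y, xy, x2, y2) in enumerate(rows, start=1):
    print(f"{number:>3} {x:>4} {y:>5} {xy:>6} {x2:>5} {y2:>7}")
print(f"{'Sum':>3} {totals[0]:>4} {totals[1]:>5} {totals[2]:>6} {totals[3]:>5} {totals[4]:>7}")

# Steps 2 and 3: SS quantities and r from the column totals
n = len(rows)
sum_x, sum_y, sum_xy, sum_x2, sum_y2 = totals
ss_xx = sum_x2 - sum_x ** 2 / n
ss_xy = sum_xy - sum_x * sum_y / n
ss_yy = sum_y2 - sum_y ** 2 / n
r = ss_xy / math.sqrt(ss_xx * ss_yy)
print(f"r = {r:.4f}")
```

Working from the column totals mirrors the hand computation: once the last row of the table is known, the individual pairs are no longer needed.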
What I Have Learned
Directions: Reflect on the learning that you gained from this lesson on Pearson’s
Correlation Coefficient by completing the statements below. Do this in your activity
notebook. Do not write anything on this module.
What were your thoughts or ideas about the topic before taking up the lesson?
I thought that _______________________________________________________________
___________________________________________________________________________
__________________________________________________________________________.
What new or additional ideas have you had after taking up this lesson?
I learned that ________________________________________________________________
___________________________________________________________________________
__________________________________________________________________________.
How are you going to apply your learning from this lesson?
I will apply ________________________________________________________________
___________________________________________________________________________
_________________________________________________________________________.
What I Can Do
Directions: In a study of 6 patients, their ages and glucose levels were
recorded. Based on the data in the table below, would you say that age (x) and
glucose level (y) are linearly correlated? Complete the table by supplying the
necessary information.
Patient | Age (x) | Glucose Level (y) | xy | x² | y²
1 | 43 | 99 | 4257 | 1849 | 9801
2 | 21 | 65 | 1365 | 441 | 4225
3 | 25 | 79 | 1975 | 625 | 6241
4 | 42 | 75 | 3150 | 1764 | 5625
5 | 57 | 87 | 4959 | 3249 | 7569
6 | 59 | 81 | 4779 | 3481 | 6561
Ʃ | 247 | 486 | 20485 | 11409 | 40022
Answer the following:
a) n = _________
b) ∑x² = _______
c) ∑x = ________
d) ∑xy = ______
e) ∑y = _______
f) r = _________
g) Interpretation:
_____________________________________________________________________
_____________________________________________________________________
____________________________________________________________________.
Category | Excellent (4) | Good (3) | Satisfactory (2) | Needs Improvement (1)
Completeness | 100% of the necessary data asked in the task were completely and correctly followed and accomplished. | 75% of the necessary data asked in the task were followed and accomplished. | Only 50% of the necessary data asked in the task were accomplished. | Only 25% of the necessary data asked in the task were accomplished.
Accuracy of answer | The answer is 100% accurate. | The answer is 75% accurate. | The answer is 50% accurate. | The answer does not show accuracy.
Compute the linear correlation coefficient for the given data below.
2. The age x and resting heart rate y were measured for ten men, with the results shown in
the table below.
x 20 23 30 37 35 45 51 55 60 63
y 72 71 73 74 74 73 72 79 75 77
Compute the linear correlation coefficient for these sample data and interpret the
result.
a.
b. The y values tend to increase as the x values increase; thus, the relationship between x and y appears to be positive
and linear.
WHAT I HAVE LEARNED
1. SSxy = 24 SSxx = 40 SSyy = 18.8 r = 0.875
2.
3. SSxy = 1761 – (1/6)(92)(110) = 74.33
SSxx = 1426 – (1/6)(92)² = 15.33
SSyy = 2418 – (1/6)(110)² = 401.33
r = 0.948
Since the value of r is positive (greater than 0), vocabulary y tends to increase as age x increases.
The value 0.948 is near 1, so the linear relationship between age x and vocabulary y is strong.
WHAT I CAN DO (Performance Task)
Patient | Age (x) | Glucose Level (y) | xy | x² | y²
1 | 43 | 99 | 4257 | 1849 | 9801
2 | 21 | 65 | 1365 | 441 | 4225
3 | 25 | 79 | 1975 | 625 | 6241
4 | 42 | 75 | 3150 | 1764 | 5625
5 | 57 | 87 | 4959 | 3249 | 7569
6 | 59 | 81 | 4779 | 3481 | 6561
Ʃ | 247 | 486 | 20485 | 11409 | 40022
Assessment:
1. SSxy = 108 SSxx = 37.5 SSyy = 76.4 r = 0.638
2.
SSxy = 31244 – (1/10)(419)(740) = 238
SSxx = 19643 – (1/10)(419)² = 2086.9
SSyy = 54814 – (1/10)(740)² = 54
r = 0.709
Since the value of r is positive (greater than 0), the resting heart rate y tends to increase as age x increases. The
value 0.709 is near 1, so the linear relationship between age x and resting heart rate y is strong.
References
Malate, J., 2017. Statistics and Probability: The Pearson’s Correlation Coefficient. 155-159.
Sta. Ana, Manila: Vicarish Publications and Trading, Inc.