Correlation Analysis Notes-2

This document provides an overview of correlation analysis including: - Defining correlation as a measure of the relationship between two variables. - Discussing the Pearson correlation coefficient and its properties. - Describing types of correlation like positive, negative, simple, and multiple. - Explaining formulas and interpreting the correlation coefficient value. - Noting that correlation indicates relationship but not causation.

Uploaded by

Kotresh Kp
Copyright
© All Rights Reserved
EDUCATION FOR EXCELLENCE

Statistical Methods
Study Material – B.Com VI Semester
Prepared By:
SHANKARAIAH.P
Lecturer
Module: 1
CORRELATION ANALYSIS
Correlation between groups of data is a measure of how well they are related. The most
common measure of correlation in statistics is the Pearson correlation, whose full name is
the Pearson Product Moment Correlation. It shows the linear relationship between two
groups of data. Two letters are used to represent the correlation: the Greek letter rho (ρ) for a
population and the letter “r” for a sample. It is the most popular and widely used method of
calculating the correlation coefficient.
Karl Pearson, a reputed statistician, constructed in the 1890s a well-set formula, based
on mathematical treatment, for determining the co-efficient of correlation. His method of
calculating the co-efficient of correlation is based on the covariance of the concerned variables.
Covariance is a statistical representation of the degree to which two variables vary together.
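The link between covariance and the co-efficient can be written out explicitly. The following is a sketch in standard notation; the symbols Cov(X, Y), σ_X, σ_Y and N (the number of pairs of observations) are assumed here and do not appear in the notes themselves:

```latex
\[
r = \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y},
\qquad
\operatorname{Cov}(X, Y) = \frac{1}{N} \sum (X - \bar{X})(Y - \bar{Y})
\]
```

Dividing the covariance by the two standard deviations removes the units of X and Y, which is why the result is a pure number.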
Definition:
“It is the study of degree of relationship between two or more variables”.
According to A.M. Tuttle, “Correlation is an analysis of the co-variation between two
or more variables.”
Importance of correlation:
1. The correlation coefficient helps in measuring the extent of relationship between two
variables in one figure.
2. Correlation analysis facilitates understanding of economic behaviour and helps in
locating the critically important variables on which others depend.
3. When two variables are correlated, the value of one variable can be estimated, given
the value of the other.
4. Correlation facilitates the decision-making in the business world. It reduces the range
of uncertainty as predictions based on correlation are likely to be more reliable and closer to
reality.
5. The estimations made on the basis of correlation analysis are considered to be nearer
to reality and hence reliable.
Therefore, correlation analysis contributes to the understanding of economic behaviour,
and helps in locating the critically important variables on which others depend.
Types of Correlation:
1. Positive Correlation: When an increase in one variable is accompanied by an increase in
the other variable, or a decrease in one variable is accompanied by a decrease in the other, the
relationship is said to be positive. That is, when both variables move in the same direction,
it is called positive correlation.
Examples: Relationship between height and weight, price and quantity supplied of a good, etc.
2. Negative Correlation: When an increase in one variable is accompanied by a decrease in
the other variable, or a decrease in one variable is accompanied by an increase in the other, the
relationship is said to be negative. That is, when the variables move in opposite
directions, it is called negative correlation.
Examples: Relationship between price and quantity demanded of a good, day temperature and
sale of woollen garments, etc.
3. Simple Correlation: When the study involves the relationship between only two variables, it is
called simple correlation.
Examples: Study of the relationship between two variables such as demand and price.

Shankaraiah.P, Lecturer, R.G. Institute Of Commerce And Management, Davangere. 2


4. Multiple Correlation: When the study involves the relationship between more than two
variables, it is called multiple correlation.
Examples: Study of the relationship between sales and other influences on sales such as price,
advertisement, quality, quantity, etc.
5. Linear Correlation: When a constant ratio of change in one variable results in a constant
ratio of change in another variable, it is said to be linear correlation. It is also called Straight-Line
Correlation, as the plotted values of the paired observations on a graph give a straight line.
Example:
X Variable   10    20    30    40    50
Y Variable   50   100   150   200   250
6. Non-Linear Correlation: When a constant ratio of change in one variable does not result
in a constant ratio of change in another variable, it is said to be non-linear correlation. It is also
called Curvilinear Correlation, as the plotted values of the paired observations on a graph give a
curved line.
Example:
X Variable   10    20    30    40    50
Y Variable   12    18    22     8     5
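The two example tables can be checked numerically. The sketch below is illustrative only: the helper name `pearson_r` is hypothetical, and the last Y value of the linear table is taken as 250 (an assumption, so that the ratio of change stays constant across all five pairs).

```python
import math

def pearson_r(xs, ys):
    """Pearson's r computed from deviations about the actual means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    return sum_dxdy / math.sqrt(sum_dx2 * sum_dy2)

x = [10, 20, 30, 40, 50]
y_linear = [50, 100, 150, 200, 250]  # assumption: last value taken as 250
y_curved = [12, 18, 22, 8, 5]        # the non-linear example table

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print(round(r_linear, 4))  # 1.0
print(round(r_curved, 4))  # -0.5421
```

The straight-line data give a perfect co-efficient, while the curved data give a much weaker one even though Y clearly depends on X, because 'r' only captures the linear part of a relationship.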
Assumptions:
1. Linear Relationship: In this method, it is assumed that there is a linear relationship
between the two variables.
2. Normal Distribution: It assumes that the population from which the sample is drawn is
normally distributed.
3. Cause and Effect: It assumes the existence of a cause-and-effect relationship
between the variables.
Properties of Co-efficient of Correlation:
1. It is a relative measure of correlation, since the calculated value is stated as a
proportion.
2. It is independent of unit of measurement.
3. It is independent of change of origin and change of scale.



4. The value of ‘r’ always lies between –1 and +1.
5. It studies the degree of relationship between variables and direction of the variables.
6. It is the geometric mean of the two regression co-efficients bxy and byx, and it takes the
sign that the two regression co-efficients share. That is,

\[ r = \pm\sqrt{b_{xy} \times b_{yx}} \]

7. The co-efficient of correlation works both ways: rxy = ryx.
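Properties 3, 6 and 7 can be verified on a small data set. This is a minimal sketch; the function names are hypothetical, the data are the X and Y values of the non-linear example table, and the shift and scale constants are arbitrary choices made for illustration.

```python
import math

def sums_of_deviations(xs, ys):
    """Return (sum dx*dy, sum dx^2, sum dy^2) about the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy, sxx, syy

def pearson_r(xs, ys):
    sxy, sxx, syy = sums_of_deviations(xs, ys)
    return sxy / math.sqrt(sxx * syy)

x = [10, 20, 30, 40, 50]
y = [12, 18, 22, 8, 5]
r = pearson_r(x, y)

# Property 3: change of origin and (positive) scale leaves r unchanged.
u = [(v - 30) / 10 for v in x]
w = [(v - 10) / 2 for v in y]
r_transformed = pearson_r(u, w)

# Property 6: r is the geometric mean of byx and bxy, with their common sign.
sxy, sxx, syy = sums_of_deviations(x, y)
byx = sxy / sxx  # regression co-efficient of Y on X
bxy = sxy / syy  # regression co-efficient of X on Y
r_from_regression = math.copysign(math.sqrt(bxy * byx), byx)

# Property 7: swapping the variables gives the same value.
r_swapped = pearson_r(y, x)

print(round(r, 4), round(r_transformed, 4),
      round(r_from_regression, 4), round(r_swapped, 4))
```

All four printed values agree, which is what the three properties claim.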
Formulas:
I. Individual Observations:
1. Direct Method: When deviations are taken from the Actual Mean (dx = X – X̄, dy = Y – Ȳ)

\[ r = \frac{\sum dx\,dy}{\sqrt{\sum dx^{2} \times \sum dy^{2}}} \]

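2. Short-cut Method: When deviations are taken from an Assumed Mean (dx = X – A, dy = Y – B),
with N the number of pairs of observations

\[ r = \frac{N\sum dx\,dy - \left(\sum dx\right)\left(\sum dy\right)}{\sqrt{N\sum dx^{2} - \left(\sum dx\right)^{2}} \times \sqrt{N\sum dy^{2} - \left(\sum dy\right)^{2}}} \]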
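The two methods for individual observations should give the same answer whatever assumed means are chosen. The sketch below checks this on the X and Y values of the non-linear example table. The function names and the assumed means (20 and 10, then 40 and 18) are illustrative choices, and the short-cut function uses the form of the formula in which each sum is multiplied by N, the number of pairs.

```python
import math

def r_direct(xs, ys):
    """Direct method: deviations from the actual means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    return sum_dxdy / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

def r_shortcut(xs, ys, assumed_x, assumed_y):
    """Short-cut method: deviations from assumed means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(a * b for a, b in zip(dx, dy))
    sum_dx2 = sum(a * a for a in dx)
    sum_dy2 = sum(b * b for b in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = (math.sqrt(n * sum_dx2 - sum_dx ** 2)
                   * math.sqrt(n * sum_dy2 - sum_dy ** 2))
    return numerator / denominator

x = [10, 20, 30, 40, 50]
y = [12, 18, 22, 8, 5]

r_actual = r_direct(x, y)
r_assumed_a = r_shortcut(x, y, assumed_x=20, assumed_y=10)
r_assumed_b = r_shortcut(x, y, assumed_x=40, assumed_y=18)
print(round(r_actual, 4), round(r_assumed_a, 4), round(r_assumed_b, 4))
# -0.5421 -0.5421 -0.5421
```

With assumed means 20 and 10 the working sums are Σdx = 50, Σdy = 15, Σdxdy = –90, Σdx² = 1500 and Σdy² = 241, so the numerator is 5 × (–90) – 50 × 15 = –1200.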

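II. Grouped Data

3. Step-deviation Method: When deviations are taken from an Assumed Mean A and divided by the
class interval C, that is, d = (X – A)/C, with f the frequency of each cell and N = ∑f

\[ r = \frac{N\sum f\,dx\,dy - \left(\sum f\,dx\right)\left(\sum f\,dy\right)}{\sqrt{N\sum f\,dx^{2} - \left(\sum f\,dx\right)^{2}} \times \sqrt{N\sum f\,dy^{2} - \left(\sum f\,dy\right)^{2}}} \]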
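For grouped data the same calculation runs over the cells of a two-way frequency table. The following is a rough sketch only: the frequency table is hypothetical (made up for illustration, not taken from the notes), the function name is an assumption, and N is taken as the total frequency ∑f.

```python
import math

def r_step_deviation(cells, assumed_x, width_x, assumed_y, width_y):
    """cells holds (x_midpoint, y_midpoint, frequency) for each table cell."""
    n = sum(f for _, _, f in cells)  # N is the total frequency
    sum_fdx = sum_fdy = sum_fdxdy = sum_fdx2 = sum_fdy2 = 0.0
    for x_mid, y_mid, f in cells:
        dx = (x_mid - assumed_x) / width_x  # step deviation of X
        dy = (y_mid - assumed_y) / width_y  # step deviation of Y
        sum_fdx += f * dx
        sum_fdy += f * dy
        sum_fdxdy += f * dx * dy
        sum_fdx2 += f * dx * dx
        sum_fdy2 += f * dy * dy
    numerator = n * sum_fdxdy - sum_fdx * sum_fdy
    denominator = (math.sqrt(n * sum_fdx2 - sum_fdx ** 2)
                   * math.sqrt(n * sum_fdy2 - sum_fdy ** 2))
    return numerator / denominator

# Hypothetical bivariate table: class midpoints 15, 25, 35 for X and
# 5, 15, 25 for Y, both with class interval 10.
cells = [(15, 5, 4), (15, 15, 2), (25, 15, 6),
         (25, 25, 1), (35, 15, 2), (35, 25, 5)]

r_grouped = r_step_deviation(cells, assumed_x=25, width_x=10,
                             assumed_y=15, width_y=10)
r_other_origin = r_step_deviation(cells, assumed_x=15, width_x=10,
                                  assumed_y=5, width_y=10)
print(round(r_grouped, 2))  # 0.79
```

Choosing a different assumed mean changes the working sums but not the final value, which is why the step-deviation method can be used to keep the arithmetic small.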

Interpretation of the value of ‘r’:


The value of ‘r’ is to be interpreted as below:
1. When the value of ‘r’ is +1, there is perfect positive correlation.
2. When the value of ‘r’ is –1, there is perfect negative correlation.
3. When the value of ‘r’ is +0.5, there is moderate positive correlation.
4. When the value of ‘r’ is –0.5, there is moderate negative correlation.
5. When the value of ‘r’ is 0 (zero), there is no correlation.
6. When the value of ‘r’ is more than +0.75, there is a high degree of positive
relationship between the variables.
7. When the value of ‘r’ is more than 0.75, there is a high degree of relationship
between the variables.
8. When the value of ‘r’ is less than 0.25, there is a low degree of relationship between
the variables.
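These rules can be collected into a small helper. This is only a sketch under stated assumptions: the function name is hypothetical, the 0.75 and 0.25 cut-offs are applied to the magnitude of 'r' regardless of sign, and every value between 0.25 and 0.75 in magnitude is labelled moderate, which the list above states only for ±0.5.

```python
def describe_r(r):
    """Return a verbal description of a correlation co-efficient."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    if r == 0:
        return "no correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        degree = "perfect"
    elif size > 0.75:
        degree = "high degree"
    elif size < 0.25:
        degree = "low degree"
    else:
        degree = "moderate"  # assumption: covers 0.25 to 0.75 inclusive
    return f"{degree} {direction} correlation"

for value in (1.0, -0.5, 0.0, 0.8, -0.1):
    print(value, "->", describe_r(value))
```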

Correlation and Causation:


Correlation studies and measures the degree of relationship between two or more
variables. It does not tell us anything about the cause-and-effect relationship between them. If
there is correlation between two variables, it may be due to the following reasons:



1. Both variables being influenced by a third variable: It is possible that a high degree of
correlation between the two variables may be due to the influence of a third variable not
included in the analysis.
2. Mutual dependence: When two variables show a high degree of correlation, it may
be difficult to say which of the two is the cause and which is the
effect, because each may be reacting on the other.
3. Pure chance: The correlation between the two variables may have arisen by pure
chance.
Merits:
1. It is based on all the observations.
2. It facilitates comparison.
3. It measures the degree and direction of the movement between the variables.
4. It provides a numerical measurement and is based on a set of mathematical equations.
Limitations:
1. It assumes a linear relationship between the variables, which may not hold in practice.
2. It is not easy to calculate and is time-consuming.
3. It is affected by extreme observations.
4. There is a chance of misinterpreting the value of ‘r’.
Probable Error:
Probable Error is an instrument which measures the reliability and dependability of the
value of ‘r’, Karl Pearson’s co-efficient of correlation. The probable error of the co-efficient of
correlation helps in interpreting its value. The co-efficient of correlation is generally computed
from samples, which are subject to errors of sampling. From the interpretation point of view
Probable Error is very useful.
Definition of Probable Error:
According to Horace Secrist: “The Probable Error of ‘r’ is an amount which if added
and subtracted from the average correlation co-efficient produce the two limits, within which
the parameter value lies”.
Formula:
\[ PE(r) = 0.6745 \times \frac{1 - r^{2}}{\sqrt{N}} \]
The Probable Error of co-efficient of correlation determines the two limits within
which, co-efficient of correlation of randomly selected samples from the same universe will
fall.
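As a quick illustration of the formula and the two limits, the sketch below uses hypothetical values r = 0.8 and N = 25 pairs of observations; the function name is also an assumption.

```python
import math

def probable_error(r, n):
    """Probable Error of r: 0.6745 * (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

r, n = 0.8, 25  # hypothetical sample values
pe = probable_error(r, n)
lower_limit = r - pe
upper_limit = r + pe

print(round(pe, 4))           # 0.0486
print(round(lower_limit, 4))  # 0.7514
print(round(upper_limit, 4))  # 0.8486
```

Because N appears under a square root in the denominator, quadrupling the sample size halves the Probable Error for the same value of 'r'.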
Interpretation of Value of Probable Error:
1. If the value of ‘r’ is less than its Probable Error, then there is no evidence of correlation.
2. If the value of ‘r’ is more than six times its Probable Error, then the relationship
between the variables is significant.
3. If the value of ‘r’ is less than six times its Probable Error, then the relationship
between the variables is not significant.
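The decision rule can be sketched as a short function. The names here are hypothetical, the magnitude of 'r' is compared with the Probable Error so that negative co-efficients are handled (an assumption), and the third rule is read in the usual textbook way, as "not significant" when 'r' falls short of six times its Probable Error.

```python
import math

def probable_error(r, n):
    return 0.6745 * (1 - r ** 2) / math.sqrt(n)

def assess_significance(r, n):
    """Apply the Probable Error rules to r computed from n pairs."""
    pe = probable_error(r, n)
    size = abs(r)  # assumption: compare the magnitude of r
    if size < pe:
        return "no evidence of correlation"
    if size > 6 * pe:
        return "significant"
    return "not significant"

# Hypothetical values of r, each from 25 pairs of observations.
for r in (0.8, 0.3, 0.1):
    print(r, "->", assess_significance(r, 25))
```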
