Probability Distributions and Curve Fitting
S.No Content
4 Curve fitting by the Method of Least Squares, Fitting of Straight lines, Second
degree parabola, exponential and Growth curves.
Correlation: The degree of relationship between two random variables X and Y.
Positive or Direct Correlation: The correlation between two variables such that an increase (or decrease) in one variable results in an increase (or decrease) in the other; that is, both variables deviate in the same direction.
Negative or Diverse Correlation: The correlation between two variables such that an increase (or decrease) in one variable results in a decrease (or increase) in the other; that is, the variables deviate in opposite directions.
Uncorrelated Variables: Two variables such that the variation in one does not affect that of the other.
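The correlation coefficient between X and Y is
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y},$$
where
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y},$$
Standard deviation of x:
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2},$$
Standard deviation of y:
$$\sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2},$$
Mean of x and mean of y:
$$\bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i.$$
r = 1 or r = −1: Perfect correlation
r = 0: Uncorrelated variables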
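As a minimal sketch, the formulas above translate directly into Python. The function name `correlation` and the sample lists in the usage lines are illustrative assumptions, not part of these notes; the function uses the divide-by-n forms of the covariance and standard deviations given above.

```python
import math

def correlation(xs, ys):
    """r = Cov(X, Y) / (sigma_x * sigma_y), using the divide-by-n formulas."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(X, Y) = (1/n) * sum(x_i * y_i) - mean_x * mean_y
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    # sigma = sqrt((1/n) * sum of squares - mean^2)
    sigma_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sigma_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    return cov / (sigma_x * sigma_y)

# Hypothetical data: y rises (or falls) exactly in step with x.
print(correlation([1, 2, 3], [2, 4, 6]))   # perfect positive correlation
print(correlation([1, 2, 3], [6, 4, 2]))   # perfect negative correlation
```

The sketch does not guard against a zero standard deviation (all x values equal, or all y values equal), in which case r is undefined and the division fails.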
Ex.1). Calculate the correlation coefficient between X and Y for the following.
X 1 3 4 5 7 8 10
Y 2 6 8 10 14 16 20
Solution:
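Correlation coefficient
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y},$$
where
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \quad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \quad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2},$$
$$\bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i.$$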
X     Y     X^2    Y^2    XY
1     2     1      4      2
3     6     9      36     18
4     8     16     64     32
5     10    25     100    50
7     14    49     196    98
8     16    64     256    128
10    20    100    400    200
Totals: $\sum x_i = 38$, $\sum y_i = 76$, $\sum x_i^2 = 264$, $\sum y_i^2 = 1056$, $\sum x_i y_i = 528$
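$$\bar{X} = \frac{1}{n}\sum x_i = \frac{38}{7} = 5.4286$$
$$\bar{Y} = \frac{1}{n}\sum y_i = \frac{76}{7} = 10.8571$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{7}(264) - (5.4286)^2} = \sqrt{37.7143 - 29.4697} = \sqrt{8.2446}$$
$$\sigma_x = 2.8713$$
$$\sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{7}(1056) - (10.8571)^2} = \sqrt{150.8571 - 117.8766} = \sqrt{32.9805}$$
$$\sigma_y = 5.7429$$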
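$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{7}(528) - (5.4286 \times 10.8571) = 16.4897$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{16.4897}{2.8713 \times 5.7429} = 1$$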
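The working table for this example can be reproduced with a short Python sketch. The variable names are illustrative assumptions; the data are the X and Y values of Ex.1, and the formulas are the divide-by-n forms used in the solution.

```python
import math

# Data from Ex.1.
xs = [1, 3, 4, 5, 7, 8, 10]
ys = [2, 6, 8, 10, 14, 16, 20]
n = len(xs)

# Column totals of the working table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# Means, standard deviations, covariance and r at full precision.
mean_x = sum_x / n
mean_y = sum_y / n
sigma_x = math.sqrt(sum_x2 / n - mean_x ** 2)
sigma_y = math.sqrt(sum_y2 / n - mean_y ** 2)
cov = sum_xy / n - mean_x * mean_y
r = cov / (sigma_x * sigma_y)

print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)
print(sigma_x, sigma_y, cov, r)
```

Carrying full precision gives values that agree with the hand calculation to about three decimal places; the fourth decimal can differ because the hand calculation rounds the means to four decimals before squaring. Since every Y value here is exactly twice the corresponding X value, r equals 1 up to floating-point error.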
Ex.2). Calculate the correlation coefficient between X and Y for the following.
X -10 -5 0 5 10
Y 5 9 7 11 13
Solution:
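Correlation coefficient
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y},$$
where
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \quad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \quad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2},$$
$$\bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i.$$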
X      Y     X^2    Y^2    XY
-10    5     100    25     -50
-5     9     25     81     -45
0      7     0      49     0
5      11    25     121    55
10     13    100    169    130
Totals: $\sum x_i = 0$, $\sum y_i = 45$, $\sum x_i^2 = 250$, $\sum y_i^2 = 445$, $\sum x_i y_i = 90$
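$$\bar{X} = \frac{1}{n}\sum x_i = 0$$
$$\bar{Y} = \frac{1}{n}\sum y_i = \frac{45}{5} = 9$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{5}(250) - (0)^2} = \sqrt{50}$$
$$\sigma_x = 7.0711$$
$$\sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{5}(445) - (9)^2} = \sqrt{89 - 81} = \sqrt{8}$$
$$\sigma_y = 2.8284$$
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{5}(90) - (0 \times 9) = 18$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{18}{7.0711 \times 2.8284} = 0.9$$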
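As a cross-check, r can also be computed from deviations about the means, r = Σ(x_i − X̄)(y_i − Ȳ) / √(Σ(x_i − X̄)² · Σ(y_i − Ȳ)²). This is an algebraically equivalent form (the factors of 1/n cancel), offered here as an alternative sketch rather than the method used in these notes; the function name is an illustrative assumption.

```python
import math

def correlation_from_deviations(xs, ys):
    """Correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]   # deviations of X from its mean
    dy = [y - mean_y for y in ys]   # deviations of Y from its mean
    sxy = sum(a * b for a, b in zip(dx, dy))
    sxx = sum(a * a for a in dx)
    syy = sum(b * b for b in dy)
    return sxy / math.sqrt(sxx * syy)

# Data from Ex.2.
r = correlation_from_deviations([-10, -5, 0, 5, 10], [5, 9, 7, 11, 13])
print(round(r, 4))  # prints 0.9
```

Working with deviations avoids subtracting two large, nearly equal numbers, which can matter when the values are large relative to their spread.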
Regression:
Examples of regression of Y on x:
Y: The concentration of a particular drug in the blood stream of a patient
Given x: Length of time since the drug was given to the patient
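Regression Equations:
The regression line of Y on X is
$$y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{X}).$$
Regression coefficient of Y on X = $r\,\dfrac{\sigma_y}{\sigma_x}$
The regression line of X on Y is
$$x - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{Y}).$$
Regression coefficient of X on Y = $r\,\dfrac{\sigma_x}{\sigma_y}$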
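A minimal Python sketch of the two regression equations, written in slope-intercept form. The function name `regression_lines` and the numbers in the usage lines are hypothetical; the inputs are the means, standard deviations and correlation coefficient defined above.

```python
def regression_lines(mean_x, mean_y, sigma_x, sigma_y, r):
    """Return (slope, intercept) for the line of Y on X and for the line of X on Y."""
    b_yx = r * sigma_y / sigma_x   # regression coefficient of Y on X
    b_xy = r * sigma_x / sigma_y   # regression coefficient of X on Y
    # y - mean_y = b_yx * (x - mean_x)  =>  y = b_yx * x + (mean_y - b_yx * mean_x)
    y_on_x = (b_yx, mean_y - b_yx * mean_x)
    # x - mean_x = b_xy * (y - mean_y)  =>  x = b_xy * y + (mean_x - b_xy * mean_y)
    x_on_y = (b_xy, mean_x - b_xy * mean_y)
    return y_on_x, x_on_y

# Hypothetical summary statistics, chosen only to illustrate the call.
y_on_x, x_on_y = regression_lines(mean_x=2.0, mean_y=4.0, sigma_x=1.0, sigma_y=2.0, r=0.5)
print(y_on_x)  # (1.0, 2.0), i.e. y = 1.0 x + 2.0
print(x_on_y)  # (0.25, 1.0), i.e. x = 0.25 y + 1.0
```

Both lines pass through the point of means (X̄, Ȳ); they coincide only when r = ±1.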
Ex.1). Obtain
i) Regression equation of X on Y and
ii) Regression equation of Y on X for the following data.
X 1 3 4 5 7 8 10
Y 2 6 8 10 14 16 20
Solution:
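Correlation coefficient
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y},$$
where
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y},$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \quad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2},$$
$$\bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i.$$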
X     Y     X^2    Y^2    XY
1     2     1      4      2
3     6     9      36     18
4     8     16     64     32
5     10    25     100    50
7     14    49     196    98
8     16    64     256    128
10    20    100    400    200
Totals: $\sum x_i = 38$, $\sum y_i = 76$, $\sum x_i^2 = 264$, $\sum y_i^2 = 1056$, $\sum x_i y_i = 528$
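$$\bar{X} = \frac{1}{n}\sum x_i = \frac{38}{7} = 5.4286$$
$$\bar{Y} = \frac{1}{n}\sum y_i = \frac{76}{7} = 10.8571$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{7}(264) - (5.4286)^2} = \sqrt{37.7143 - 29.4697} = \sqrt{8.2446}$$
$$\sigma_x = 2.8713$$
$$\sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{7}(1056) - (10.8571)^2} = \sqrt{150.8571 - 117.8766} = \sqrt{32.9805}$$
$$\sigma_y = 5.7429$$
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{7}(528) - (5.4286 \times 10.8571) = 16.4897$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{16.4897}{2.8713 \times 5.7429} = 1$$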
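The regression line of Y on X is
$$y - 10.8571 = 1 \times \frac{5.7429}{2.8713}\,(x - 5.4286)$$
$$y - 10.8571 = 2.0001\,(x - 5.4286)$$
$$y = 2.0001x - 0.0006$$
The regression line of X on Y is
$$x - 5.4286 = 1 \times \frac{2.8713}{5.7429}\,(y - 10.8571)$$
$$x - 5.4286 = 0.5000\,(y - 10.8571)$$
that is, x ≈ 0.5y. Since every Y value in the data is exactly twice the corresponding X value, the exact lines are y = 2x and x = 0.5y; the small departures above come from rounding to four decimal places.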
2) Find the most likely price in Mumbai corresponding to the price of Rs. 70 at Kolkata from
the following.
                      Kolkata    Mumbai
Average price         65         67
Standard deviation    2.5        3.5
Correlation coefficient between the prices of commodities in the two cities is 0.8.
Solution:
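Let the prices in Kolkata and Mumbai be denoted by X and Y respectively. Then we are given:
$$\bar{X} = 65, \quad \bar{Y} = 67, \quad \sigma_x = 2.5, \quad \sigma_y = 3.5, \quad r = 0.8.$$
Using the regression line of Y on X, when X = 70,
$$y = 67 + 0.8 \times \frac{3.5}{2.5}\,(70 - 65) = 72.6$$
Hence the most likely price in Mumbai corresponding to the price of Rs. 70 at Kolkata is Rs. 72.6.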
3) Obtain the equations of the two lines of regression for the following data. Also estimate the height of the father (X) when the son's height is Y = 70.
Solution:
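The regression line of Y on X is
$$y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{X}).$$
The regression line of X on Y is
$$x - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{Y}).$$
Correlation coefficient
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y},$$
where
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \quad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \quad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2},$$
$$\bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i.$$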
X     Y     X^2     Y^2     XY
65    67    4225    4489    4355
66    68    4356    4624    4488
67    65    4489    4225    4355
67    68    4489    4624    4556
68    72    4624    5184    4896
69    72    4761    5184    4968
70    69    4900    4761    4830
72    71    5184    5041    5112
Totals: $\sum x_i = 544$, $\sum y_i = 552$, $\sum x_i^2 = 37028$, $\sum y_i^2 = 38132$, $\sum x_i y_i = 37560$
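$$\bar{X} = \frac{1}{n}\sum x_i = \frac{544}{8} = 68 \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i = \frac{552}{8} = 69$$
$$\sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{8}(37028) - (68)^2} = 2.1213$$
$$\sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{8}(38132) - (69)^2} = 2.3452$$
$$\operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{8}(37560) - (68 \times 69) = 3$$
$$r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{3}{2.1213 \times 2.3452} = 0.603$$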
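i) The regression line of Y on X is
$$y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{X}).$$
$$y - 69 = 0.603 \times \frac{2.3452}{2.1213}\,(x - 68) \implies y = 69 + 0.6666\,(x - 68)$$
$$y = 0.6666x + 23.6712$$
ii) The regression line of X on Y is
$$x - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{Y}).$$
$$x - 68 = 0.603 \times \frac{2.1213}{2.3452}\,(y - 69) \implies x = 68 + 0.5454\,(y - 69)$$
$$x = 0.5454y + 30.3674$$
The estimated height of the father (X) when the son's height is Y = 70 is
$$x = 0.5454(70) + 30.3674 = 68.5454$$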
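The whole calculation for this example can be sketched in Python from the raw data in the table. The variable names are illustrative assumptions. The sketch uses the identities r·σ_y/σ_x = Cov(X, Y)/σ_x² and r·σ_x/σ_y = Cov(X, Y)/σ_y², which follow from the definition of r, so the slopes are obtained without rounding r first.

```python
import math

# Data from the table: X = height of father, Y = height of son.
fathers = [65, 66, 67, 67, 68, 69, 70, 72]
sons = [67, 68, 65, 68, 72, 72, 69, 71]
n = len(fathers)

mean_x = sum(fathers) / n
mean_y = sum(sons) / n
var_x = sum(x * x for x in fathers) / n - mean_x ** 2   # sigma_x squared
var_y = sum(y * y for y in sons) / n - mean_y ** 2      # sigma_y squared
cov = sum(x * y for x, y in zip(fathers, sons)) / n - mean_x * mean_y

r = cov / math.sqrt(var_x * var_y)
b_yx = cov / var_x   # regression coefficient of Y on X
b_xy = cov / var_y   # regression coefficient of X on Y

# Line of X on Y evaluated at Y = 70.
x_at_70 = mean_x + b_xy * (70 - mean_y)

print(round(r, 4), round(b_yx, 4), round(b_xy, 4), round(x_at_70, 4))
```

At full precision the slopes are 2/3 and 6/11, so the last printed digits can differ by one unit from the hand values 0.6666 and 0.5454, which were obtained from the rounded r = 0.603.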
4) The data about the sales and advertisement expenditure of a firm are given below:
                       Sales (crores of Rs)    Advertisement expenditure (crores of Rs)
Means                  40                      6
Standard deviations    10                      1.5
Correlation coefficient r = 0.9
i) Estimate the likely sales for a proposed advertisement expenditure of Rs. 10 crores.
ii) What should be the advertisement expenditure if the firm proposes a sales target of 60 crores of rupees?
iii) Also find the regression coefficients.
Solution: Let the variable x denote the sales (in crores of Rs) and the variable y denote the advertisement expenditure (in crores of Rs). Then we have
$$\bar{X} = 40, \quad \bar{Y} = 6, \quad \sigma_x = 10, \quad \sigma_y = 1.5 \quad\text{and}\quad r = 0.9$$
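i) To estimate the likely sales X for a proposed advertisement expenditure Y of Rs. 10 crores:
The regression line of X on Y is
$$x - \bar{X} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{Y}).$$
$$x - 40 = 0.9 \times \frac{10}{1.5}\,(y - 6) \implies x = 6\,(y - 6) + 40$$
$$x = 6y + 4$$
Hence the likely sales for an advertisement expenditure of Rs. 10 crores is
$$x = 6(10) + 4 = 64 \text{ crores of rupees.}$$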
ii) To find the advertisement expenditure if the firm proposes a sales target of 60 crores of rupees:
The regression line of Y on X is
$$y - \bar{Y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{X}).$$
$$y - 6 = 0.9 \times \frac{1.5}{10}\,(x - 40) \implies y = 0.135\,(x - 40) + 6$$
$$y = 0.135x + 0.6$$
Hence the advertisement expenditure if the firm proposes a sales target of 60 crores of rupees is
$$y = 0.135(60) + 0.6 = 8.7 \text{ crores of rupees.}$$
iii) Regression coefficient of Y on X = $r\,\dfrac{\sigma_y}{\sigma_x} = 0.135$
Regression coefficient of X on Y = $r\,\dfrac{\sigma_x}{\sigma_y} = 6$
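The three parts of this problem can be checked with a short Python sketch that works only from the summary statistics given in the question. The variable names are illustrative assumptions. The final check uses the standard identity that the product of the two regression coefficients equals r², which is not stated in these notes but follows directly from the two coefficient formulas.

```python
import math

# Given summary statistics: x = sales, y = advertisement expenditure (crores of Rs).
mean_x, sigma_x = 40.0, 10.0
mean_y, sigma_y = 6.0, 1.5
r = 0.9

b_yx = r * sigma_y / sigma_x   # regression coefficient of Y on X (about 0.135)
b_xy = r * sigma_x / sigma_y   # regression coefficient of X on Y (about 6)

# i) Likely sales when advertisement expenditure is 10 (line of X on Y).
sales_at_10 = mean_x + b_xy * (10 - mean_y)

# ii) Advertisement expenditure for a sales target of 60 (line of Y on X).
adv_at_60 = mean_y + b_yx * (60 - mean_x)

# Consistency check: b_yx * b_xy should equal r squared.
consistent = math.isclose(b_yx * b_xy, r ** 2)

print(sales_at_10, adv_at_60, b_yx, b_xy, consistent)
```

Note that each estimate uses the regression line whose dependent variable is the quantity being estimated: X on Y for sales, Y on X for advertisement expenditure.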