Probability Distributions and Curve Fitting

The document discusses different probability distributions and curve fitting techniques. It covers Poisson distribution, normal distribution, correlation, coefficient of correlation, lines of regression, and curve fitting using the method of least squares for straight lines, parabolas, exponential curves, and growth curves.

Uploaded by

Desktop Desktop

UNIT-V (Probability Distributions and Curve Fitting)

S.No  Content

1     Poisson distribution, MGF and cumulants of the Poisson distribution

2     Normal distribution, characteristics of the normal distribution, MGF and CGF of the normal distribution, areas under the normal curve

3     Correlation, coefficient of correlation and lines of regression

4     Curve fitting by the method of least squares: fitting of straight lines, second-degree parabolas, exponential and growth curves

Correlation: The degree of relationship between two random variables X and Y.

Positive or Direct Correlation: The correlation between two variables such that an increase (or decrease) in one variable results in an increase (or decrease) in the other; that is, both variables deviate in the same direction.
Ex. Income and expenditure.

Negative or Diverse Correlation: The correlation between two variables such that an increase (or decrease) in one variable results in a decrease (or increase) in the other; that is, the variables deviate in opposite directions.
Ex. Price and demand of a commodity.

Uncorrelated Variables: Two variables such that the variation in one does not affect that of the other.

Correlation Coefficient: The numerical quantity that measures the correlation between two random variables, determined from the bivariate frequency distribution of the observed data.

The correlation coefficient always lies in the closed interval [-1, 1].

Formula for the Correlation Coefficient

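Consider the paired observations $(x_i, y_i)$, $i = 1, 2, \ldots, n$. Then the sample correlation coefficient is given by

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}, $$

where

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, $$

$$ \text{standard deviation of } x = \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, $$

$$ \text{standard deviation of } y = \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2}, $$

$$ \text{mean of } x = \bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \text{mean of } y = \bar{Y} = \frac{1}{n}\sum y_i. $$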
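The formula above can be turned into a few lines of Python. This is a minimal sketch and not part of the original notes: the function name `correlation_coefficient` is an illustrative choice, and the code assumes that both lists have the same length and that neither variable is constant (otherwise a standard deviation is zero and the division fails).

```python
import math


def correlation_coefficient(xs, ys):
    """Sample correlation coefficient r = Cov(X, Y) / (sigma_x * sigma_y).

    Uses the 1/n (population) form of the covariance and standard
    deviations, as in the formula stated in the notes.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(X, Y) = (1/n) * sum(x_i * y_i) - mean_x * mean_y
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y
    # sigma = sqrt((1/n) * sum(x_i^2) - mean^2)
    sd_x = math.sqrt(sum(x * x for x in xs) / n - mean_x ** 2)
    sd_y = math.sqrt(sum(y * y for y in ys) / n - mean_y ** 2)
    return cov / (sd_x * sd_y)
```

For data lying exactly on a rising straight line the function returns 1 up to floating-point rounding, and for a falling line it returns -1.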

Correlation Coefficient and Type of Correlation

r > 0              Positive (or Direct) Correlation
r < 0              Negative (or Diverse) Correlation
r = 1 or r = -1    Perfect Correlation
r = 1              Perfect Positive Correlation
r = -1             Perfect Negative Correlation
r = 0              Uncorrelated Variables

Ex.1). Calculate the correlation coefficient between X and Y for the following.

X 1 3 4 5 7 8 10
Y 2 6 8 10 14 16 20

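Solution:

The correlation coefficient is

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}, $$

where

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \qquad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2}, $$

$$ \bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i. $$
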

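X      Y      X^2    Y^2    XY
1      2      1      4      2
3      6      9      36     18
4      8      16     64     32
5      10     25     100    50
7      14     49     196    98
8      16     64     256    128
10     20     100    400    200

$$ \sum x_i = 38, \quad \sum y_i = 76, \quad \sum x_i^2 = 264, \quad \sum y_i^2 = 1056, \quad \sum x_i y_i = 528 $$

$$ \bar{X} = \frac{1}{n}\sum x_i = \frac{38}{7} = 5.4286 $$

$$ \bar{Y} = \frac{1}{n}\sum y_i = \frac{76}{7} = 10.8571 $$

$$ \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{7}(264) - (5.4286)^2} = \sqrt{37.7143 - 29.4697} = \sqrt{8.2446} = 2.8713 $$

$$ \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{7}(1056) - (10.8571)^2} = \sqrt{150.8571 - 117.8766} = \sqrt{32.9805} = 5.7429 $$

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{7}(528) - (5.4286 \times 10.8571) = 16.4897 $$

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{16.4897}{2.8713 \times 5.7429} = 1 $$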
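The working of this example can be checked numerically. The sketch below is an illustration rather than part of the original solution; the variable names are arbitrary. It rebuilds the column totals of the table and then applies the same formulas. Because it does not round the intermediate values, the last digit of the standard deviations and covariance may differ slightly from the four-decimal figures in the hand calculation.

```python
import math

xs = [1, 3, 4, 5, 7, 8, 10]
ys = [2, 6, 8, 10, 14, 16, 20]
n = len(xs)

# Column totals of the working table.
sum_x = sum(xs)
sum_y = sum(ys)
sum_x2 = sum(x * x for x in xs)
sum_y2 = sum(y * y for y in ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))
print(sum_x, sum_y, sum_x2, sum_y2, sum_xy)  # 38 76 264 1056 528

# Means, standard deviations (1/n form), covariance and r.
mean_x = sum_x / n
mean_y = sum_y / n
sd_x = math.sqrt(sum_x2 / n - mean_x ** 2)
sd_y = math.sqrt(sum_y2 / n - mean_y ** 2)
cov = sum_xy / n - mean_x * mean_y
r = cov / (sd_x * sd_y)
print(round(r, 4))
```

Every Y value in this data set is exactly twice the corresponding X value, which is why the correlation is perfect and positive.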

Ex.2). Calculate the correlation coefficient between X and Y for the following.

X -10 -5 0 5 10
Y 5 9 7 11 13

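Solution:

The correlation coefficient is

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}, $$

where

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \qquad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2}, $$

$$ \bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i. $$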

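X      Y      X^2    Y^2    XY
-10    5      100    25     -50
-5     9      25     81     -45
0      7      0      49     0
5      11     25     121    55
10     13     100    169    130

$$ \sum x_i = 0, \quad \sum y_i = 45, \quad \sum x_i^2 = 250, \quad \sum y_i^2 = 445, \quad \sum x_i y_i = 90 $$

$$ \bar{X} = \frac{1}{n}\sum x_i = 0 $$

$$ \bar{Y} = \frac{1}{n}\sum y_i = \frac{45}{5} = 9 $$

$$ \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{5}(250) - (0)^2} = \sqrt{50} = 7.0711 $$

$$ \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{5}(445) - (9)^2} = \sqrt{89 - 81} = \sqrt{8} = 2.8284 $$

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{5}(90) - (0) = 18 $$

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{18}{7.0711 \times 2.8284} = 0.9 $$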
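As a cross-check, the same result can be obtained from deviations about the means. The sketch below is not the method used in the solution above: it swaps in the algebraically equivalent form Cov(X, Y) = (1/n) Σ (x_i − X̄)(y_i − Ȳ), with the standard deviations computed from squared deviations. The variable names are illustrative.

```python
import math

xs = [-10, -5, 0, 5, 10]
ys = [5, 9, 7, 11, 13]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Deviations of each observation from its mean.
dx = [x - mean_x for x in xs]
dy = [y - mean_y for y in ys]

# Deviation form of the covariance and standard deviations (1/n form).
cov = sum(a * b for a, b in zip(dx, dy)) / n
sd_x = math.sqrt(sum(a * a for a in dx) / n)
sd_y = math.sqrt(sum(b * b for b in dy) / n)

r = cov / (sd_x * sd_y)
print(round(cov, 4), round(r, 4))
```

Both forms give a covariance of 18 and a correlation coefficient of 0.9 for this data, up to floating-point rounding.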

Regression:

Consider bivariate data $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, of variables X and Y.

• If X is the independent variable, we can estimate the average value of Y for a given value x of X. A functional relation of the form
  mean(Y) = f(x), for each value x of X,
  is known as a regression of Y on X.
• If Y is the independent variable, we can estimate the average value of X for a given value y of Y. A functional relation of the form
  mean(X) = g(y), for each value y of Y,
  is known as a regression of X on Y.

Examples of regression of Y on x:

Y                                                                        Given x
The blood pressure of a person                                           Age of the person
The concentration of a particular drug in the bloodstream of a patient   Length of time since the drug was given to the patient
Weight of a person                                                       The number of days he or she has been on a diet
Potential number of sales of a product                                   Price of the product
Monthly expenditure of a family on entertainment                         The family income

Regression Equations:

The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

Regression coefficient of Y on X $= r\,\dfrac{\sigma_y}{\sigma_x}$.

The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

Regression coefficient of X on Y $= r\,\dfrac{\sigma_x}{\sigma_y}$.
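The two regression lines can be computed directly from paired data. The following is a hedged sketch, not taken from the original notes: the function name `regression_lines` and its return format are assumptions made for illustration. It uses the fact that, since r = Cov(X, Y)/(σ_x σ_y), the coefficient r σ_y/σ_x equals Cov(X, Y)/σ_x² and r σ_x/σ_y equals Cov(X, Y)/σ_y², so no square roots are needed.

```python
def regression_lines(xs, ys):
    """Return ((b_yx, a_yx), (b_xy, a_xy)) for paired data.

    The line of Y on X is y = b_yx * x + a_yx and the line of X on Y is
    x = b_xy * y + a_xy. Variances and covariance use the 1/n form.
    Assumes equal-length inputs and that neither variable is constant.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum(x * x for x in xs) / n - mean_x ** 2
    var_y = sum(y * y for y in ys) / n - mean_y ** 2
    cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y

    b_yx = cov / var_x  # equals r * sigma_y / sigma_x
    b_xy = cov / var_y  # equals r * sigma_x / sigma_y

    # Each line passes through the point of means (mean_x, mean_y).
    a_yx = mean_y - b_yx * mean_x
    a_xy = mean_x - b_xy * mean_y
    return (b_yx, a_yx), (b_xy, a_xy)
```

For example, data generated from y = 2x + 1 yields the line y = 2x + 1 for Y on X and x = 0.5y − 0.5 for X on Y, up to floating-point rounding.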

Ex.1). Obtain
i) Regression equation of X on Y and
ii) Regression equation of Y on X for the following data.

X 1 3 4 5 7 8 10
Y 2 6 8 10 14 16 20

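Solution:

The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

The correlation coefficient is

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}, $$

where

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \qquad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2}, $$

$$ \bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i. $$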

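X      Y      X^2    Y^2    XY
1      2      1      4      2
3      6      9      36     18
4      8      16     64     32
5      10     25     100    50
7      14     49     196    98
8      16     64     256    128
10     20     100    400    200

$$ \sum x_i = 38, \quad \sum y_i = 76, \quad \sum x_i^2 = 264, \quad \sum y_i^2 = 1056, \quad \sum x_i y_i = 528 $$

$$ \bar{X} = \frac{1}{n}\sum x_i = \frac{38}{7} = 5.4286 $$

$$ \bar{Y} = \frac{1}{n}\sum y_i = \frac{76}{7} = 10.8571 $$

$$ \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{7}(264) - (5.4286)^2} = \sqrt{37.7143 - 29.4697} = \sqrt{8.2446} = 2.8713 $$

$$ \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{7}(1056) - (10.8571)^2} = \sqrt{150.8571 - 117.8766} = \sqrt{32.9805} = 5.7429 $$

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{7}(528) - (5.4286 \times 10.8571) = 16.4897 $$

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{16.4897}{2.8713 \times 5.7429} = 1 $$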

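i) The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

$$ y - 10.8571 = 1\left(\frac{5.7429}{2.8713}\right)(x - 5.4286) $$

$$ y - 10.8571 = 2.0001\,(x - 5.4286) $$

$$ y = 2.0001x - 0.0006 $$

ii) The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

$$ x - 5.4286 = 1\left(\frac{2.8713}{5.7429}\right)(y - 10.8571) $$

$$ x - 5.4286 = 0.5\,(y - 10.8571) $$

$$ x = 0.5y + 0.0001 $$

Since the data satisfy y = 2x exactly, the small departures from slope 2 and intercept 0 arise only from rounding the intermediate values to four decimal places.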
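The small constants in these two lines can be examined by repeating the calculation in exact rational arithmetic. The sketch below is an illustration, not part of the original solution; it uses Python's standard `fractions.Fraction` type and the identity r σ_y/σ_x = Cov(X, Y)/σ_x² (and likewise for X on Y), so that no square roots or rounding are involved.

```python
from fractions import Fraction

xs = [1, 3, 4, 5, 7, 8, 10]
ys = [2, 6, 8, 10, 14, 16, 20]
n = len(xs)

# Exact means, variances and covariance (1/n form).
mean_x = Fraction(sum(xs), n)
mean_y = Fraction(sum(ys), n)
var_x = Fraction(sum(x * x for x in xs), n) - mean_x ** 2
var_y = Fraction(sum(y * y for y in ys), n) - mean_y ** 2
cov = Fraction(sum(x * y for x, y in zip(xs, ys)), n) - mean_x * mean_y

# Regression coefficients and intercepts of the two lines.
b_yx = cov / var_x
a_yx = mean_y - b_yx * mean_x
b_xy = cov / var_y
a_xy = mean_x - b_xy * mean_y

print(b_yx, a_yx)  # 2 0
print(b_xy, a_xy)  # 1/2 0
```

In exact arithmetic the lines are y = 2x and x = y/2, so the constants −0.0006 and 0.0001 in the hand calculation are rounding effects.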

Ex.2). Find the most likely price in Mumbai corresponding to a price of Rs. 70 at Kolkata from the following data.

                      Kolkata   Mumbai
Average price         65        67
Standard deviation    2.5       3.5

The correlation coefficient between the prices of commodities in the two cities is 0.8.

Solution:
Let the prices in Kolkata and Mumbai be denoted by X and Y respectively. Then we are given:

$$ \bar{x} = 65, \quad \bar{y} = 67, \quad \sigma_x = 2.5, \quad \sigma_y = 3.5 \quad\text{and}\quad r = 0.8. $$

Now we have to calculate Y when X = 70.

The regression equation of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}) $$

$$ y - 67 = 0.8\left(\frac{3.5}{2.5}\right)(x - 65) \;\Rightarrow\; y = 67 + 0.8\left(\frac{3.5}{2.5}\right)(x - 65) $$

When X = 70:

$$ y = 67 + 0.8\left(\frac{3.5}{2.5}\right)(70 - 65) = 72.6 $$

Hence the most likely price in Mumbai corresponding to the price of Rs. 70 at Kolkata is Rs. 72.6.
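When only the means, standard deviations and correlation coefficient are known, as in the Kolkata and Mumbai example above, the prediction is a one-line formula. The sketch below is illustrative; the helper name `predict_y` and its parameter names are assumptions, not anything defined in the notes.

```python
def predict_y(x, mean_x, mean_y, sd_x, sd_y, r):
    """Estimate Y at a given x from the regression line of Y on X:
    y = mean_y + r * (sd_y / sd_x) * (x - mean_x).
    """
    return mean_y + r * (sd_y / sd_x) * (x - mean_x)


# Kolkata is X, Mumbai is Y, using the summary values given in the problem.
price_mumbai = predict_y(70, mean_x=65, mean_y=67, sd_x=2.5, sd_y=3.5, r=0.8)
print(round(price_mumbai, 2))  # 72.6
```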

Ex.3). Obtain the equations of the two lines of regression for the following data. Also estimate the height of the father (X) when the height of the son is Y = 70.

Father's height (X)   65   66   67   67   68   69   70   72
Son's height (Y)      67   68   65   68   72   72   69   71

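Solution:

The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

The correlation coefficient is

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y}, $$

where

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y}, \qquad \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2}, \qquad \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2}, $$

$$ \bar{X} = \frac{1}{n}\sum x_i \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i. $$
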
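X      Y      X^2     Y^2     XY
65     67     4225    4489    4355
66     68     4356    4624    4488
67     65     4489    4225    4355
67     68     4489    4624    4556
68     72     4624    5184    4896
69     72     4761    5184    4968
70     69     4900    4761    4830
72     71     5184    5041    5112

$$ \sum x_i = 544, \quad \sum y_i = 552, \quad \sum x_i^2 = 37028, \quad \sum y_i^2 = 38132, \quad \sum x_i y_i = 37560 $$

$$ \bar{X} = \frac{1}{n}\sum x_i = \frac{544}{8} = 68 \quad\text{and}\quad \bar{Y} = \frac{1}{n}\sum y_i = \frac{552}{8} = 69 $$

$$ \sigma_x = \sqrt{\frac{1}{n}\sum x_i^2 - \bar{X}^2} = \sqrt{\frac{1}{8}(37028) - (68)^2} = 2.1213 $$

$$ \sigma_y = \sqrt{\frac{1}{n}\sum y_i^2 - \bar{Y}^2} = \sqrt{\frac{1}{8}(38132) - (69)^2} = 2.3452 $$

$$ \operatorname{Cov}(X, Y) = \frac{1}{n}\sum x_i y_i - \bar{X}\,\bar{Y} = \frac{1}{8}(37560) - (68 \times 69) = 3 $$

$$ r = \frac{\operatorname{Cov}(X, Y)}{\sigma_x \sigma_y} = \frac{3}{2.1213 \times 2.3452} = 0.603 $$
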
i) The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

$$ y - 69 = 0.603\left(\frac{2.3452}{2.1213}\right)(x - 68) \;\Rightarrow\; y = 69 + 0.6666\,(x - 68) $$

$$ y = 0.6666x + 23.6712 $$

ii) The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

$$ x - 68 = 0.603\left(\frac{2.1213}{2.3452}\right)(y - 69) \;\Rightarrow\; x = 68 + 0.5454\,(y - 69) $$

$$ x = 0.5454y + 30.3674 $$

The estimated height of the father (X) when the height of the son is Y = 70 is

$$ x = 0.5454(70) + 30.3674 = 68.5454 $$
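The figures in this example can be reproduced from the raw heights. The sketch below is an illustration with arbitrary variable names, not part of the original solution. It computes the regression coefficients as Cov(X, Y)/σ_x² and Cov(X, Y)/σ_y², which equal r σ_y/σ_x and r σ_x/σ_y, and it keeps full precision, so the final digits can differ slightly from the hand calculation, which rounds intermediate values to four decimals.

```python
xs = [65, 66, 67, 67, 68, 69, 70, 72]  # fathers' heights (X)
ys = [67, 68, 65, 68, 72, 72, 69, 71]  # sons' heights (Y)
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
var_x = sum(x * x for x in xs) / n - mean_x ** 2
var_y = sum(y * y for y in ys) / n - mean_y ** 2
cov = sum(x * y for x, y in zip(xs, ys)) / n - mean_x * mean_y

b_yx = cov / var_x  # regression coefficient of Y on X
b_xy = cov / var_y  # regression coefficient of X on Y

# Estimate the father's height from the line of X on Y at y = 70.
father_estimate = mean_x + b_xy * (70 - mean_y)

print(round(b_yx, 4), round(b_xy, 4), round(father_estimate, 4))
```

Here the variances are 4.5 and 5.5 and the covariance is 3, so the exact coefficients are 2/3 and 6/11 and the estimate is 68 + 6/11, in agreement with the values above to about three decimal places.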

Ex.4). The data about the sales and advertisement expenditure of a firm are given below:

                      Sales (crores of Rs)   Advertisement expenditure (crores of Rs)
Mean                  40                     6
Standard deviation    10                     1.5

Correlation coefficient r = 0.9

i) Estimate the likely sales for a proposed advertisement expenditure of Rs. 10 crores.
ii) What should the advertisement expenditure be if the firm proposes a sales target of 60 crores of rupees?
iii) Also find the regression coefficients.

Solution: Let the variable x denote the sales (in crores of Rs) and the variable y denote the advertisement expenditure (in crores of Rs). Then we have

$$ \bar{x} = 40, \quad \bar{y} = 6, \quad \sigma_x = 10, \quad \sigma_y = 1.5 \quad\text{and}\quad r = 0.9. $$

i) To estimate the likely sales X for a proposed advertisement expenditure Y of Rs. 10 crores:

The regression line of X on Y is

$$ x - \bar{x} = r\,\frac{\sigma_x}{\sigma_y}\,(y - \bar{y}). $$

$$ x - 40 = 0.9\left(\frac{10}{1.5}\right)(y - 6) \;\Rightarrow\; x = 40 + 6\,(y - 6) \;\Rightarrow\; x = 6y + 4 $$

Hence the estimated likely sales for a proposed advertisement expenditure of Rs. 10 crores is

$$ x = 6 \times 10 + 4 = 64 \text{ crores of rupees.} $$

ii) To find the advertisement expenditure if the firm proposes a sales target of 60 crores of rupees:

The regression line of Y on X is

$$ y - \bar{y} = r\,\frac{\sigma_y}{\sigma_x}\,(x - \bar{x}). $$

$$ y - 6 = 0.9\left(\frac{1.5}{10}\right)(x - 40) \;\Rightarrow\; y = 0.135\,(x - 40) + 6 \;\Rightarrow\; y = 0.135x + 0.6 $$

Hence the advertisement expenditure for a sales target of 60 crores of rupees is

$$ y = 0.135(60) + 0.6 = 8.7 \text{ crores of rupees.} $$

iii) The regression coefficients are:

Regression coefficient of Y on X $= r\,\dfrac{\sigma_y}{\sigma_x} = 0.135$

Regression coefficient of X on Y $= r\,\dfrac{\sigma_x}{\sigma_y} = 6$

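The three parts of the sales and advertisement example can be checked with a short calculation from the summary values alone. This is a hedged sketch with illustrative variable names, not part of the original solution. The last line adds a consistency check that follows from the two coefficient formulas: their product is (r σ_x/σ_y)(r σ_y/σ_x) = r².

```python
# Summary values from the problem: x is sales, y is advertisement expenditure.
mean_sales, mean_adv = 40.0, 6.0
sd_sales, sd_adv = 10.0, 1.5
r = 0.9

# Regression coefficients.
b_sales_on_adv = r * sd_sales / sd_adv  # X on Y
b_adv_on_sales = r * sd_adv / sd_sales  # Y on X

# i) Likely sales for an advertisement expenditure of 10 crores.
sales_estimate = mean_sales + b_sales_on_adv * (10 - mean_adv)

# ii) Advertisement expenditure for a sales target of 60 crores.
adv_estimate = mean_adv + b_adv_on_sales * (60 - mean_sales)

print(round(sales_estimate, 3), round(adv_estimate, 3))
print(round(b_sales_on_adv, 3), round(b_adv_on_sales, 3))

# Consistency check: the product of the two coefficients equals r squared.
product = b_sales_on_adv * b_adv_on_sales
print(round(product, 4), round(r ** 2, 4))
```

The values agree with the worked solution: 64 and 8.7 crores of rupees, with regression coefficients 6 and 0.135, whose product 0.81 equals 0.9 squared.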