Coefficient of Correlation

Karl Pearson introduced a coefficient that measures the degree of linear relationship between two variables, known as the coefficient of correlation (or correlation coefficient). For a bivariate distribution (x, y), the coefficient of correlation is denoted by $r_{xy}$ and is defined as

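$$r_{xy} = \frac{\operatorname{Cov}(x, y)}{\sigma_x \, \sigma_y}$$

where Cov(x, y) = covariance between x and y,

$\sigma_x$ = standard deviation (S.D.) of x,

$\sigma_y$ = standard deviation (S.D.) of y.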
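The definition can be sketched in a few lines of Python. This is only an illustration: the function names `covariance`, `std_dev` and `correlation` and the sample data are assumptions made for this sketch, and the covariance and standard deviations use the divisor n (the population form).

```python
import math

def covariance(xs, ys):
    """Covariance of paired values, using the divisor n."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def std_dev(xs):
    """Standard deviation, computed as the square root of Cov(x, x)."""
    return math.sqrt(covariance(xs, xs))

def correlation(xs, ys):
    """Coefficient of correlation: Cov(x, y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))

# Made-up data, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = correlation(x, y)
print(round(r, 4))  # 0.7746
```

Because the factor 1/n appears in both the numerator and the denominator, it cancels, so using n - 1 instead would give the same value of r.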

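For the values $(x_i, y_i)$, $i = 1, 2, \ldots, n$, of a bivariate distribution,

$$r_{xy} = \frac{\frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\frac{1}{n}\sum (x_i - \bar{x})^2}\,\sqrt{\frac{1}{n}\sum (y_i - \bar{y})^2}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum (x_i - \bar{x})^2 \, \sum (y_i - \bar{y})^2}}$$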

Limitations of $r_{xy}$

 

  1. The coefficient of correlation measures only the linear relationship between two variables. When the relationship is non-linear, the coefficient can be close to zero even though the variables are strongly related, so inspecting the scatter diagram is essential.
  2. Correlation should be computed on data drawn from the same source. If data from distinct sources are pooled, the two variables may appear correlated in the combined data even though they are uncorrelated within each source.
  3. A positive or negative correlation between two variables does not necessarily mean that a causal relationship exists between them. Both may be influenced by some other variable, and once that influence is eliminated the net correlation may turn out to be nil.
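The first two limitations can be illustrated with a small Python sketch. The helper `pearson_r` and all the data are assumptions chosen for the illustration: a perfect quadratic relationship that gives zero correlation, and two sources that are each uncorrelated but look strongly correlated once pooled.

```python
import math

def pearson_r(xs, ys):
    """Coefficient of correlation from sums of deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Limitation 1: y is completely determined by x (y = x^2), yet r is zero.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]
r_nonlinear = pearson_r(xs, ys)

# Limitation 2: two hypothetical sources, each uncorrelated on its own.
source_a = [(0, 0), (0, 1), (1, 0), (1, 1)]
source_b = [(10, 10), (10, 11), (11, 10), (11, 11)]
r_a = pearson_r([p[0] for p in source_a], [p[1] for p in source_a])
r_b = pearson_r([p[0] for p in source_b], [p[1] for p in source_b])

# Pooling the two sources produces a strong apparent correlation.
pooled = source_a + source_b
r_pooled = pearson_r([p[0] for p in pooled], [p[1] for p in pooled])

print(r_nonlinear, r_a, r_b, round(r_pooled, 2))
```

In the pooled case the correlation comes entirely from the difference between the two sources, not from any relationship within either of them.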

 

Properties

  1. The coefficient of correlation is independent of change of origin and of scale (provided the scale factors have the same sign).
  2. $-1 \le r_{xy} \le 1$

 

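Proof. Let

$$u_i = \frac{x_i - \bar{x}}{\sigma_x}, \qquad v_i = \frac{y_i - \bar{y}}{\sigma_y}$$

Then,

$$\frac{1}{n}\sum u_i^2 = 1, \qquad \frac{1}{n}\sum v_i^2 = 1, \qquad \frac{1}{n}\sum u_i v_i = r_{xy}$$

Now

$$\frac{1}{n}\sum (u_i - v_i)^2 \ge 0$$

$$2(1 - r_{xy}) \ge 0$$

$$r_{xy} \le 1$$

Similarly,

$$\frac{1}{n}\sum (u_i + v_i)^2 \ge 0$$

$$2(1 + r_{xy}) \ge 0$$

$$r_{xy} \ge -1$$

Combining, we obtain $-1 \le r_{xy} \le 1$.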
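Properties 1 and 2 can be checked numerically with a short Python sketch. The data and the helper names `mean`, `pearson_r` and `standardize` are assumptions for illustration. Standardizing x and y is itself a change of origin and scale, so the coefficient should be unchanged, and the two mean squares used in the proof should equal 2(1 - r) and 2(1 + r).

```python
import math

def mean(values):
    return sum(values) / len(values)

def pearson_r(xs, ys):
    """Coefficient of correlation from sums of deviations about the means."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def standardize(values):
    """Subtract the mean and divide by the S.D. (divisor n)."""
    m = mean(values)
    sd = math.sqrt(mean([(v - m) ** 2 for v in values]))
    return [(v - m) / sd for v in values]

# Made-up data, for illustration only.
x = [2, 4, 6, 8, 10, 12]
y = [5, 3, 8, 9, 7, 14]

r_xy = pearson_r(x, y)
u, v = standardize(x), standardize(y)

# Property 1: r is unchanged by the change of origin and scale.
r_uv = pearson_r(u, v)

# Quantities used in the proof of property 2.
mean_sq_diff = mean([(a - b) ** 2 for a, b in zip(u, v)])  # equals 2(1 - r)
mean_sq_sum = mean([(a + b) ** 2 for a, b in zip(u, v)])   # equals 2(1 + r)

print(r_xy, r_uv, mean_sq_diff, mean_sq_sum)
```

Since both mean squares are averages of squared numbers, neither can be negative, which is exactly what forces r to lie between -1 and 1.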

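  3. Two independent variables are uncorrelated (i.e. $r_{xy} = 0$), but the converse is not always true: uncorrelated variables need not be independent.
  4. If $r_{xy}$ is the correlation coefficient in a sample of n pairs of observations, then the standard error of $r_{xy}$ is defined by

$$\text{S.E.}(r_{xy}) = \frac{1 - r_{xy}^2}{\sqrt{n}}$$

The probable error of the correlation coefficient is defined by

$$\text{P.E.}(r_{xy}) = 0.6745 \times \frac{1 - r_{xy}^2}{\sqrt{n}}$$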
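As a rough Python sketch, assuming the usual textbook definitions S.E. = (1 - r²)/√n and P.E. = 0.6745 × S.E., both quantities can be computed as follows. The function names and the values r = 0.8, n = 64 are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
import math

def standard_error(r, n):
    """Assumed textbook form: S.E.(r) = (1 - r^2) / sqrt(n)."""
    return (1 - r * r) / math.sqrt(n)

def probable_error(r, n):
    """Assumed textbook form: P.E.(r) = 0.6745 * S.E.(r)."""
    return 0.6745 * standard_error(r, n)

# Hypothetical sample: r = 0.8 from n = 64 pairs.
r, n = 0.8, 64
se = standard_error(r, n)  # (1 - 0.64) / 8 = 0.045
pe = probable_error(r, n)  # 0.6745 * 0.045, about 0.0304
print(round(se, 4), round(pe, 4))
```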

By Step Deviation Method

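Let $d_x = x - A$ and $d_y = y - B$ be the deviations of x and y from the assumed values A and B. Then

$$r_{xy} = \frac{n\sum d_x d_y - \sum d_x \sum d_y}{\sqrt{n\sum d_x^2 - \left(\sum d_x\right)^2}\,\sqrt{n\sum d_y^2 - \left(\sum d_y\right)^2}}$$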

where n = number of observations.
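The deviation method can be sketched in Python as below, assuming the standard assumed-value formula r = [nΣdxdy - ΣdxΣdy] / √{[nΣdx² - (Σdx)²][nΣdy² - (Σdy)²]}. The function name `pearson_r_from_deviations` and the data are illustrative assumptions. The sketch also checks that the result does not depend on the choice of A and B.

```python
import math

def pearson_r_from_deviations(xs, ys, a, b):
    """Correlation from deviations dx = x - a, dy = y - b about assumed values."""
    n = len(xs)
    dx = [x - a for x in xs]
    dy = [y - b for y in ys]
    sum_dx, sum_dy = sum(dx), sum(dy)
    sum_dxdy = sum(p * q for p, q in zip(dx, dy))
    sum_dx2 = sum(p * p for p in dx)
    sum_dy2 = sum(q * q for q in dy)
    numerator = n * sum_dxdy - sum_dx * sum_dy
    denominator = math.sqrt(
        (n * sum_dx2 - sum_dx ** 2) * (n * sum_dy2 - sum_dy ** 2)
    )
    return numerator / denominator

# Made-up data, for illustration only.
x = [10, 12, 15, 18, 20]
y = [40, 38, 43, 45, 49]

r_assumed = pearson_r_from_deviations(x, y, a=15, b=43)  # convenient A, B
r_direct = pearson_r_from_deviations(x, y, a=0, b=0)     # raw sums
print(round(r_assumed, 4), round(r_direct, 4))
```

Choosing A and B near the middle of the data keeps the deviations small, which is the practical reason for using the method in hand calculation.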

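For grouped data,

$$r_{xy} = \frac{N\sum f d_x d_y - \sum f d_x \sum f d_y}{\sqrt{N\sum f d_x^2 - \left(\sum f d_x\right)^2}\,\sqrt{N\sum f d_y^2 - \left(\sum f d_y\right)^2}}$$

where f denotes the frequency and $N = \sum f$ is the total frequency.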

 

Example 1. Calculate Karl Pearson's coefficient of correlation for the following data:

Solution. Here $\bar{x} = 31$, $\bar{y} = 25$

Karl Pearson's coefficient of correlation is given by

Example 2. Compute Karl Pearson's coefficient of correlation for the following series relating to the price and supply of a commodity:

Solution. Computation table :

 

Example 3. For the calculation of the correlation coefficient between the variables X and Y, the following information was obtained: n = 40, $\sum X = 120$, $\sum X^2 = 600$, $\sum Y = 90$, $\sum Y^2 = 250$, $\sum XY = 356$. It was, however, later discovered at the time of checking that two pairs of observations had been copied down as

 

X     Y
9     11
12    8

while the correct values were

X     Y
8     12
11    9

Obtain the correct value of the correlation coefficient between X and Y.

Solution. We have

Corrected $\sum X = 120 - 9 - 12 + 8 + 11 = 118$

Corrected $\sum X^2 = 600 - 81 - 144 + 64 + 121 = 560$

Corrected $\sum Y = 90 - 11 - 8 + 12 + 9 = 92$

Corrected $\sum Y^2 = 250 - 121 - 64 + 144 + 81 = 290$

Corrected $\sum XY = 356 - 99 - 96 + 96 + 99 = 356$

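The correct value of the correlation coefficient is given by

$$r_{XY} = \frac{n\sum XY - \sum X \sum Y}{\sqrt{n\sum X^2 - \left(\sum X\right)^2}\,\sqrt{n\sum Y^2 - \left(\sum Y\right)^2}} = \frac{40(356) - (118)(92)}{\sqrt{40(560) - 118^2}\,\sqrt{40(290) - 92^2}} = \frac{3384}{\sqrt{8476}\,\sqrt{3136}} \approx 0.656$$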
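The correction in Example 3 can be reproduced with a short Python sketch. The variable and function names are assumptions made for this illustration, and the raw-sums formula r = [nΣXY - ΣXΣY] / √{[nΣX² - (ΣX)²][nΣY² - (ΣY)²]} is used for the final step.

```python
import math

n = 40
# Sums as originally recorded (with the two miscopied pairs included).
recorded = {"x": 120, "x2": 600, "y": 90, "y2": 250, "xy": 356}
wrong_pairs = [(9, 11), (12, 8)]
right_pairs = [(8, 12), (11, 9)]

def apply_pairs(totals, pairs, sign):
    """Add (sign=+1) or remove (sign=-1) the contribution of each pair."""
    for x, y in pairs:
        totals["x"] += sign * x
        totals["x2"] += sign * x * x
        totals["y"] += sign * y
        totals["y2"] += sign * y * y
        totals["xy"] += sign * x * y

corrected = dict(recorded)
apply_pairs(corrected, wrong_pairs, -1)  # take out the miscopied pairs
apply_pairs(corrected, right_pairs, +1)  # put in the correct pairs

numerator = n * corrected["xy"] - corrected["x"] * corrected["y"]
denominator = math.sqrt(
    (n * corrected["x2"] - corrected["x"] ** 2)
    * (n * corrected["y2"] - corrected["y"] ** 2)
)
r = numerator / denominator
print(corrected, round(r, 3))
```

Note that the sum of products is unchanged here because each wrong pair has the same product as one of the correct pairs (9 × 11 = 11 × 9 and 12 × 8 = 8 × 12).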