Regression Equations


In regression analysis we can predict or estimate the value of one variable from the value of the other variable after fitting an equation to the data. Hence there are two regression equations. The regression equation of Y on X is used to predict the value of Y from the value of X, whereas the regression equation of X on Y is used to predict the value of X from the value of Y. Here the independent variable is called the predictor, explanatory variable, or regressor, and the dependent variable is called the explained or regressed variable.

When these regression equations are straight lines, they are called regression lines.

(a) Regression Line of Y on X: Y = a + bX

Let (x1, y1), (x2, y2), ..., (xn, yn) be the given observations. Using the method of least squares, we can estimate the values of a and b, and the resulting equation takes the form

$y - \bar{y} = b_{yx}(x - \bar{x})$, where $b_{yx} = r\,\sigma_y / \sigma_x$ is the regression coefficient of Y on X.

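(b) Regression Line of X on Y: X = c + dY

Similarly we obtain

$x - \bar{x} = b_{xy}(y - \bar{y})$, where $b_{xy} = r\,\sigma_x / \sigma_y$ is the regression coefficient of X on Y.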
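As a minimal sketch of the least-squares fit described in (a) and (b), the following Python computes the intercepts and slopes of both lines from deviations about the means. The function name `regression_lines` and the data values are illustrative assumptions, not taken from the text.

```python
from statistics import fmean

def regression_lines(xs, ys):
    """Return (a, b, c, d) for the lines Y = a + bX and X = c + dY."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    # Sums of products and squares of deviations about the means
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    b = sxy / sxx          # slope of Y on X (b_yx)
    d = sxy / syy          # slope of X on Y (b_xy)
    a = y_bar - b * x_bar  # intercept of Y on X
    c = x_bar - d * y_bar  # intercept of X on Y
    return a, b, c, d

# Hypothetical observations, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b, c, d = regression_lines(xs, ys)
print(f"Y on X: Y = {a:.2f} + {b:.2f} X")
print(f"X on Y: X = {c:.2f} + {d:.2f} Y")
```

Note that the two fitted lines are generally different: one minimises squared errors in Y, the other squared errors in X.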

(c) Properties


  1. Both regression lines pass through the point of means $(\bar{X}, \bar{Y})$.
  2. $b_{xy} \cdot b_{yx} = r^2$, and the sign of r is the same as that of the regression coefficients.
  3. The two regression equations are different unless r = ±1, in which case the two equations are identical.

The angle θ between the regression lines is given by

$\tan\theta = \dfrac{1 - r^2}{|r|} \cdot \dfrac{\sigma_x \sigma_y}{\sigma_x^2 + \sigma_y^2}$
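The properties above can be checked numerically. The sketch below assumes the usual expression for the angle in terms of r, σx and σy (written out in the code comment) and compares it with the angle computed directly from the two slopes. The data are hypothetical.

```python
import math
from statistics import fmean, pstdev

# Hypothetical observations, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = fmean(xs), fmean(ys)
sx, sy = pstdev(xs), pstdev(ys)  # population standard deviations
cov = fmean([(x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)])
r = cov / (sx * sy)

b_yx = r * sy / sx  # regression coefficient of Y on X
b_xy = r * sx / sy  # regression coefficient of X on Y
product = b_yx * b_xy  # should equal r**2

# Assumed formula: tan(theta) = (1 - r^2)/|r| * sx*sy / (sx^2 + sy^2)
tan_theta = (1 - r**2) / abs(r) * (sx * sy) / (sx**2 + sy**2)

# Cross-check from the slopes of both lines in the (x, y) plane:
# Y on X has slope b_yx, X on Y has slope 1/b_xy
m1, m2 = b_yx, 1 / b_xy
tan_from_slopes = abs((m2 - m1) / (1 + m1 * m2))

print(f"r^2 = {r**2:.4f}, b_xy * b_yx = {product:.4f}")
print(f"angle = {math.degrees(math.atan(tan_theta)):.2f} degrees")
```

When r approaches ±1 the factor 1 - r² goes to zero, so the angle vanishes and the two lines coincide, in line with the third property.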


(d) Formulas for Regression Coefficients.           

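  1. For deviations from the means, $X = x - \bar{x}$, $Y = y - \bar{y}$:

$b_{yx} = \dfrac{\sum XY}{\sum X^2}, \qquad b_{xy} = \dfrac{\sum XY}{\sum Y^2}$

  2. Generally,

$b_{yx} = \dfrac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}, \qquad b_{xy} = \dfrac{n\sum xy - \sum x \sum y}{n\sum y^2 - (\sum y)^2}$

where n = number of observations.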



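  4. For grouped data with frequencies f,

$b_{yx} = \dfrac{N\sum fxy - \sum fx \sum fy}{N\sum fx^2 - (\sum fx)^2}, \qquad b_{xy} = \dfrac{N\sum fxy - \sum fx \sum fy}{N\sum fy^2 - (\sum fy)^2}$

where $N = \sum f$.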
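To illustrate that the deviation form and the general raw-sum form give the same coefficients, here is a small sketch. Both function names and the data are hypothetical, and the formulas coded are the standard ones for b_yx and b_xy.

```python
def coefficients_from_deviations(xs, ys):
    """b_yx, b_xy from deviations X = x - mean(x), Y = y - mean(y)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    dx = [x - x_bar for x in xs]
    dy = [y - y_bar for y in ys]
    s_xy = sum(u * v for u, v in zip(dx, dy))
    return s_xy / sum(u * u for u in dx), s_xy / sum(v * v for v in dy)

def coefficients_from_sums(xs, ys):
    """b_yx, b_xy from the raw sums of x, y, xy, x^2 and y^2."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    num = n * sum_xy - sum_x * sum_y
    return num / (n * sum_x2 - sum_x**2), num / (n * sum_y2 - sum_y**2)

# Hypothetical observations, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
dev = coefficients_from_deviations(xs, ys)
raw = coefficients_from_sums(xs, ys)
print(dev, raw)
```

The raw-sum form avoids computing the means first, which is convenient for hand calculation; the deviation form is usually better behaved numerically.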

(e) Standard Error of Estimates.

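Consider the regression equation of X on Y. The root mean square deviation of the points from the regression line of X on Y is called the standard error of estimate of X, which is given by

$S_{xy} = \sqrt{\dfrac{\sum (x - x_e)^2}{n}} = \sigma_x \sqrt{1 - r^2}$, where $x_e$ is the value of x estimated from the line.

Similarly, the standard error of estimate of Y from the regression equation of Y on X is

$S_{yx} = \sqrt{\dfrac{\sum (y - y_e)^2}{n}} = \sigma_y \sqrt{1 - r^2}$, where $y_e$ is the value of y estimated from the line.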

The standard error of estimate serves as a standard deviation of the size of the error of the predicted values of Y (from the equation of Y on X) and of X (from the equation of X on Y). The size of the standard error also helps us to assess the quality of our regression model.
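A short sketch of the standard error of estimate of Y: it computes the root mean square residual directly (divisor n) and compares it with the closed form σy·sqrt(1 - r²), which is assumed here as the standard identity. The data are hypothetical.

```python
import math
from statistics import fmean, pstdev

# Hypothetical observations, chosen only for illustration
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

x_bar, y_bar = fmean(xs), fmean(ys)
sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
b_yx = sxy / sum((x - x_bar) ** 2 for x in xs)
a = y_bar - b_yx * x_bar

# Root mean square deviation of the points from the line of Y on X
residuals = [y - (a + b_yx * x) for x, y in zip(xs, ys)]
se_direct = math.sqrt(fmean([e * e for e in residuals]))

# Closed form using population standard deviations and r
r = sxy / (len(xs) * pstdev(xs) * pstdev(ys))
se_formula = pstdev(ys) * math.sqrt(1 - r**2)

print(f"direct: {se_direct:.4f}, formula: {se_formula:.4f}")
```

The same steps with the roles of x and y exchanged give the standard error of estimate of X.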

(f) Coefficient of Determination.

This gives the percentage of the variation in the dependent variable that is accounted for, or explained by, the independent variable. It is given by

Coefficient of determination, $R^2$ = Explained variance / Total variance

Let Y be the dependent variable and X be the independent variable. If $R^2 = 0.85$, then we can account for, or explain, 85% of the variation in Y with a knowledge of X.

If $\hat{y}_i = a_0 + a_1 x_i$ (i = 1, 2, ..., n) are the fitted values for the observations $(x_i, y_i)$, i = 1, 2, ..., n, then

$R^2 = \dfrac{\sum (\hat{y}_i - \bar{y})^2}{\sum (y_i - \bar{y})^2}$

$R^2 = 1$ ⇒ all n observations lie on the fitted regression line.
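The ratio of explained to total variation can be sketched as follows. The function name `r_squared` and both data sets are hypothetical; the second data set lies exactly on a line, so its value should come out as 1.

```python
from statistics import fmean

def r_squared(xs, ys):
    """Explained variation / total variation for the line of Y on X."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a1 = sxy / sum((x - x_bar) ** 2 for x in xs)  # slope
    a0 = y_bar - a1 * x_bar                       # intercept
    fitted = [a0 + a1 * x for x in xs]
    explained = sum((yh - y_bar) ** 2 for yh in fitted)
    total = sum((y - y_bar) ** 2 for y in ys)
    return explained / total

# Hypothetical observations, chosen only for illustration
xs = [1, 2, 3, 4, 5]
scattered = [2, 4, 5, 4, 5]
on_a_line = [3, 5, 7, 9, 11]  # exactly y = 1 + 2x

print(f"{r_squared(xs, scattered):.2f}")
print(f"{r_squared(xs, on_a_line):.2f}")
```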


Example 4. From the following data obtain the two regression lines and the correlation coefficient :

Find the value of y when x = 82.

Solution. Here ∑x = 644, ∑y = 567, n = 7, so $\bar{x} = 644/7 = 92$ and $\bar{y} = 567/7 = 81$.

The regression coefficients are $b_{yx} = 0.84$ and $b_{xy} = 1.12$.

Regression equation of y on x:

$y - \bar{y} = b_{yx}(x - \bar{x})$

$y - 81 = 0.84(x - 92)$

$y = 0.84x + 3.72$

Regression equation of x on y:

$x - \bar{x} = b_{xy}(y - \bar{y})$

$x - 92 = 1.12(y - 81)$

$x = 1.12y + 1.28$

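The coefficient of correlation is

$r = \sqrt{b_{yx} \cdot b_{xy}} = \sqrt{0.84 \times 1.12} \approx 0.97$, taken as positive because both regression coefficients are positive.
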

For x = 82, the value of y is obtained from the regression equation of y on x. Hence

y = (0.84)(82) + 3.72 = 72.6.
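The arithmetic of Example 4 can be rechecked with a few lines of Python. This sketch uses only the sums and the rounded regression coefficients (0.84 and 1.12) quoted in the solution; the variable names are illustrative.

```python
import math

# Values quoted in the solution
n, sum_x, sum_y = 7, 644, 567
b_yx, b_xy = 0.84, 1.12  # rounded coefficients as used in the text

x_bar, y_bar = sum_x / n, sum_y / n

# Intercepts implied by y - y_bar = b_yx (x - x_bar) and its counterpart
a = y_bar - b_yx * x_bar  # intercept of y on x
c = x_bar - b_xy * y_bar  # intercept of x on y

# Correlation coefficient: positive root, since both coefficients are positive
r = math.sqrt(b_yx * b_xy)

# Prediction of y at x = 82 from the line of y on x
y_at_82 = a + b_yx * 82

print(f"means: ({x_bar:.0f}, {y_bar:.0f})")
print(f"y = {b_yx}x + {a:.2f},  x = {b_xy}y + {c:.2f}")
print(f"r = {r:.2f},  y(82) = {y_at_82:.1f}")
```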


Example 5. Consider the two regression lines 3X + 2Y = 26 and 6X + Y = 31. (a) Find the mean values and the correlation coefficient between X and Y. (b) If the variance of Y is 4, find the S.D. of X.

Solution. (a) The intersection of the two regression lines gives the mean values, i.e., $(\bar{X}, \bar{Y})$.

Solving the two equations, we obtain $\bar{X} = 4$ and $\bar{Y} = 7$.

Let 3X + 2Y = 26 be the regression line of X on Y and the other line that of Y on X.

$X = -\tfrac{2}{3}Y + \tfrac{26}{3}$ (X on Y) ⇒ $b_{xy} = -\tfrac{2}{3}$

$Y = -6X + 31$ (Y on X) ⇒ $b_{yx} = -6$

But then $r^2 = b_{xy} \cdot b_{yx} = 4 > 1$, which cannot be true.

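So we change our assumption: the line 3X + 2Y = 26 represents Y on X, and the other line X on Y.

$Y = -\tfrac{3}{2}X + 13$ (Y on X) ⇒ $b_{yx} = -\tfrac{3}{2}$

$X = -\tfrac{1}{6}Y + \tfrac{31}{6}$ (X on Y) ⇒ $b_{xy} = -\tfrac{1}{6}$

Then $r^2 = b_{xy} \cdot b_{yx} = \tfrac{1}{4}$, and since both regression coefficients are negative, $r = -\tfrac{1}{2}$.

(b) Given $\sigma_y^2 = 4$ ⇒ $\sigma_y = 2$. Since $b_{yx} = r\,\sigma_y / \sigma_x$, we have $-\tfrac{3}{2} = -\tfrac{1}{2} \cdot \tfrac{2}{\sigma_x}$, so $\sigma_x = \tfrac{2}{3}$.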
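The reasoning of Example 5 can be traced in code. The sketch below uses exact fractions, tries both assignments of the two lines, and keeps the one with r² ≤ 1. For part (b) it assumes the standard relation b_yx = r·σy/σx. The helper names are hypothetical.

```python
import math
from fractions import Fraction as F

# Lines written as a*X + b*Y = c
a1, b1, c1 = F(3), F(2), F(26)  # 3X + 2Y = 26
a2, b2, c2 = F(6), F(1), F(31)  # 6X +  Y = 31

# The intersection (Cramer's rule) gives the means of X and Y
det = a1 * b2 - a2 * b1
x_mean = (c1 * b2 - c2 * b1) / det
y_mean = (a1 * c2 - a2 * c1) / det

def slope_y_on_x(a, b):
    """a*X + b*Y = c read as Y = -(a/b) X + c/b."""
    return -a / b

def slope_x_on_y(a, b):
    """a*X + b*Y = c read as X = -(b/a) Y + c/a."""
    return -b / a

# First assumption: line 1 is X on Y, line 2 is Y on X
r2_first = slope_x_on_y(a1, b1) * slope_y_on_x(a2, b2)  # exceeds 1, rejected

# Second assumption: line 1 is Y on X, line 2 is X on Y
b_yx = slope_y_on_x(a1, b1)
b_xy = slope_x_on_y(a2, b2)
r2 = b_yx * b_xy
r = -math.sqrt(r2)  # negative, because both coefficients are negative

# Part (b): variance of Y is 4, and b_yx = r * sigma_y / sigma_x (assumed relation)
sigma_y = math.sqrt(4)
sigma_x = r * sigma_y / b_yx

print(f"means = ({x_mean}, {y_mean})")
print(f"first assumption r^2 = {r2_first}, second assumption r^2 = {r2}")
print(f"r = {r}, sigma_x = {sigma_x:.4f}")
```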