Fitting a straight line to bivariate data is known as linear regression and is particularly useful in finding a relationship between two variables.

## Fitting a line by eye

1. Plot the points.
2. Draw a line of best fit, aiming for the same number of points on each side of the line.
3. Pick two points on your line and find the gradient (slope):
$m = \dfrac{y_2 - y_1}{x_2 - x_1}$
4. Substitute one of the points into the equation of the line to find the intercept $c$:
$y = mx + c$

### Application

In this example, there are five data points above the line and five below it. Two points on the line that could be used to find the gradient are (30, 25) and (60, 65). The gradient is therefore $m = \dfrac{65 - 25}{60 - 30} = \dfrac{40}{30} \approx 1.33$. Substituting the point (30, 25) into $y = mx + c$ gives $c = 25 - \dfrac{40}{30} \times 30 = -15$. The estimated line is therefore

$y = 1.33x - 15$
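
The arithmetic in this example can be checked with a short Python sketch. The function name `line_through` is hypothetical and introduced here only for illustration; the two points are the ones read off the line above.

```python
def line_through(p1, p2):
    """Return (m, c) for the line y = m*x + c through two points."""
    (x1, y1), (x2, y2) = p1, p2
    m = (y2 - y1) / (x2 - x1)  # gradient from the two chosen points
    c = y1 - m * x1            # substitute (x1, y1) into y = m*x + c
    return m, c

m, c = line_through((30, 25), (60, 65))
print(f"y = {m:.2f}x + ({c:.2f})")  # prints: y = 1.33x + (-15.00)
```

Choosing a different pair of points on a hand-drawn line would generally give slightly different values of `m` and `c`.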

However, a line fitted by eye is not unique and is not easily reproduced: different people will draw different lines through the same data. For this reason, a least squares regression line is preferred.

## Least squares regression line

The least squares regression line is one method of fitting a straight line to data; an alternative is the three median regression line. The least squares method is based on minimising the sum of the squared residuals.
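
To make this criterion concrete, here is one way to write it, assuming the line is written as $y = a + bx$ and the data points are $(x_i, y_i)$ for $i = 1, \dots, n$. A residual is the vertical difference between an observed $y_i$ and the value predicted by the line, and the label SSR is introduced here only for illustration:

$\text{SSR} = \sum_{i=1}^{n} \bigl(y_i - (a + bx_i)\bigr)^2$

The least squares line is the choice of $a$ and $b$ that makes this sum as small as possible.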

The equation of the least squares regression line is:

$y = a + bx$

where $b$ is the slope, given by $b = \dfrac{r s_y}{s_x}$, and $a$ is the intercept, given by $a = \bar{y} - b\bar{x}$. Here $r$ is the correlation coefficient, $s_x$ and $s_y$ are the standard deviations of $x$ and $y$, and $\bar{x}$ and $\bar{y}$ are the mean values of $x$ and $y$.
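
The formulas above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the function name `least_squares_line` is hypothetical, sample standard deviations are assumed, and the data are made-up values rather than the Test 1 and Test 2 scores.

```python
from statistics import mean, stdev

def least_squares_line(xs, ys):
    """Return (a, b) for y = a + b*x using b = r*s_y/s_x and a = y_bar - b*x_bar."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (assumption)
    # Pearson correlation coefficient r
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    b = r * s_y / s_x      # slope
    a = y_bar - b * x_bar  # intercept
    return a, b

# Made-up data for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = least_squares_line(xs, ys)
print(f"y = {a:.2f} + {b:.2f}x")  # prints: y = 2.20 + 0.60x
```

Unlike a line fitted by eye, this calculation returns the same $a$ and $b$ every time for the same data.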

Applying this method to the data on the previous page for Test 1 and Test 2 gives the equation:

$y = 7.87x + 7.7$

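Because this line minimises the sum of the squared residuals and can be reproduced exactly from the same data, it is a more reliable fit than a line drawn by eye.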