# Linear Regression

Method of Least Squares

## Introduction

Suppose you conducted an experiment which resulted in a set of measured or otherwise observed values. Your physical model made you expect to find a certain type of relationship between the measured variables. However, due to errors or inaccuracies in the measurements or in your model, the observed values do not perfectly fit a relationship of the expected kind.

*Regression analysis*, or in the case of an expected linear relation *linear regression* (fig. 1),
provides a method to determine the relation that best fits the observed data.
An often used form of regression is the *method of least squares*.

The goal of the method of least squares in the case of linear regression is to find the
parameters `a` and `b` of the best fit line
$y=ax+b$.

The deviation, "error", or *residual* $r_i$ (fig. 1)
is the difference between an observed value and the value predicted by the best fit relationship.
In the case of linear regression we want to find a best fit line
$y=ax+b$, so:

$$r_i = y_i - \left(a x_i + b\right) \tag{1}$$

*Variance* is a useful concept to quantify how much a set of values fluctuates about its mean.
Variance is defined as:

$$\frac{1}{n}\sum_{i=1}^{n}\left(x_i-\bar{x}\right)^2$$

where
$\bar{x}$
represents the arithmetic mean of the set and `n` the number of elements in the set.
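As a minimal sketch, the definition above can be computed directly; the function name and sample values are illustrative, not part of the derivation:

```python
def variance(xs):
    """Population variance: mean squared deviation from the arithmetic mean."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / n

# Illustrative data set with mean 5.0:
print(variance([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))  # → 4.0
```

Note that this divides by $n$ (as in the definition above), not by $n-1$ as the sample variance does.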

If we consider the set of residuals:

$$\left(y_1-\left(a x_1+b\right),\,y_2-\left(a x_2+b\right),\,\dots,\,y_n-\left(a x_n+b\right)\right)$$

then at the best fit the mean of this set is zero (this follows from the condition $\sum r_i=0$ derived below), and therefore the variance of the residuals is:

$$\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\left(a x_i+b\right)\right)^2$$

The best fit line is the line for which this variance is minimal; since the factor $\frac{1}{n}$ does not affect where the minimum occurs, this is the case exactly when

$$E\left(a,b\right)=\sum_{i=1}^{n}\left(y_i-\left(a x_i+b\right)\right)^2=\sum_{i=1}^{n}r_i^2$$

is minimal.
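As an illustrative sketch, $E(a,b)$ can be evaluated directly for a candidate line; the function name and the sample data are assumptions chosen for illustration:

```python
def sse(a, b, xs, ys):
    """Sum of squared residuals E(a, b) = sum of (y_i - (a*x_i + b))^2."""
    return sum((y - (a * x + b)) ** 2 for x, y in zip(xs, ys))

# Noisy observations scattered around the line y = 2x + 1:
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

print(sse(2.0, 1.0, xs, ys))  # ≈ 0.1, small: the line fits well
print(sse(1.0, 0.0, xs, ys))  # much larger: a deliberately worse line
```

Minimising this quantity over all $(a, b)$ is exactly the least-squares problem solved below.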

Finding the minimum requires that the *gradient*
is zero, and hence that both partial derivatives with respect to parameters
`a` and `b` are zero:

$$\frac{\partial E}{\partial a}=0$$ and $$\frac{\partial E}{\partial b}=0$$

Applying the chain rule to each term of the sum, and using $\frac{\partial r_i}{\partial a}=-x_i$ and $\frac{\partial r_i}{\partial b}=-1$:

$$\frac{\partial E}{\partial a}=\sum_{i=1}^{n}\frac{\partial E}{\partial r_i}\frac{\partial r_i}{\partial a}=\sum_{i=1}^{n}2r_i\left(-x_i\right)=-2\sum_{i=1}^{n}r_i x_i=0\iff\sum_{i=1}^{n}r_i x_i=0$$

and

$$\frac{\partial E}{\partial b}=\sum_{i=1}^{n}\frac{\partial E}{\partial r_i}\frac{\partial r_i}{\partial b}=\sum_{i=1}^{n}2r_i\left(-1\right)=-2\sum_{i=1}^{n}r_i=0\iff\sum_{i=1}^{n}r_i=0$$

Substitute [1] in both equations:

$$\sum_{i=1}^{n}\left(y_i-a x_i-b\right)x_i=\sum_{i=1}^{n}\left(x_i y_i-a x_i^2-b x_i\right)=0$$ and $$\sum_{i=1}^{n}\left(y_i-a x_i-b\right)=0$$

Which results in the so-called *normal equations* (in shortened notation):

$$\sum xy-a\sum x^2-b\sum x=0 \tag{2}$$

and

$$\sum y-a\sum x-nb=0 \tag{3}$$

Solving these normal equations for the parameters `a` and `b` gives us
the equation for the best fit line:

From [3]:

$$b=\frac{1}{n}\sum y-a\frac{1}{n}\sum x$$

With $\bar{x}=\frac{1}{n}\sum x$ and $\bar{y}=\frac{1}{n}\sum y$ being arithmetic means:

$$b=\bar{y}-a\bar{x}$$

Substitute in [2]:

$$a=\frac{\bar{y}\sum x-\sum xy}{\bar{x}\sum x-\sum x^2}$$
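Putting the two closed-form expressions together, the whole fit can be sketched in a few lines; the function name and sample data below are illustrative assumptions:

```python
def fit_line(xs, ys):
    """Least-squares line y = a*x + b via the closed-form normal-equation solution."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sum_x = sum(xs)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    # a = (ȳ·Σx − Σxy) / (x̄·Σx − Σx²), then b = ȳ − a·x̄
    a = (mean_y * sum_x - sum_xy) / (mean_x * sum_x - sum_xx)
    b = mean_y - a * mean_x
    return a, b

# Data lying exactly on y = 2x + 1, so the fit should recover it:
a, b = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a, b)  # → 2.0 1.0
```

For real work a library routine such as `numpy.polyfit(xs, ys, 1)` computes the same coefficients more robustly, but the sketch shows that the derivation above is all that is needed.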