We suppose given $n$ points $(x_1, y_1), (x_2, y_2), \dots, (x_n, y_n)$ in the plane, with distinct $x$-coordinates (in practice, such sets of points can arise as data based on the measurement of some quantity, recorded as the $y$-coordinate, as a function of some parameter recorded as the $x$-coordinate). We would like to find the equation of the line that best fits these points (the precise sense in which the line is a best possible fit is explained below). If we write the equation of the line as $y = mx + b$ for indeterminates $m, b$, then what we are looking for is a least-squares solution to the system of equations

\[
\begin{aligned}
m x_1 + b &= y_1 \\
m x_2 + b &= y_2 \\
&\;\;\vdots \\
m x_n + b &= y_n
\end{aligned}
\]

Note that, in this system, the $x_i$ and $y_i$ are constants, and we are trying to solve for $m$ and $b$. For $n = 2$ there will be a solution, but in the overdetermined case $n > 2$ there almost always fails to be one. Hence the need to work in the least-squares setting.
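As a minimal sketch of how such a least-squares solution can be computed, suppose the line is written $y = mx + b$ and the data points are $(x_i, y_i)$. One standard route (not necessarily the one used in the examples of this section) is to solve the $2 \times 2$ normal equations for $m$ and $b$. The function name `fit_line` and the sample data are hypothetical, chosen only for illustration; exact rational arithmetic is used so that the answer can be checked by hand.

```python
from fractions import Fraction


def fit_line(points):
    """Least-squares line y = m*x + b through the given (x, y) points.

    Solves the 2x2 normal equations
        m*Sxx + b*Sx = Sxy
        m*Sx  + b*n  = Sy
    by Cramer's rule, using exact rational arithmetic.
    """
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    n = len(pts)
    sx = sum(x for x, _ in pts)
    sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts)
    sxy = sum(x * y for x, y in pts)
    det = n * sxx - sx * sx  # nonzero when at least two x-values differ
    if det == 0:
        raise ValueError("need at least two distinct x-coordinates")
    m = (n * sxy - sx * sy) / det
    b = (sxx * sy - sx * sxy) / det
    return m, b


# Hypothetical sample data: three points that are not collinear.
m, b = fit_line([(0, 1), (1, 3), (2, 4)])
print(m, b)  # 3/2 7/6
```

For this sample data no line passes through all three points, and the best-fitting line is $y = \tfrac{3}{2}x + \tfrac{7}{6}$. With only two points, the same function returns the line through both of them.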

The last computation in this example indicates what is being minimized when one fits data points in this way.
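For reference, the minimized quantity can be written out explicitly. Assuming the line is written $y = mx + b$ and the data points are $(x_i, y_i)$ for $1 \le i \le n$ (notation chosen here for illustration), a least-squares solution minimizes the sum of the squared vertical deviations of the points from the line:

\[
E(m, b) \;=\; \sum_{i=1}^{n} \bigl( y_i - (m x_i + b) \bigr)^2
\;=\; \left\| \mathbf{y} - A \begin{bmatrix} m \\ b \end{bmatrix} \right\|^2,
\qquad
A = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ \vdots & \vdots \\ x_n & 1 \end{bmatrix},
\quad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}.
\]

The deviations are measured vertically, along the $y$-axis, rather than perpendicularly to the line.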

The setup above provides a method for finding not just linear approximations, but higher-order ones as well. The linear algebra is essentially the same.
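The following sketch shows one way the same computation might be carried out for a polynomial of any degree $d$, under the assumption that the approximating polynomial is written $c_0 + c_1 x + \dots + c_d x^d$. It builds the normal equations $(A^T A)\mathbf{c} = A^T \mathbf{y}$, where row $i$ of $A$ is $[1, x_i, x_i^2, \dots, x_i^d]$, and solves them by Gauss-Jordan elimination. The function name `fit_polynomial` and the sample data are hypothetical, and exact rational arithmetic is used only to keep the results easy to verify.

```python
from fractions import Fraction


def fit_polynomial(points, degree):
    """Least-squares coefficients [c0, c1, ..., cd] of c0 + c1*x + ... + cd*x**d.

    Builds the normal equations (A^T A) c = A^T y, where row i of A is
    [1, x_i, x_i**2, ..., x_i**d], and solves them by Gauss-Jordan
    elimination in exact rational arithmetic.
    """
    pts = [(Fraction(x), Fraction(y)) for x, y in points]
    k = degree + 1
    # Augmented matrix [A^T A | A^T y]; entry (r, c) of A^T A is sum of x**(r+c).
    aug = [
        [sum(x ** (r + c) for x, _ in pts) for c in range(k)]
        + [sum(y * x ** r for x, y in pts)]
        for r in range(k)
    ]
    for col in range(k):
        # Find a row with a nonzero pivot; none exists if A^T A is singular.
        pivot = next((r for r in range(col, k) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("need more than `degree` distinct x-coordinates")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry becomes 1.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        # Clear the pivot column in every other row.
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[k] for row in aug]


data = [(0, 1), (1, 3), (2, 4)]  # hypothetical sample data
print(fit_polynomial(data, 1))
print(fit_polynomial(data, 2))
```

For this sample data, degree 1 gives the line $\tfrac{7}{6} + \tfrac{3}{2}x$, while degree 2 gives $1 + \tfrac{5}{2}x - \tfrac{1}{2}x^2$, which passes through all three points exactly.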

We will illustrate our final point by looking at what happens if we go one degree higher.

This set of examples, in which we compute successively higher-order approximations to a set of data points until we finally arrive at an exact fit, is part of a more general phenomenon, which we record without proof in the following theorem.