Curve Fitting
We know that two points determine a line. Do you know how many points determine
a quadratic function of the form $f(x) = ax^2 + bx + c$? Given any number of points in the plane, is it
always possible to find a polynomial function whose graph contains every one of the
given points? To address these questions we will start with an alternative way of
finding an equation of a line.
Consider two points and . We will find a function whose graph is a line that passes
through these points. We know that for some constants and . Because the graph of
passes through and , we must have the following:
To solve for and , we need to solve the following matrix equation:
Solving the equation, we find that and . This gives us:
The GeoGebra interactive below shows two points and , together with the matrix
equation that produces function coefficients for the function whose graph passes
through and . Drag the points around the plane to see how the matrix equation
changes.
From a purely formal standpoint, we observe that the matrix equation has the
form:
where each row corresponds to one point.
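The row-per-point structure can be sketched in Octave. This is a minimal illustration, not the exploration's own data: the sample points (1, 3) and (3, 7) and the variable names `x`, `y`, `A` and `coeffs` are assumptions chosen for the example. Each row of the coefficient matrix has the form `[x_i 1]`.

```octave
% Hypothetical sample points (x1, y1) = (1, 3) and (x2, y2) = (3, 7)
x = [1; 3];
y = [3; 7];
% Each row of A corresponds to one point: [x_i 1]
A = [x ones(2, 1)];
% Solve A * [slope; intercept] = y
coeffs = A \ y
```

The first entry of `coeffs` is the slope and the second is the intercept of the line through the two sample points.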
Now we are ready to move on to quadratic and higher-degree polynomial functions.
The linear function in Exploration exp:curveFitLine had two unknown coefficients that we needed to find
in order to determine the function. Two points gave us a system of two equations and
two unknowns.
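A quadratic polynomial function, whose graph is a parabola, is given by:
$$f(x) = ax^2 + bx + c$$
Three unknown coefficients will require three points to determine them.
We will find a quadratic function of the form $f(x) = ax^2 + bx + c$ whose graph passes through $(-2, 2)$, $(0, -1)$ and $(1, 5)$.
To do this, we need to find coefficients $a$, $b$ and $c$ such that
$$\begin{bmatrix} 4 & -2 & 1 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 2 \\ -1 \\ 5 \end{bmatrix}$$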
The following GeoGebra interactive shows points , , and , together with the matrix
equation, and its solution.
Drag the points around the plane to observe changes in the coefficient matrix. Think
geometrically to find locations of , and such that
; .
; .
Observe the structure of the matrix equation.
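In general (provided that no one point lies directly above another), given $n$ points $(x_1, y_1), \dots, (x_n, y_n)$, we
can always find a polynomial function of degree at most $n-1$ whose graph contains every one of the
given points. To find such a polynomial function, given by $f(x) = a_{n-1}x^{n-1} + \dots + a_1 x + a_0$, we need to solve a system
of $n$ equations with $n$ unknowns, which translates into the following matrix
equation.
$$\begin{bmatrix} x_1^{n-1} & \cdots & x_1 & 1 \\ x_2^{n-1} & \cdots & x_2 & 1 \\ \vdots & & \vdots & \vdots \\ x_n^{n-1} & \cdots & x_n & 1 \end{bmatrix} \begin{bmatrix} a_{n-1} \\ \vdots \\ a_1 \\ a_0 \end{bmatrix} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix}$$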
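The general system can be set up in a few lines of Octave. The sketch below assumes four hypothetical points, chosen only for illustration, and uses Octave's built-in `vander` function, which builds a matrix whose columns are decreasing powers of the given x-values, ending with a column of ones.

```octave
% Hypothetical data: four points with distinct x-coordinates
x = [-1; 0; 1; 2];
y = [0; 1; 2; 9];
% Row i of V is [x_i^3  x_i^2  x_i  1]
V = vander(x);
% Coefficients of the interpolating polynomial, highest power first
coeffs = V \ y
% Evaluate the polynomial at the original x-values; this should reproduce y
check = polyval(coeffs, x)
```

Because `coeffs` lists the highest power first, it can be passed directly to `polyval`.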
In Practice Problems prob:systemProblems1 and prob:systemProblems2 you will show that the matrix equation in (eq:matEq) has a unique
solution if and only if no two of the given points share an $x$-coordinate.
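A quick numerical illustration of this condition (not a proof) is to compare the rank of the coefficient matrix for distinct and repeated x-coordinates. The x-values below are assumptions chosen for the example.

```octave
% Three distinct x-coordinates
x_distinct = [-2; 0; 1];
% Hypothetical case where two points share the x-coordinate -2
x_repeated = [-2; 0; -2];
% Full rank (3) means the 3-by-3 system has a unique solution
rank_distinct = rank(vander(x_distinct))
% Two identical rows make the matrix rank-deficient
rank_repeated = rank(vander(x_repeated))
```

When two points share an x-coordinate, the corresponding rows of the matrix are identical, so the rank drops below the number of unknowns.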
Using Technology
Throughout this section we have omitted the tedious process of solving matrix
equations. It is useful to practice solving smaller matrix equations by hand, but for
larger matrices, we can use technology. Below is the Octave code that can be used to
find the solution to Exploration exp:curveFitParabola. You will be able to modify this code to solve some
of the practice problems.
To use Octave, go to the Sage Math Cell Webpage, copy the code below into the cell,
select OCTAVE as the language, and press EVALUATE.
% Define the coefficient matrix
A=[4 -2 1;0 0 1;1 1 1];
% Define vector b
b=[2;-1;5];
% We can find the solution in two ways
% Method 1: ans1=A^(-1)b
ans1=inv(A)*b
% Method 2
ans2=A\b
% If A is invertible, both methods produce the same result.
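One way to check a computed solution is to substitute it back into the system. The sketch below repeats `A` and `b` from the code above so that it runs on its own; the names `coeffs`, `residual`, `x` and `fitted` are introduced here for illustration.

```octave
% Same system as above
A = [4 -2 1; 0 0 1; 1 1 1];
b = [2; -1; 5];
coeffs = A \ b
% The residual should be (numerically) the zero vector
residual = A * coeffs - b
% The x-coordinates of the points appear in the second column of A
x = A(:, 2);
% Evaluating the quadratic at those x-values should reproduce b
fitted = polyval(coeffs, x)
```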
On the Dangers of Overfitting
It is exciting to know that we can fit a function to a set of data points, but before we
get carried away fitting a 299-degree polynomial function to 300 points, let’s consider
the following situation.
In the GeoGebra interactive below, you can see that the six points form a somewhat
linear pattern. A linear model can be used to describe these points. Click on the
“Display linear model” check-box to see the trend line. (You will learn how to find
such models in Least-Squares Approximation). You can see that even though the line
does not pass through any of the given points, it fits the overall pattern of the points
and can be used to estimate the $y$-coordinates of other points whose $x$-coordinates fall
within the limits of the scatter plot.
It might be tempting to think that we can find a better model by finding a 5th-degree
polynomial function whose graph contains every one of the six points. Click on the
“Display 5th degree poly model” check-box to see the alternative model. Can this
model be successfully used to make predictions?
Try moving individual points around to see how their placement affects the line and
the curve.
Any modeling process which insists on fitting the existing data points exactly, at the
risk of failing to predict future observations, is referred to as overfitting. While
sometimes it is beneficial to have a curve that passes through specific points, more
often it is the trend, not the individual instances, that we try to capture. We will
return to this topic in Least-Squares Approximation.
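The contrast between the two models can be reproduced numerically. The sketch below uses six hypothetical data points scattered near the line y = x (not the points from the interactive) and Octave's `polyfit` and `polyval` functions. With degree 1, `polyfit` returns a least-squares trend line; with degree 5 and six points, it returns the polynomial that passes through every point.

```octave
% Hypothetical data: six points scattered near the line y = x
x = (1:6)';
y = [1.1; 1.9; 3.2; 3.8; 5.1; 5.9];
% Degree-1 least-squares trend line (does not pass through the points)
p_line = polyfit(x, y, 1);
% Degree-5 polynomial through all six points
p_quintic = polyfit(x, y, 5);
% Largest vertical miss at the data points for each model
miss_line = max(abs(polyval(p_line, x) - y))
miss_quintic = max(abs(polyval(p_quintic, x) - y))
% Predictions one step past the data, at x = 7
pred_line = polyval(p_line, 7)
pred_quintic = polyval(p_quintic, 7)
```

For this data the trend line predicts a value close to 7 at x = 7, while the quintic, despite matching all six points, predicts a negative value. This check extrapolates slightly beyond the data, where the effect is easiest to see.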
Practice Problems
In each case, find a polynomial function of an appropriate degree that passes
through the given points.
Plot the graph of in the Desmos window below.
Plot the graph of in the Desmos window below.
Two GeoGebra screenshots are shown below:
In the first screenshot, points and coincide. In the second screenshot, point is
located directly above point . In both cases, GeoGebra failed to produce a linear
function whose graph passes through and .
Based on what you know about functions and geometry, explain why the process fails
for these two examples. How do your observations correspond to what happens from
an algebraic standpoint?
Both systems are inconsistent.
The first system is inconsistent, the second has infinitely many solutions.
Both systems have infinitely many solutions.
The first system has infinitely many solutions, the second system is inconsistent.
Prove that equation (eq:matEq) has a unique solution if and only if no two given points share
an $x$-coordinate.
Show that the rows of the coefficient matrix are linearly independent if and
only if no two given points share an $x$-coordinate.
Under what circumstances is a solution not unique? Under what circumstances does
a solution not exist?