We introduce a way of analyzing the rate of change in a given direction.
For functions of several variables, partial derivatives measure the rate of change when changing only one of the inputs. We can think of partial derivatives geometrically if we consider the surface \(z=F(x,y)\). Let’s imagine that our surface is a hill and consider \(F_y(a,b)\). This tells us we should hold \(x\) constant and see how \(z= F(x,y)\) changes as \(y\) changes. In essence, we imagine a path “parallel” to the \(y\)-axis and note how \(z=F(x,y)\) changes:
We can now interpret \(F_y(a,b)\) as either:
- the slope of the hill if we walk along it in a direction parallel to the \(y\)-axis.
- the instantaneous rate of change of \(F(x,y)\) at \((a,b)\) as we approach \((a,b)\) along the line \(x=a\).
We have a similar interpretation of \(F_x(a,b)\). However, there is no reason that we must approach \((a,b)\) along a line that is parallel to one of the coordinate axes. What if we want to approach along any line? Consider, for example, this line:
Indeed, once we are at a point on the surface above \(\vec {a}=\vector {a,b}\), there are actually many different directions that we can travel along the hill. Let’s consider a line \(\vecl \) that passes through \(\vec {a}\) as our path in the domain. Our rate of change will be given by
\[ \frac {\text {rise}}{\text {run}}, \]
where the “run” is the distance traveled along the line \(\vecl \) and the “rise” is the corresponding change in the \(z\)-values of the function. Since we are ultimately concerned about a curve on the surface, a good first step is to parameterize the line in the domain, then use the function to find a parametric description of the curve on the surface above \(\vecl \).
To make computing the run as simple as possible, we pick a unit vector \(\uvec {u}\) in the direction in which \(\vecl \) is drawn. We’ll see how to do this in the next example, but we can always start at \(\vec {a}\) and draw a unit vector that extends from \(\vec {a}\) along \(\vecl \).
To find a parameterization of \(\vecl \), note that \(\uvec {u}\) is parallel to \(\vecl \) and \(\vec {a}\) is a point on the line, so letting \(h\) denote the parameter, a description of \(\vecl \) is given by
\[ \vecl (h) = \vec {a}+h\uvec {u}. \]
For the sake of example, let \(h>0\) (a similar argument can be given if \(h<0\)). One convenient consequence of using a unit vector in the direction of \(\vecl \) is that the “run,” which is the distance between \(\vec {a}\) and \(\vec {a}+h\uvec {u}\), is simply
\[ |(\vec {a}+h\uvec {u})-\vec {a}| = |h\uvec {u}| = |h||\uvec {u}| = h, \]
since \(h>0\) and \(\uvec {u}\) is a unit vector.
The “rise” is the corresponding change in \(z\)-values, namely
\[ F(\vec {a}+h\uvec {u})-F(\vec {a}). \]
These are shown in the image below.
To find the instantaneous rate of change, we take the limit of the rise over the run as \(h\) goes to \(0\) (since \(F\) is differentiable, it can be shown that this limit must exist). We call the result the directional derivative of \(F\) at \(\vec {a}\) in the direction \(\uvec {u}\) and will henceforth denote it by \(D_{\uvec {u}}(F(\vec {a}))\). Let’s give a formal definition.
The directional derivative of \(F\) at \(\vec {a}\) in the direction of the unit vector \(\uvec {u}\) is
\[ D_{\uvec {u}}(F(\vec {a})) = \lim _{h \to 0} \frac {F(\vec {a}+h\uvec {u})-F(\vec {a})}{h}. \]
In essence, \(D_{\uvec {u}}(F(\vec {a}))\) is the instantaneous rate of change of \(F\) at \(\vec {a}\) as we approach \(\vec {a}\) in the direction of \(\uvec {u}\).
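To make the limit concrete, here is a minimal Python sketch that evaluates the rise-over-run quotient for shrinking values of \(h\). It borrows the function \(F(x,y) = x^2-3xy+4y^2+7\), the point \((2,1)\), and the direction \(\vector {1,2}\) from the worked example in this section; the helper name `difference_quotient` is made up for illustration.

```python
import math

def F(x, y):
    # Function from the worked example in this section.
    return x**2 - 3*x*y + 4*y**2 + 7

def difference_quotient(F, a, u, h):
    """Rise over run of F from a to a + h*u, where u is assumed to be a unit vector."""
    ax, ay = a
    ux, uy = u
    return (F(ax + h*ux, ay + h*uy) - F(ax, ay)) / h

a = (2.0, 1.0)
norm = math.hypot(1.0, 2.0)
u = (1.0 / norm, 2.0 / norm)   # unit vector in the direction of <1, 2>

# The quotients should settle toward a single value as h shrinks.
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, difference_quotient(F, a, u, h))
```

The printed quotients should approach \(\sqrt {5} \approx 2.236\), which is the value the gradient formula produces for this example.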
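There’s a quick way to compute this limit by using the gradient vector. We first give the result and save the derivation of the formula until the end of the section: if \(F\) is differentiable at \(\vec {a}\) and \(\uvec {u}\) is a unit vector, then
\[ D_{\uvec {u}}(F(\vec {a})) = \grad {F}(\vec {a}) \dotp \uvec {u}. \]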
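As an example, consider finding \(D_{\uvec {u}}(F(2,1))\) for \(F(x,y) = x^2-3xy+4y^2+7\), where \(\uvec {u}\) points along the line \(2x-y=3\) in the direction of increasing \(x\).
- Finding \(\uvec {u}\)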
Vectors have both a magnitude and a direction. We’ve seen that it is much more challenging to find a vector in the appropriate direction than it is to scale a vector appropriately, so let’s start by finding a vector \(\vec {u}\) parallel to the line \(2x-y=3\).
There are many ways we can do this and one such way is to parameterize the line. Since we can explicitly find \(y=\answer [given]{2x-3}\), we set \(x(t)=t\) and \(y(t) = \answer [given]{2t-3}\). A parameterization is thus
\[ \vec {p}(t) = \vector {x(t),y(t)} = \vector {\answer [given]{t},\answer [given]{2t-3}} \]
Now, we need a \(t\)-value for which \(x(t) = 2\) and \(y(t)=1\). By inspecting the first component of the parameterization, we find \(t=\answer [given]{2}\). Thus, a vector parallel to the line will be \(\vec {p}'(2)\). We note
\[ \vec {p}'(t) = \vector {1,2} \]
So \(\vec {p}'(2) = \vector {\answer [given]{1},\answer [given]{2}}\). This is the vector \(\vec {u}\) we will use to be parallel to the line. We now note that \(\vec {u}\) is not a unit vector.
We find the unit vector the usual way by computing
\[ \uvec {u} = \frac {\vec {u}}{|\vec {u}|} = \frac {\vector {1,2}}{\answer [given]{\sqrt {5}}}. \]
- Finding \(\grad {F}(2,1)\).
Since \(F(x,y) = x^2-3xy+4y^2+7\), we find \(F_x(x,y) = \answer [given]{2x-3y}\) and \(F_y(x,y) = -3x+8y\), so
\[ \grad {F}(x,y) = \vector {\answer [given]{2x-3y},\answer [given]{-3x+8y}} \]
Thus, \(\grad {F}(2,1) = \vector {\answer [given]{1},\answer [given]{2}}\).
Now, using the formula \(D_{\uvec {u}}(F(2,1)) = \grad {F}(2,1) \dotp \uvec {u}\) gives \(D_{\uvec {u}}(F(2,1)) = \vector {1,2} \dotp \vector {\frac {1}{\sqrt {5}},\frac {2}{\sqrt {5}} } = \answer [given]{\frac {5}{\sqrt {5}}} = \sqrt {5}\).
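The two steps of this example can be mirrored in a short Python sketch. The partial derivatives are entered by hand from the computation above; the helper names `grad_F`, `unit`, and `directional_derivative` are illustrative choices, not part of any library.

```python
import math

def grad_F(x, y):
    # Partial derivatives of F(x, y) = x^2 - 3xy + 4y^2 + 7, entered by hand.
    return (2*x - 3*y, -3*x + 8*y)

def unit(v):
    # Scale a nonzero 2D vector to length 1.
    n = math.hypot(v[0], v[1])
    return (v[0] / n, v[1] / n)

def directional_derivative(grad, u):
    # Dot product of the gradient with a unit direction vector.
    return grad[0]*u[0] + grad[1]*u[1]

g = grad_F(2.0, 1.0)          # gradient at the point (2, 1)
u_hat = unit((1.0, 2.0))      # unit vector along the line 2x - y = 3
D = directional_derivative(g, u_hat)

print(g)                      # gradient components
print(D, math.sqrt(5))        # the two values should agree
```

Normalizing inside `unit` rather than hard-coding \(1/\sqrt {5}\) keeps the sketch reusable for other direction vectors.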
1 Directions of initial change
Consider a surface defined by \(z = F(x,y)\). Given a particular point \((a,b,F(a,b))\) on the surface where \(\grad {F}(a,b) \neq \vec {0}\), there are a few questions we can ask.
- In which initial direction should we travel from \((a,b,F(a,b))\) if we want to head up the surface the fastest?
- In which initial direction should we travel from \((a,b,F(a,b))\) if we want to head down the surface the fastest?
- In which direction should we travel if we do not want our current elevation to change?
The following theorem answers these questions. We state the theorem for functions \(F:\R ^2\to \R \), but it actually holds for functions from \(\R ^n\) to \(\R \).
- The initial direction of greatest increase is in the direction of \(\grad {F}(a,b)\).
- The initial direction of greatest decrease is in the direction of \(-\grad {F}(a,b)\).
- The initial directions of no change are orthogonal to \(\grad {F}(a,b)\).
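To see why, recall that \(D_{\uvec {u}}(F(a,b)) = \grad {F}(a,b) \dotp \uvec {u} = |\grad {F}(a,b)|\cos (\theta )\), where \(\theta \) is the angle between \(\grad {F}(a,b)\) and the unit vector \(\uvec {u}\). This is largest when \(\cos (\theta ) = 1\), that is, when \(\uvec {u}\) points in the direction of \(\grad {F}(a,b)\). As another upshot, we know exactly what the maximum rate of increase is at \((a,b)\) too. It’s \(D_{\uvec {u}}(F(a,b)) = |\grad {F}(a,b)|\).
We can use similar logic to determine that the maximum rate of decrease, or the “most negative” rate of change, occurs in the direction \(\uvec {u}\) opposite the direction of the gradient vector, and that this most negative rate of change is \(D_{\uvec {u}}(F(a,b)) = -|\grad {F}(a,b)|\).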
To tackle the direction of no change, we need to find the directions \(\uvec {u}\) for which \(D_{\uvec {u}}(F(a,b)) =0\). Once again, the formula \(D_{\uvec {u}}(F(a,b)) = \grad {F}(a,b) \dotp \uvec {u}\) comes to the rescue. Setting \(D_{\uvec {u}}(F(a,b)) =0\) gives \(\grad {F}(a,b) \dotp \uvec {u} = 0\), which means that the directions of no change are orthogonal to \(\grad {F}(a,b)\).
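These three claims can be checked by brute force. The sketch below assumes a hypothetical gradient vector \(\vector {3,4}\) (chosen only because its magnitude is \(5\)) and scans unit vectors \(\vector {\cos \theta ,\sin \theta }\) in steps of a tenth of a degree.

```python
import math

g = (3.0, 4.0)  # hypothetical gradient vector, chosen so that |g| = 5

def D(theta):
    """Directional derivative g . u for the unit vector u = <cos(theta), sin(theta)>."""
    return g[0]*math.cos(theta) + g[1]*math.sin(theta)

angles = [math.radians(k / 10) for k in range(3600)]  # 0.1-degree steps
theta_max = max(angles, key=D)   # direction of greatest increase found by the scan
theta_min = min(angles, key=D)   # direction of greatest decrease found by the scan

grad_angle = math.atan2(g[1], g[0])  # angle of the gradient itself

print(math.degrees(theta_max), math.degrees(grad_angle))  # should nearly agree
print(D(theta_max), math.hypot(*g))                       # max rate vs |g|
print(D(theta_min))                                       # close to -|g|
print(D(grad_angle + math.pi / 2))                        # orthogonal direction: close to 0
```

The scan only lands within a tenth of a degree of the true answer, so the agreement is approximate by design.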
We first compute the gradient. Since \(F(x,y) = \sin (xy)+y^2\),
- \(F_x(x,y) = \answer [given]{y\cos (xy)}\) so \(F_x(0,1) = 1\).
- \(F_y(x,y) = x\cos (xy)+2y\), so \(F_y(0,1) = \answer [given]{2}\).
Thus, \(\grad {F}(0,1) = \vector {\answer [given]{1},\answer [given]{2}}\). We can now use this to find the requested directions and rates.
- The initial direction of greatest increase is in the same direction as the gradient. Since \(|\grad {F}(0,1)| = \sqrt {(1)^2+(2)^2} = \sqrt {5}\), a unit vector \(\uvec {u}\) in the direction of greatest increase is \(\uvec {u} = \vector {\answer [given]{\frac {1}{\sqrt {5}}},\answer [given]{\frac {2}{\sqrt {5}}}}\) and the maximum rate of change is \(|\grad {F}(0,1)| = \sqrt {5}\).
- The initial direction of greatest decrease is in the opposite direction to the gradient. A unit vector \(\uvec {u}\) in the direction of greatest decrease is \(\uvec {u} = \vector {-\frac {1}{\sqrt {5}},-\frac {2}{\sqrt {5}}}\) and the greatest rate of decrease is \(-|\grad {F}(0,1)| = -\sqrt {5}\).
- There are two unit vectors in the initial direction of no change. To see why, note that \(\grad {F}(0,1) = \vector {1,2}\), so both the vectors \(\vec {w}_1 =\vector {-2,1}\) and \(\vec {w}_2 =\vector {2,-1}\) are orthogonal to \(\grad {F}(0,1)\). (Notice that for two-dimensional vectors, we can always find a vector orthogonal to a given one by inspection; just flip the components and negate one of them.)
The magnitude of both \(\vec {w}_1\) and \(\vec {w}_2\) is \(\answer [given] {\sqrt {5}}\), so the two unit vectors in the initial direction of no change are \(\uvec {w}_1 = \vector {-\frac {2}{\sqrt {5}},\frac {1}{\sqrt {5}}}\) and \(\uvec {w}_2 = \vector {\frac {2}{\sqrt {5}},-\frac {1}{\sqrt {5}}}\).
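As a sanity check on this example, the following Python sketch estimates the rate of change of \(F(x,y) = \sin (xy)+y^2\) at \((0,1)\) in each of the three kinds of directions using a symmetric difference quotient. The helper `central_rate` and the step size `1e-6` are illustrative assumptions, not part of the text.

```python
import math

def F(x, y):
    return math.sin(x*y) + y**2

def grad_F(x, y):
    # Partial derivatives entered by hand from the example.
    return (y*math.cos(x*y), x*math.cos(x*y) + 2*y)

def central_rate(F, a, u, h=1e-6):
    """Symmetric difference estimate of the directional derivative of F at a along unit vector u."""
    forward = F(a[0] + h*u[0], a[1] + h*u[1])
    backward = F(a[0] - h*u[0], a[1] - h*u[1])
    return (forward - backward) / (2*h)

a = (0.0, 1.0)
g = grad_F(*a)
mag = math.hypot(*g)

up = (g[0] / mag, g[1] / mag)   # direction of greatest increase
down = (-up[0], -up[1])         # direction of greatest decrease
level = (-up[1], up[0])         # flip components and negate one: no change

for name, u in (("increase", up), ("decrease", down), ("no change", level)):
    print(name, u, central_rate(F, a, u))
```

The three estimates should be close to \(\sqrt {5}\), \(-\sqrt {5}\), and \(0\) respectively; the other direction of no change is simply the negative of `level`.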
2 The formula for the directional derivative
We conclude this section by giving the derivation of the formula
\[ D_{\uvec {u}}(F(\vec {a})) = \grad {F}(\vec {a}) \dotp \uvec {u}. \]
Since our function \(F\) is differentiable, we know that when we “zoom in” on the graph of the surface \(z=F(x,y)\), the surface looks like its tangent plane, \(z=L(\vec {x})\). This is made precise by the definition of differentiability:
\[ \lim _{\vec {x} \to \vec {a}} \frac {F(\vec {x})-L(\vec {x})}{|\vec {x}-\vec {a}|} = 0. \]
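We have seen that we can use the gradient to write the formula for the tangent plane as \(L(\vec {x}) = F(\vec {a}) + \grad {F}(\vec {a}) \dotp (\vec {x}-\vec {a})\). Substituting this into the limit in the definition of differentiability gives
\[ \lim _{\vec {x} \to \vec {a}} \frac {F(\vec {x})-F(\vec {a})-\grad {F}(\vec {a}) \dotp (\vec {x}-\vec {a})}{|\vec {x}-\vec {a}|} = 0. \]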
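Now, recall that the directional derivative \(D_{\uvec {u}}(F(\vec {a}))\) requires that we approach \(\vec {a}\) along the line \(\vecl (t) = \vec {a}+t\uvec {u}\). Since the above limit exists, the result holds along any path along which \(\vec {x} \to \vec {a}\), so it certainly holds along this path. To let \(\vec {x}\) approach \(\vec {a}\) along this path, we set \(\vec {x} = \vec {a}+t\uvec {u}\), and the limit \(\vec {x} \to \vec {a}\) becomes the limit \(t \to 0\). To simplify, we consider only \(t \to 0^+\); the argument for the other one-sided limit is very similar. Now, we update our limit along the chosen path:
\[ \lim _{t \to 0^+} \frac {F(\vec {a}+t\uvec {u})-F(\vec {a})-\grad {F}(\vec {a}) \dotp (t\uvec {u})}{|t\uvec {u}|} = \lim _{t \to 0^+} \frac {F(\vec {a}+t\uvec {u})-F(\vec {a})-t\,\grad {F}(\vec {a}) \dotp \uvec {u}}{t|\uvec {u}|} = \lim _{t \to 0^+} \left ( \frac {F(\vec {a}+t\uvec {u})-F(\vec {a})}{t} - \grad {F}(\vec {a}) \dotp \uvec {u} \right ), \]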
where in the last step, we have used the fact that \(|\uvec {u}|=1\) since \(\uvec {u}\) is a unit vector.
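Recalling that this limit is \(0\) in the first place gives
\[ \lim _{t \to 0^+} \left ( \frac {F(\vec {a}+t\uvec {u})-F(\vec {a})}{t} - \grad {F}(\vec {a}) \dotp \uvec {u} \right ) = 0, \]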
and since by definition, \(D_{\uvec {u}}(F(\vec {a})) = \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a})}{t}\), we have
\[ D_{\uvec {u}}(F(\vec {a})) - \grad {F}(\vec {a}) \dotp \uvec {u} = 0. \]
We may thus conclude that \(D_{\uvec {u}} (F(\vec {a})) = \grad {F}(\vec {a}) \dotp \uvec {u}\).
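The key step in this derivation is that the gap between \(F\) and its tangent plane \(L\), divided by the distance \(|t\uvec {u}|\) from \(\vec {a}\), tends to \(0\) along the line \(\vec {a}+t\uvec {u}\). The Python sketch below illustrates this numerically for \(F(x,y) = x^2-3xy+4y^2+7\) at \((2,1)\) with \(\grad {F}(2,1) = \vector {1,2}\), values taken from the worked example earlier in the section; the function name `remainder_ratio` is made up for illustration.

```python
import math

def F(x, y):
    return x**2 - 3*x*y + 4*y**2 + 7

a = (2.0, 1.0)
g = (1.0, 2.0)                               # grad F(2, 1) from the worked example
u = (1 / math.sqrt(5), 2 / math.sqrt(5))     # unit direction vector

def remainder_ratio(t):
    """(F(a + t u) - L(a + t u)) / |t u|, where L is the tangent plane at a and |u| = 1."""
    x = (a[0] + t*u[0], a[1] + t*u[1])
    L = F(*a) + g[0]*(x[0] - a[0]) + g[1]*(x[1] - a[1])
    return (F(*x) - L) / abs(t)

# The ratio should shrink toward 0 as t shrinks.
for t in (1e-1, 1e-2, 1e-3):
    print(t, remainder_ratio(t))
```

A numerical check like this only illustrates the limit for one function and one direction; it is not a substitute for the argument above.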