We introduce a way of analyzing the rate of change in a given direction.

For functions of several variables, partial derivatives measure the rate of change when changing only one of the inputs. We can think of partial derivatives geometrically if we consider the surface \(z=F(x,y)\). Let’s imagine that our surface is a hill and consider \(F_y(a,b)\). This tells us we should hold \(x\) constant and see how \(z= F(x,y)\) changes as \(y\) changes. In essence, we imagine a path “parallel” to the \(y\)-axis and note how \(z=F(x,y)\) changes:

We can now interpret \(F_y(a,b)\) as either:

  • the slope of the hill if we walk along it in a direction parallel to the \(y\)-axis.
  • the instantaneous rate of change of \(F(x,y)\) at \((a,b)\) as we approach \((a,b)\) along the line \(x=a\).

We have a similar interpretation of \(F_x(a,b)\). However, there is no reason that we must approach \((a,b)\) along a line that is parallel to one of the coordinate axes. What if we want to approach along any line? Consider, for example, this line:

Indeed, once we are at a point on the surface above \(\vec {a}=\vector {a,b}\), there are actually many different directions that we can travel along the hill. Let’s consider a line \(\vecl \) that passes through \(\vec {a}\) as our path in the domain. Our rate of change will be given by

\[ \textrm {rate of change} = \frac {\textrm {rise}}{\textrm {run}}, \]

where the “run” is the distance traveled along the line \(\vecl \) and the “rise” is the corresponding change in the \(z\)-values of the function. Since we are ultimately concerned about a curve on the surface, a good first step is to parameterize the line in the domain, then use the function to find a parametric description of the curve on the surface above \(\vecl \).

To make computing the run as simple as possible, we pick a unit vector \(\uvec {u}\) that points along \(\vecl \). We’ll see how to do this in the next example, but we can always start at \(\vec {a}\) and draw a unit vector that extends from \(\vec {a}\) along \(\vecl \).
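
As a quick illustration (the direction \(\vector {1,2}\) is an assumption, chosen here to match the unit vector that appears later in this section), dividing a direction vector by its length produces a unit vector in the same direction:

\[ \uvec {u} = \frac {\vector {1,2}}{\left |\vector {1,2}\right |} = \frac {\vector {1,2}}{\sqrt {1^2+2^2}} = \vector {\frac {1}{\sqrt {5}},\frac {2}{\sqrt {5}}}. \]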

To find a parameterization of \(\vecl \), note that \(\uvec {u}\) is parallel to \(\vecl \) and \(\vec {a}\) is a point on the line, so letting \(h\) denote the parameter, a description of \(\vecl \) is given by

\[ \vecl (h) = \vec {a} + h \uvec {u} \]

For the sake of example, let \(h>0\) (a similar argument can be given if \(h<0\)). One convenient consequence of using a unit vector in the direction of \(\vecl \) is that the “run,” which is the distance between \(\vec {a}\) and \(\vec {a}+h\uvec {u}\), is simply

\[ \textrm {``run'' } = \left |\vec {a}+h\uvec {u} - \vec {a}\right | = |h \uvec {u}| = |h| |\uvec {u}| = h \]

since \(h>0\) and \(\uvec {u}\) is a unit vector.

The “rise” is computed by noting that it is the corresponding change in \(z\)-values.

\[ \textrm { ``rise'' } = F(\vec {a}+h\uvec {u})-F(\vec {a}) \]

These are shown in the image below.

To find the instantaneous rate of change, we take the limit as \(h\) goes to \(0\) (when \(F\) is differentiable, it can be shown that this limit exists). We call the result the directional derivative of \(F\) at \(\vec {a}\) in the direction \(\uvec {u}\) and will henceforth denote it by \(D_{\uvec {u}}(F(\vec {a}))\). Let’s give a formal definition.
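
Written out with the rise and run found above, one way to state the definition (a sketch reconstructed from the surrounding discussion, for a unit vector \(\uvec {u}\)) is

\[ D_{\uvec {u}}(F(\vec {a})) = \lim _{h \to 0} \frac {F(\vec {a}+h\uvec {u})-F(\vec {a})}{h}. \]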

In essence, \(D_{\uvec {u}}(F(\vec {a}))\) is the instantaneous rate of change of \(F\) at \(\vec {a}\) as we approach \(\vec {a}\) in the direction of \(\uvec {u}\).

There’s a quick way to compute this limit by using the gradient vector. We first give the result and save the derivation of the formula until the end of the section.
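
As a numerical sanity check, here is a minimal Python sketch that compares the rise-over-run quotient with the gradient formula \(D_{\uvec {u}}(F(\vec {a}))=\grad {F}(\vec {a})\dotp \uvec {u}\) derived at the end of this section. It assumes the example function \(F(x,y)=x^2-3xy+4y^2+7\), the point \((1,2)\), and the direction \(\vector {\frac {1}{\sqrt {5}},\frac {2}{\sqrt {5}}}\) that appear in this section. The helper names `difference_quotient` and `gradient_formula` are hypothetical, and the gradient is worked out by hand.

```python
import math

# Assumed example: the surface z = x^2 - 3xy + 4y^2 + 7 mentioned in this section.
def F(x, y):
    return x**2 - 3*x*y + 4*y**2 + 7

# Gradient of the assumed F, computed by hand: (F_x, F_y).
def grad_F(x, y):
    return (2*x - 3*y, -3*x + 8*y)

def difference_quotient(a, u, h):
    """Rise over run: (F(a + h u) - F(a)) / h, where u is a unit vector."""
    return (F(a[0] + h*u[0], a[1] + h*u[1]) - F(a[0], a[1])) / h

def gradient_formula(a, u):
    """Directional derivative computed as the dot product grad F(a) . u."""
    gx, gy = grad_F(a[0], a[1])
    return gx*u[0] + gy*u[1]

a = (1.0, 2.0)
u = (1/math.sqrt(5), 2/math.sqrt(5))  # unit vector in the direction <1, 2>

# The quotients should approach the gradient-formula value as h shrinks.
for h in (1e-1, 1e-3, 1e-5):
    print(h, difference_quotient(a, u, h))
print("gradient formula:", gradient_formula(a, u))
```

For this assumed example the gradient at \((1,2)\) is \(\vector {-4,13}\), so the dot product works out to \(\frac {22}{\sqrt {5}}\), and the printed quotients should settle toward that value as \(h\) decreases.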

Now that we have defined and worked with the directional derivative, what does it tell us?

  • The instantaneous rate of change of \(F(x,y)\) at the point \((1,2)\) as we approach it in the direction parallel to \(\vector {\frac {1}{\sqrt {5}},\frac {2}{\sqrt {5}}}\).
  • The slope of the tangent line to the curve on the surface \(z= x^2-3xy+4y^2+7\) above the line \(2x-y=3\) in its domain.
  • The normal vector to the surface at the point.
  • The slope of the tangent plane.

1 Directions of initial change

Consider a surface defined by \(z = F(x,y)\). Given a particular point \((a,b,F(a,b))\) on the surface where \(\grad {F}(a,b) \neq \vec {0}\), there are a few questions we can ask.

  • In which initial direction should we travel from \((a,b,F(a,b))\) if we want to head up the surface the fastest?
  • In which initial direction should we travel from \((a,b,F(a,b))\) if we want to head down the surface the fastest?
  • In which direction should we travel if we do not want our current elevation to change?

The following theorem answers these questions. We state the theorem for functions \(F:\R ^2\to \R \), but it actually holds for functions from \(\R ^n\) to \(\R \).
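
A brief sketch of why such answers exist, assuming the gradient formula \(D_{\uvec {u}}(F(\vec {a}))=\grad {F}(\vec {a})\dotp \uvec {u}\) and writing \(\theta \) (a symbol introduced here for illustration) for the angle between \(\grad {F}(\vec {a})\) and the unit vector \(\uvec {u}\):

\[ D_{\uvec {u}}(F(\vec {a})) = \grad {F}(\vec {a})\dotp \uvec {u} = |\grad {F}(\vec {a})|\,|\uvec {u}|\cos \theta = |\grad {F}(\vec {a})|\cos \theta . \]

This is largest when \(\theta = 0\) (travel in the direction of \(\grad {F}(\vec {a})\)), smallest when \(\theta = \pi \) (travel in the direction of \(-\grad {F}(\vec {a})\)), and zero when \(\theta = \pi /2\) (travel perpendicular to \(\grad {F}(\vec {a})\)).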

2 The formula for the directional derivative

We conclude this section by giving the derivation of the formula

\[ D_{\uvec {u}}(F(\vec {a}))=\grad {F}(\vec {a})\dotp \uvec {u}. \]

Since our function \(F\) is differentiable, we know that when we “zoom in” on the graph of the surface \(z=F(x,y)\), the surface looks like its tangent plane, \(z=L(\vec {x})\). The definition of differentiability makes this precise:

\[ \lim _{\vec {x} \to \vec {a} } \frac {F(\vec {x})-L(\vec {x})}{|\vec {x}-\vec {a}|} = 0 \]

We have seen that we can use the gradient to write the formula for the tangent plane as \(L(\vec {x}) = F(\vec {a}) + \grad {F}(\vec {a}) \dotp (\vec {x}-\vec {a})\). Substituting into the above limit gives

\[ \lim _{\vec {x} \to \vec {a} } \frac {F(\vec {x})-F(\vec {a}) - \grad {F}(\vec {a}) \dotp (\vec {x}-\vec {a})}{|\vec {x}-\vec {a}|}=0 . \]

Now, recall that the directional derivative \(D_{\uvec {u}}(F(\vec {a}))\) requires that we approach \(\vec {a}\) along the line \(\vecl (t) = \vec {a}+t\uvec {u}\). Since the above limit exists, the result holds along any path along which \(\vec {x} \to \vec {a}\), so it certainly holds along this line. To approach \(\vec {a}\) along this line, we set \(\vec {x} = \vec {a}+t\uvec {u}\), so that the limit \(\vec {x} \to \vec {a}\) becomes the limit \(t \to 0\). To simplify, we consider only \(t \to 0^+\); the argument for the other one-sided limit is very similar. We now rewrite our limit along the chosen path.

\begin{align*} 0 & =\lim _{\vec {x} \to \vec {a} } \frac {F(\vec {x})-F(\vec {a}) - \grad {F}(\vec {a}) \dotp (\vec {x}-\vec {a})}{|\vec {x}-\vec {a}|} \\ &= \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a}) - \grad {F}(\vec {a}) \dotp ( \vec {a}+t\uvec {u}-\vec {a})}{| \vec {a}+t\uvec {u}-\vec {a}|}\\ &= \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a}) - \grad {F}(\vec {a}) \dotp (t\uvec {u})}{| t\uvec {u}|} \\ &= \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a})}{t} - \grad {F}(\vec {a}) \dotp \uvec {u} \end{align*}

where in the last step we have used \(|t\uvec {u}| = |t|\,|\uvec {u}| = t\), since \(t>0\) and \(\uvec {u}\) is a unit vector.

Recalling that this limit is \(0\) in the first place gives

\[ \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a})}{t} - \grad {F}(\vec {a}) \dotp \uvec {u} = 0, \]

and since by definition, \(D_{\uvec {u}}(F(\vec {a})) = \lim _{t \to 0} \frac {F( \vec {a}+t\uvec {u})-F(\vec {a})}{t}\), we have

\[ D_{\uvec {u}} (F(\vec {a})) - \grad {F}(\vec {a}) \dotp \uvec {u} = 0. \]

We may thus conclude that \(D_{\uvec {u}} (F(\vec {a})) = \grad {F}(\vec {a}) \dotp \uvec {u}\).