
The gradient is the fundamental notion of a derivative for a function of several variables.

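First: You must know how to compute the gradient vector. Remember, given a function $F:\R ^n\to \R$, the gradient is
\[
\grad F(\vec {x}) = \vector {F^{(1,0,\dots ,0)}(\vec {x}), F^{(0,1,\dots ,0)}(\vec {x}), \dots , F^{(0,0,\dots ,1)}(\vec {x})}.
\]
This is a vector-valued function of $n$ variables. This means that when you compute the gradient, you should express it as a vector!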
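A gradient computed by hand can be sanity-checked numerically. The sketch below is only an illustration: the function $F(x,y) = x^2y + y^3$, the helper names, and the step size `h` are all assumptions chosen for this example. It approximates each partial derivative with a central difference and compares the result with the hand-computed vector $\vector {2xy, x^2+3y^2}$.

```python
def F(x, y):
    # Hypothetical example function, chosen only for illustration.
    return x**2 * y + y**3


def grad_F_by_hand(x, y):
    # Partial derivatives computed by hand: F^(1,0) = 2xy, F^(0,1) = x^2 + 3y^2.
    return (2 * x * y, x**2 + 3 * y**2)


def numerical_gradient(f, x, y, h=1e-5):
    # Central differences approximate each partial derivative.
    df_dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    df_dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return (df_dx, df_dy)


exact = grad_F_by_hand(2.0, -1.0)
approx = numerical_gradient(F, 2.0, -1.0)
print("by hand:  ", exact)
print("numerical:", approx)
```

Both results are pairs of numbers, that is, vectors in $\R ^2$, and they should agree up to a small numerical error.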

Second: The gradient vector points in the initial direction of greatest increase for a function. Remember, the gradient vector of a function of $n$ variables is a vector that lives in $\R ^n$. The gradient vector tells you how to immediately change the values of the inputs of a function to find the initial greatest increase in the output of the function. We can see this in the interactive below.

The gradient at each point shows you which direction to change the $(x,y)$-values to get the greatest initial change in the $z$-value.

Third: The gradient vector is orthogonal to level sets. In particular, given $F:\R ^2\to \R$, the gradient vector $\grad F\in \R ^2$ is always orthogonal to the level curves $c = F(x,y)$. Moreover, given $F:\R ^3\to \R$, $\grad F \in \R ^3$ is always orthogonal to level surfaces.

Given a function of several variables, say $F:\R ^2\to \R$, the gradient, when evaluated at a point in the domain of $F$, is a vector in $\R ^2$. We can see this in the interactive below.

The gradient at each point is a vector pointing in the $(x,y)$-plane. You compute the gradient vector by writing the vector of partial derivatives:
\[
\grad F(x,y) = \vector {F^{(1,0)}(x,y), F^{(0,1)}(x,y)}
\]
You’ve done this sort of direct computation many times before. So now, try your hand at these puzzlers:

Consider a differentiable function $F:\R ^2\to \R$ whose tangent plane at $(x,y) = (2,-1)$ is given by: In this case what is $F(2,-1)$?
Suppose you know that $F^{(1,0)}(2,-1)>0$. What is $\grad F(2,-1)$?
Consider a differentiable function $G:\R ^2\to \R$ and the unit vector $\uvec {u} = \vector {1/\sqrt {2},1/\sqrt {2}}$. Suppose that $D_{\uvec {u}} (G(1,-3)) = 0$ and that $G^{(0,1)}(1,-3)=2$. Compute:
Consider a differentiable function $H:\R ^2\to \R$ where $H^{(0,1)}(-5,6) = 3$ and the line $\vecl (t) = \vector {1-2t,3+t}$. Suppose that Compute:
Use the chain rule.

### The initial greatest increase

Given a function $F:\R ^n\to \R$ and a point in $\R ^n$, the gradient vector tells you in which initial direction to leave the point in order to get the greatest increase in $F$. Why is this so? Well, to compute the change in the output of a function when changing the inputs in a specific direction, we should use the directional derivative. Recall that for a unit vector $\uvec {u}$:
\[
D_{\uvec {u}} F = \grad F \cdot \uvec {u} = |\grad F|\,|\uvec {u}| \cos \theta = |\grad F| \cos \theta
\]
where $\theta$ is the angle between $\grad F$ and $\uvec {u}$. To make this change as large as possible, we need $\cos \theta = 1$, so $\uvec {u}$ must point in the same direction as $\grad F$. Hence, it is the gradient vector that points in the initial direction of greatest increase for the function.
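This claim can be tested by brute force. The following sketch assumes a hypothetical function $F(x,y) = x^2 + 3y$ and the point $(1,2)$, neither of which comes from the text. It evaluates the directional derivative in many unit directions and reports which direction gives the largest value.

```python
import math


def grad_F(x, y):
    # Gradient of the hypothetical function F(x, y) = x^2 + 3y, computed by hand.
    return (2 * x, 3.0)


def directional_derivative(grad, theta):
    # D_u F = grad F . u for the unit vector u = (cos(theta), sin(theta)).
    return grad[0] * math.cos(theta) + grad[1] * math.sin(theta)


g = grad_F(1.0, 2.0)  # gradient at the point (1, 2)

# Try 3600 evenly spaced unit directions and keep the one with the largest rate.
angles = [2 * math.pi * k / 3600 for k in range(3600)]
best_theta = max(angles, key=lambda t: directional_derivative(g, t))

gradient_angle = math.atan2(g[1], g[0])
print("best direction angle:", best_theta)
print("gradient angle:      ", gradient_angle)
print("largest rate:", directional_derivative(g, best_theta))
print("|grad F|:    ", math.hypot(g[0], g[1]))
```

Up to the resolution of the direction grid, the best direction should match the direction of the gradient, and the largest rate should match $|\grad F|$.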

We can directly witness that the gradient vector points in the initial direction of greatest increase by looking at a differentiable function $F:\R ^2\to \R$ that is described by a table of values.

Here is a plot of an elliptic paraboloid $G(x,y) = x^2 + y^2$ along with a vector attached to a point on the surface: True or false: The vector above could be the gradient vector for $G$ at the given point.
True False

So far we have mostly talked about the direction of the gradient vector. Now let’s talk about the magnitude of the gradient vector. The magnitude of the gradient vector tells you “how fast” the function is increasing.

Suppose you have a differentiable function $F:\R ^2\to \R$ with the following set of level curves. You should interpolate reasonable values of the function $F$ between the level curves which are shown: Consider the points $A$, $B$, and $C$ on the surface $z=F(x,y)$. Where is $|\grad F|$ largest? The magnitude of the gradient vector of $F$ is largest at point $\answer [format=string]{B}$. Where is $|\grad F|$ smallest? The magnitude of the gradient vector of $F$ is smallest at point $\answer [format=string]{C}$.
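One way to read $|\grad F|$ from a contour plot is to look at how tightly the level curves are packed: when the levels are equally spaced in value, the curves crowd together where the gradient is large. The sketch below illustrates this with the paraboloid $G(x,y) = x^2+y^2$, under the assumption that we only look along the positive $x$-axis and use the levels $1$ through $5$. It compares the actual gap between neighboring level curves with the first-order estimate $1/|\grad G|$.

```python
import math


def grad_magnitude(x, y):
    # |grad G| for G(x, y) = x^2 + y^2, whose gradient is (2x, 2y).
    return math.hypot(2 * x, 2 * y)


levels = [1, 2, 3, 4, 5]
# The level curve G = c is a circle of radius sqrt(c); it crosses the
# positive x-axis at x = sqrt(c).
crossings = [math.sqrt(c) for c in levels]

gaps = [b - a for a, b in zip(crossings, crossings[1:])]
magnitudes = [grad_magnitude(x, 0.0) for x in crossings[:-1]]

for x, gap, mag in zip(crossings, gaps, magnitudes):
    # 1 / mag is the first-order estimate of the gap (levels differ by 1).
    print(f"x = {x:.3f}  |grad G| = {mag:.3f}  gap = {gap:.3f}  estimate = {1 / mag:.3f}")
```

As the magnitude of the gradient grows, the gaps between consecutive level curves shrink.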

Now, stand back. We’re going to do some serious calculus. Just read, relax and enjoy.

What were you supposed to learn from that last example?

Now that we know gradient vectors point in the initial direction of the greatest increase of the function, let’s think about the geometry of the gradient vector. Previously we used the chain rule to show that the gradient vector is always orthogonal to level sets. The argument went like this: Suppose that a vector-valued function $\vec {c}(t)=\vector {x(t),y(t)}$ runs along a level curve of the surface $z = F(x,y)$. If we ask ourselves, “What is the change in $F$ as $t$ varies?” we must conclude that
\[
\frac {d}{dt} F(\vec {c}(t)) = 0,
\]
since the value of $F$ doesn’t change on the curve drawn by $\vec {c}$ (remember, $\vec {c}$ draws a level curve). On the other hand, by the chain rule:
\[
\frac {d}{dt} F(\vec {c}(t)) = \grad F(\vec {c}(t)) \cdot \vec {c}'(t)
\]
The vector $\vec {c}'$ is tangent to the curve drawn by $\vec {c}$, and putting the two equations above together we see
\[
\grad F(\vec {c}(t)) \cdot \vec {c}'(t) = 0,
\]
so $\grad F(\vec {c}(t))$ must be orthogonal to $\vec {c}'$, and hence orthogonal to the curve drawn by $\vec {c}$.

The explanation we just gave is a good one, but let’s give one more. In this book, we are always thinking about differentiable functions. Remember, a function $F:\R ^2\to \R$ is differentiable if one can “zoom-in” and eventually the function will look like a plane. So let’s imagine that we’ve “zoomed-in” on a differentiable function and it looks like a plane. The contour plot of a plane looks like a bunch of parallel lines:

If we wish to leave the point above in the direction of the initial greatest increase, then we should move in a direction perpendicular to the level curves: Gradient vectors point in the initial direction of greatest increase and the fastest way to leave a line is perpendicular to that line.
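The orthogonality of the gradient and the level curves can also be checked numerically. The sketch below assumes a hypothetical function $F(x,y) = x^2 + 4y^2$ and the parametrization $\vec {c}(t) = \vector {2\cos t, \sin t}$ of its level curve $F = 4$; both are chosen for illustration only. At several values of $t$ it computes the dot product of the gradient with the tangent vector $\vec {c}'(t)$.

```python
import math


def F(x, y):
    # Hypothetical example: the level curve F = 4 is an ellipse.
    return x**2 + 4 * y**2


def grad_F(x, y):
    # Gradient computed by hand.
    return (2 * x, 8 * y)


def c(t):
    # Parametrization of the level curve F(x, y) = 4.
    return (2 * math.cos(t), math.sin(t))


def c_prime(t):
    # Derivative of c, tangent to the level curve.
    return (-2 * math.sin(t), math.cos(t))


dots = []
for t in [0.0, 0.5, 1.0, 2.0, 4.0]:
    x, y = c(t)
    gx, gy = grad_F(x, y)
    tx, ty = c_prime(t)
    dots.append(gx * tx + gy * ty)
    print(f"t = {t}: F = {F(x, y):.6f}, grad F . c' = {dots[-1]:.2e}")
```

The value of $F$ stays at $4$ along the curve, and each dot product should be zero up to rounding error.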

The fact that the gradient is always orthogonal to level surfaces is very powerful. In fact, it gives new (easier!) solutions to old problems. Let’s use this fact to find a plane tangent to a surface.
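As a short sketch of the idea, take a hypothetical surface (chosen here for illustration, not taken from the text): the sphere $x^2+y^2+z^2 = 14$, which is a level surface of $F(x,y,z) = x^2+y^2+z^2$, and the point $(1,2,3)$ on it. Since $\grad F$ is orthogonal to the level surface, it serves as a normal vector for the tangent plane:
\[
\grad F(x,y,z) = \vector {2x, 2y, 2z}, \qquad \grad F(1,2,3) = \vector {2,4,6}.
\]
The tangent plane is the set of points whose displacement from $(1,2,3)$ is orthogonal to this normal vector:
\[
2(x-1) + 4(y-2) + 6(z-3) = 0, \qquad \text{or equivalently} \qquad x + 2y + 3z = 14.
\]
Notice that we never had to solve for $z$ as a function of $x$ and $y$.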

Now let’s see a more in-depth problem.

### Summary

To conclude, we will repeat ourselves: There are three things you must know about the gradient vector:

First: You must know how to compute the gradient vector.

Second: The gradient vector points in the initial direction of greatest increase for a function.

Third: The gradient vector is orthogonal to level sets.