Partial derivatives

$\newenvironment {prompt}{}{} \newcommand {\ungraded }[0]{} \newcommand {\todo }[0]{} \newcommand {\oiint }[0]{{\large \bigcirc }\kern -1.56em\iint } \newcommand {\mooculus }[0]{\textsf {\textbf {MOOC}\textnormal {\textsf {ULUS}}}} \newcommand {\npnoround }[0]{\nprounddigits {-1}} \newcommand {\npnoroundexp }[0]{\nproundexpdigits {-1}} \newcommand {\npunitcommand }[1]{\ensuremath {\mathrm {#1}}} \newcommand {\RR }[0]{\mathbb R} \newcommand {\R }[0]{\mathbb R} \newcommand {\N }[0]{\mathbb N} \newcommand {\Z }[0]{\mathbb Z} \newcommand {\sagemath }[0]{\textsf {SageMath}} \newcommand {\d }[0]{\,d} \newcommand {\l }[0]{\ell } \newcommand {\ddx }[0]{\frac {d}{\d x}} \newcommand {\zeroOverZero }[0]{\ensuremath {\boldsymbol {\tfrac {0}{0}}}} \newcommand {\inftyOverInfty }[0]{\ensuremath {\boldsymbol {\tfrac {\infty }{\infty }}}} \newcommand {\zeroOverInfty }[0]{\ensuremath {\boldsymbol {\tfrac {0}{\infty }}}} \newcommand {\zeroTimesInfty }[0]{\ensuremath {\small \boldsymbol {0\cdot \infty }}} \newcommand {\inftyMinusInfty }[0]{\ensuremath {\small \boldsymbol {\infty -\infty }}} \newcommand {\oneToInfty }[0]{\ensuremath {\boldsymbol {1^\infty }}} \newcommand {\zeroToZero }[0]{\ensuremath {\boldsymbol {0^0}}} \newcommand {\inftyToZero }[0]{\ensuremath {\boldsymbol {\infty ^0}}} \newcommand {\numOverZero }[0]{\ensuremath {\boldsymbol {\tfrac {\#}{0}}}} \newcommand {\dfn }[0]{\textbf } \newcommand {\unit }[0]{\mathop {}\!\mathrm } \newcommand {\eval }[1]{\bigg [ #1 \bigg ]} \newcommand {\seq }[1]{\left ( #1 \right )} \newcommand {\epsilon }[0]{\varepsilon } \newcommand {\phi }[0]{\varphi } \newcommand {\iff }[0]{\Leftrightarrow } \DeclareMathOperator {\arccot }{arccot} \DeclareMathOperator {\arcsec }{arcsec} \DeclareMathOperator {\arccsc }{arccsc} \DeclareMathOperator {\si }{Si} \DeclareMathOperator {\scal }{scal} \DeclareMathOperator {\sign }{sign} \newcommand {\arrowvec }[1]{{\overset {\rightharpoonup }{#1}}} \newcommand {\vec }[1]{{\overset {\boldsymbol {\rightharpoonup 
}}{\mathbf {#1}}}\hspace {0in}} \newcommand {\point }[1]{\left (#1\right )} \newcommand {\pt }[1]{\mathbf {#1}} \newcommand {\Lim }[2]{\lim _{\point {#1} \to \point {#2}}} \DeclareMathOperator {\proj }{\mathbf {proj}} \newcommand {\veci }[0]{{\boldsymbol {\hat {\imath }}}} \newcommand {\vecj }[0]{{\boldsymbol {\hat {\jmath }}}} \newcommand {\veck }[0]{{\boldsymbol {\hat {k}}}} \newcommand {\vecl }[0]{\vec {\boldsymbol {\l }}} \newcommand {\uvec }[1]{\mathbf {\hat {#1}}} \newcommand {\utan }[0]{\mathbf {\hat {t}}} \newcommand {\unormal }[0]{\mathbf {\hat {n}}} \newcommand {\ubinormal }[0]{\mathbf {\hat {b}}} \newcommand {\dotp }[0]{\bullet } \newcommand {\cross }[0]{\boldsymbol \times } \newcommand {\grad }[0]{\boldsymbol \nabla } \newcommand {\divergence }[0]{\grad \dotp } \newcommand {\curl }[0]{\grad \cross } \newcommand {\lto }[0]{\mathop {\longrightarrow \,}\limits } \newcommand {\bar }[0]{\overline } \newcommand {\surfaceColor }[0]{violet} \newcommand {\surfaceColorTwo }[0]{redyellow} \newcommand {\sliceColor }[0]{greenyellow} \newcommand {\vector }[1]{\left \langle #1\right \rangle } \newcommand {\sectionOutcomes }[0]{} \newcommand {\HyperFirstAtBeginDocument }[0]{\AtBeginDocument }$

We introduce partial derivatives and the gradient vector.

Given a function $F:\R ^n \to \R$ , it is often useful to differentiate with respect to a single variable and hold the other variables as constants. One way to think of a function of several variables is as a “machine” with lots of knobs:

One way to understand such a machine is to hold all but one of the knobs constant and see what happens when you “wiggle” a single knob. As an explicit example, let $F(x,y) = x^2+2y^2$. Here $F$ is our “machine” and the variables $x$ and $y$ are the “knobs.” Fixing $y=2$ allows us to focus our attention on all points of the surface where the $y$-value is $2$.

We can now focus our attention on the curve $z = F(x,2) = x^2+8$ and differentiate this curve purely with respect to $x$. In a similar way, we could fix $x$ and differentiate with respect to $y$.

Given a function $F:\R ^n\to \R$, the partial derivative of $F$ with respect to the $i$th variable is denoted and defined by $\pp {x_i} F(x_1,x_2,\dots ,x_n) = \pp {x_i} F(\vec {x}) = \lim _{h\to 0} \frac {F(x_1,\dots ,x_i+h,\dots ,x_n) - F(x_1,\dots ,x_n)}{h}.$ This means that one should take the single-variable derivative of $F$ with respect to $x_i$ while treating all other variables as constants.


Let $F(x,y) = x^2+2y^2$ . Compute: $\pp {x} F(x,y) \begin {prompt} = \answer {2x+0} \end {prompt}$

Compute $\pp {y} F(x,y) \begin {prompt} = \answer {0+4y} \end {prompt}$
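The two answers above can be checked numerically against the limit definition by choosing a small but nonzero $h$. The Python sketch below is only an illustration: the helper name `partial_derivative`, the step size `h`, and the sample point $(3,2)$ are assumptions made here, and a symmetric (central) difference stands in for the one-sided quotient of the definition because it is usually more accurate.

```python
def partial_derivative(f, point, i, h=1e-6):
    """Approximate the partial derivative of f with respect to its i-th
    variable (0-based) at `point`, using a central difference."""
    forward = list(point)
    backward = list(point)
    forward[i] += h   # wiggle only the i-th "knob" up ...
    backward[i] -= h  # ... and down; every other variable stays fixed
    return (f(*forward) - f(*backward)) / (2 * h)


def F(x, y):
    return x**2 + 2 * y**2


# The exact partial derivatives are 2x and 4y, so at (3, 2) the
# approximations should be close to 6 and 8.
dF_dx = partial_derivative(F, (3.0, 2.0), 0)
dF_dy = partial_derivative(F, (3.0, 2.0), 1)
print(dF_dx, dF_dy)
```

Changing the index `i` is the code version of choosing which knob to wiggle.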

There are several different notations for the partial derivative. We’ll mainly be using these: $\begin{align*} \pp{x} F(x,y,z) &= F^{(1,0,0)}(x,y,z) = F_x(x,y,z),\\ \pp{y} F(x,y,z) &= F^{(0,1,0)}(x,y,z) = F_y(x,y,z),\\ \pp{z} F(x,y,z) &= F^{(0,0,1)}(x,y,z) = F_z(x,y,z). \end{align*}$

Let $F(x,y) = x^2y + 2x+y^3$ . Compute: $F^{(1,0)}(x,y) \begin {prompt} = \answer {2xy+2} \end {prompt}$

Compute: $F^{(0,1)}(x,y) \begin {prompt} = \answer {x^2+3y^2} \end {prompt}$

We have shown how to compute a partial derivative, but it may still not be clear what a partial derivative means. Given $z=F(x,y)$ , $F^{(1,0)}(x,y)$ measures the rate at which $z$ changes as only $x$ varies: $y$ is held constant.

Imagine standing in a rolling meadow, then beginning to walk due east. Depending on your location, you might walk up, sharply down, or perhaps not change elevation at all. This is similar to measuring $\pp [z]{x}$ : you are moving only east (in the $x$ -direction) and not north/south at all. Going back to your original location, imagine now walking due north (in the $y$ -direction). Perhaps walking due north does not change your elevation at all. This is analogous to $\pp [z]{y}=0$ : $z$ does not change with respect to $y$ . We can see that $\pp [z]{x}$ and $\pp [z]{y}$ do not have to be the same, or even similar, as it is easy to imagine circumstances where walking east means you walk downhill, though walking north makes you walk uphill. The next example helps us visualize this.

Let $F(x,y)=-x^2-\frac {y^2}{2}+xy+10$ . Find $F^{(1,0)}(2,1)$ and $F^{(0,1)}(2,1)$ .

Write with me: $\pp {x}F(x,y) = \answer [given]{-2x+y}$ and $\pp {y}F(x,y) = \answer [given]{-y+x}$. Thus $F^{(1,0)}(2,1) = \answer [given]{-3}$ and $F^{(0,1)}(2,1) = \answer [given]{1}$.

Whenever we do a computation in mathematics, we should ask ourselves, “What does this mean?”

Let $F(x,y)=-x^2-\frac {y^2}{2}+xy+10$ . What is the meaning of $\begin{align*} F^{(1,0)}(2,1) &= -3\\ F^{(0,1)}(2,1)&=1? \end{align*}$

First note that $F(2,1) = \answer [given]{7.5}$. Since $F^{(1,0)}(2,1)=-3$, if one “stands” on the surface at the point $\left (\answer [given]{2},\answer [given]{1},\answer [given]{7.5}\right )$ and moves parallel to the $x$-axis (so only the $x$-value changes, not the $y$-value), then the instantaneous rate of change in $z$ is $\answer [given]{-3}$. Increasing the $x$-value will decrease the $z$-value; decreasing the $x$-value will increase the $z$-value.

Since $F^{(0,1)}(2,1)=1$, if one “stands” on the surface at the point $(\answer [given]{2},\answer [given]{1},\answer [given]{7.5})$ and moves parallel to the $y$-axis (so only the $y$-value changes, not the $x$-value), then the instantaneous rate of change in $z$ is $\answer [given]{1}$. Increasing the $y$-value will increase the $z$-value; decreasing the $y$-value will decrease the $z$-value.

Finally, since the magnitude of $\pp [F]{x}$ is greater than the magnitude of $\pp [F]{y}$ at $(2,1)$ , the surface is “steeper” in the $x$ -direction than in the $y$ -direction.
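To make this interpretation concrete, one can take a small step from $(2,1)$ in each direction and compare the change in height with the step length. The sketch below is an illustration only; the step size `step` is an arbitrary choice made here, and because the step is not infinitesimal the computed rates are close to, but not exactly, $-3$ and $1$.

```python
def F(x, y):
    return -x**2 - y**2 / 2 + x * y + 10


x0, y0 = 2.0, 1.0
step = 0.01  # small step length, chosen only for illustration

# Move parallel to the x-axis: y stays at 1, only x changes.
change_east = F(x0 + step, y0) - F(x0, y0)
# Move parallel to the y-axis: x stays at 2, only y changes.
change_north = F(x0, y0 + step) - F(x0, y0)

rate_east = change_east / step    # close to F^(1,0)(2,1) = -3
rate_north = change_north / step  # close to F^(0,1)(2,1) = 1
print(rate_east, rate_north)

# The surface is steeper in the x-direction than in the y-direction.
print(abs(rate_east) > abs(rate_north))
```

Shrinking `step` moves both rates closer to the exact partial derivatives.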

Estimating partial derivatives

Functions of several variables, especially ones that map $\R ^2\to \R$, can be described by a table of values or by level curves. In either case, we can estimate partial derivatives by looking at $\frac {\text {change in the output}}{\text {change in the variable}}.$ Let’s do an example to make this clearer.

Let $F:\R ^2\to \R$ be a differentiable function described by the following table of values:

Estimate $F^{(1,0)}(2,6)$ .

To estimate $F^{(1,0)}(2,6)$, we examine the change in $F(x,6)$ between $x=1$ and $x=2$, and then between $x=2$ and $x=3$. We will then average these two estimates to find our answer. To start, look between $x=1$ and $x=2$: $\begin{align*} \frac{F(2,6)-F(1,6)}{2-1}&= \frac{\answer[given]{16}-\answer[given]{24}}{2-1}\\ &=\answer[given]{-8} \end{align*}$ Now examine the change in $F(x,6)$ between $x=2$ and $x=3$: $\begin{align*} \frac{F(3,6)-F(2,6)}{3-2}&= \frac{\answer[given]{5}-\answer[given]{16}}{3-2}\\ &=\answer[given]{-11} \end{align*}$ Averaging these two values, we find: $\eval {\pp {x} F(x,y)}_{(x,y)=(2,6)} \approx \answer [given]{-9.5}$
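The same arithmetic can be written as a short Python sketch. It is only an illustration: the dictionary holds just the three table entries used in the example above (the rest of the table is not reproduced), and the function name `estimate_partial_x` is a hypothetical helper introduced here.

```python
# Only the three entries used in the example: F(1,6), F(2,6), F(3,6).
table = {(1, 6): 24, (2, 6): 16, (3, 6): 5}


def estimate_partial_x(table, x, y, dx=1):
    """Average the backward and forward difference quotients in x."""
    backward = (table[(x, y)] - table[(x - dx, y)]) / dx
    forward = (table[(x + dx, y)] - table[(x, y)]) / dx
    return backward, forward, (backward + forward) / 2


backward, forward, estimate = estimate_partial_x(table, 2, 6)
print(backward, forward, estimate)  # -8.0 -11.0 -9.5
```

Estimating a partial derivative with respect to $y$ works the same way, stepping through the second coordinate of the table instead of the first.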

Let $F:\R ^2\to \R$ be a differentiable function described by the following table of values:

Estimate $F^{(0,1)}(2,6)$ . $F^{(0,1)}(2,6)\approx \answer {.5}$

Work as we did in the example above, finding two estimates and taking their average.

We can also estimate partial derivatives by looking at level curves.

Let $F:\R ^2\to \R$ be described by the level curves below:

The height of the level curve is marked on the curve, and we are given a point $(4,2)$ . Estimate $F^{(1,0)}(4,2)$ .

To estimate $F^{(1,0)}(4,2)$, we examine the change between the level curve that the point $(4,2)$ is on and the nearest level curve reached by traveling along a line parallel to the $x$-axis. Starting at $(4,2)$ and moving to the left along such a line, we see $\begin{align*} \frac{F(4,2)-F(1,2)}{4-1}&= \frac{\answer[given]{13}-\answer[given]{7}}{4-1}\\ &=\answer[given]{2} \end{align*}$ We should also examine the change to the closest level curve when moving to the right: $\begin{align*} \frac{F(5,2)-F(4,2)}{5-4}&= \frac{\answer[given]{17}-\answer[given]{13}}{5-4}\\ &=\answer[given]{4} \end{align*}$ Averaging these two values, we find: $\eval {\pp {x} F(x,y)}_{(x,y)=(4,2)} \approx \answer [given]{3}$

Let $F:\R ^2\to \R$ be described by the level curves below:

The height of the level curve is marked on the curve, and we are given a point $(4,2)$ . Estimate $F^{(0,1)}(4,2)$ . $\begin {prompt} F^{(0,1)}(4,2)\approx \answer {-3.5} \end {prompt}$

Work as we did in the example above, finding two estimates and taking their average.

Combining partial derivatives

A function $f:\R \to \R$ has only one second derivative. However, functions $F:\R ^2\to \R$ have $4$ second partial derivatives, and functions $F:\R ^3\to \R$ have $9$ second partial derivatives! Don’t run off yet, things get better.

Let $F:\R ^n\to \R$ be continuous on an open set $S$ .

The second pure partial derivative of $F$ with respect to $x$ then $x$ is $\pp {x}\left (\pp [F]{x}\right ) = F_{xx}=\frac {\partial ^2 F}{\partial x^2} = \left (F^{(1,0)}\right )^{(1,0)}= F^{(2,0)}$
The second pure partial derivative of $F$ with respect to $y$ then $y$ is $\pp {y}\left (\pp [F]{y}\right ) = F_{yy}=\frac {\partial ^2F}{\partial y^2} = \left (F^{(0,1)}\right )^{(0,1)} = F^{(0,2)}$

Moreover, there is also the notion of a mixed partial derivative, $\frac {\partial }{\partial y}\left (\frac {\partial F}{\partial x}\right ) = \frac {\partial ^2F}{\partial y\partial x} = \left (F_x\right )_y = F_{xy}$ and $\frac {\partial }{\partial x}\left (\frac {\partial F}{\partial y}\right ) = \frac {\partial ^2F}{\partial x\partial y} = \left (F_y\right )_x =F_{yx}.$ The notation $F^{(1,1)}$ is ambiguous: it does not state which derivative should be taken first. As we will see, in practice this is not too much of a problem.

Consider: $F(x,y) = x^3y^2 + 2xy^3+\cos (x).$ Find the two first partial derivatives and the four second partial derivatives. $\begin{align*} \pp[F]{x} &= \answer{3x^2y^2+2y^3-\sin(x)}\\ \pp[F]{y} &= \answer{2x^3y+6xy^2}\\ \frac{\partial^2F}{\partial x^2} &= \answer{6xy^2-\cos(x)}\\ \frac{\partial^2F}{\partial y^2} &= \answer{2x^3+12xy}\\ \frac{\partial^2F}{\partial y\partial x} &= \answer{6x^2y+6y^2}\\ \frac{\partial^2F}{\partial x\partial y} &= \answer{6x^2y+6y^2} \end{align*}$

Notice how above $\frac {\partial ^2F}{\partial y\partial x}=\frac {\partial ^2F}{\partial x\partial y}$ . The next theorem states that it is not a coincidence.

Mixed Partial Derivatives. Let $F:\R ^2\to \R$ be a function where $\frac {\partial ^2F}{\partial y\partial x}\quad \text {and}\quad \frac {\partial ^2F}{\partial x\partial y}$ are continuous on an open set $S$. Then for each point $(x,y)$ in $S$, $\frac {\partial ^2F}{\partial y\partial x}=\frac {\partial ^2F}{\partial x\partial y}$. A similar result is true for functions $F:\R ^n\to \R$.

Finding $\frac {\partial ^2F}{\partial y\partial x}$ and $\frac {\partial ^2F}{\partial x\partial y}$ independently and comparing the results provides a convenient way of checking our work.
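This kind of check can also be done numerically. The sketch below approximates both mixed partial derivatives of $F(x,y) = x^3y^2 + 2xy^3+\cos (x)$ by nesting central differences in the two possible orders. The helper names `d_dx` and `d_dy`, the step size `h`, and the sample point $(1,2)$ are assumptions made here for illustration; the results are approximations, so they agree with each other and with the formula $6x^2y+6y^2$ only up to a small error.

```python
import math


def F(x, y):
    return x**3 * y**2 + 2 * x * y**3 + math.cos(x)


def d_dx(f, h=1e-3):
    """Return a function approximating the partial derivative of f in x."""
    return lambda x, y: (f(x + h, y) - f(x - h, y)) / (2 * h)


def d_dy(f, h=1e-3):
    """Return a function approximating the partial derivative of f in y."""
    return lambda x, y: (f(x, y + h) - f(x, y - h)) / (2 * h)


F_xy = d_dy(d_dx(F))  # differentiate in x first, then in y
F_yx = d_dx(d_dy(F))  # differentiate in y first, then in x

x0, y0 = 1.0, 2.0
exact = 6 * x0**2 * y0 + 6 * y0**2  # formula from the example; equals 36 here
print(F_xy(x0, y0), F_yx(x0, y0), exact)
```

If the two numerical values disagreed noticeably, that would point to a mistake in the function or in the hand computation.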

The gradient vector

Given a function $F:\R ^n\to \R$, we often want to work with all of the first partial derivatives simultaneously. In this case, we will work with the vector $\vector {\pp [F]{x_1},\pp [F]{x_2},\dots ,\pp [F]{x_n}}.$ As we will see, for functions of several variables, this vector will play the role that the derivative did for functions of a single variable. This vector is called the gradient vector.

Let $F:\R ^n\to \R$ be a function whose first partial derivatives exist. The gradient $\grad F = \vector {\pp [F]{x_1},\pp [F]{x_2},\dots ,\pp [F]{x_n}}$ is a vector-valued function of $n$ variables.

The upside-down triangle in the notation for the gradient is sometimes called a del. It is also known as a nabla. You can think of $\grad$ as the vector $\grad = \vector {\pp {x_1},\pp {x_2},\dots ,\pp {x_n}}$, and hence when one writes $\grad F$, one is distributing the $F$ across the vector, just as a scalar acts on a vector.

The gradient is defined at points in the domain where the partial derivatives are defined. The gradient of a function of two variables lives in $\R ^2$. The gradient of a function of three variables lives in $\R ^3$. In general, the gradient of a function of $n$ variables lives in $\R ^n$. For instance, for a function of two variables, the gradient at each point is a vector in the $(x,y)$-plane.

Try your hand at some casual computations.

Let $F(x,y) = \sin (x)\cos (y)$ , compute: $\grad F(x,y) \begin {prompt} = \vector {\answer {\cos (x)\cos (y)},\answer {-\sin (x)\sin (y)}} \end {prompt}$

Above, note that $\grad F(x,y)$ is a vector whose components are functions of $x$ and $y$ , hence it is a vector-valued function. We can evaluate functions at actual points in their domain. For instance, if $\vec {p}= \vector {\pi /3,\pi /3}$ we compute: $\grad F(\vec {p}) \begin {prompt} =\vector {\answer {1/4},\answer {-3/4}}. \end {prompt}$

And now in three variables.

Let $F(x,y,z) = ze^{-7xy}$ , compute: $\grad F(x,y,z) \begin {prompt} = \vector {\answer {z e^{-7xy}(-7)y},\answer {z e^{-7xy}(-7)x}, \answer {e^{-7xy}}} \end {prompt}$

Let $\vec {p}= \vector {1,0,1/7}$ . Compute: $\grad F(\vec {p}) \begin {prompt} =\vector {\answer {0},\answer {-1},\answer {1}} \end {prompt}$
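The gradient computations above can be checked numerically by approximating each partial derivative in turn and collecting the results in a list. The sketch below is an illustration under stated assumptions: the function name `gradient`, the step size `h`, and the names `F2` and `F3` for the two example functions are introduced here and do not appear in the text.

```python
import math


def gradient(f, point, h=1e-6):
    """Approximate the gradient of f at `point` with central differences,
    one coordinate at a time."""
    grad = []
    for i in range(len(point)):
        forward = list(point)
        backward = list(point)
        forward[i] += h
        backward[i] -= h
        grad.append((f(*forward) - f(*backward)) / (2 * h))
    return grad


def F2(x, y):
    return math.sin(x) * math.cos(y)


def F3(x, y, z):
    return z * math.exp(-7 * x * y)


g2 = gradient(F2, (math.pi / 3, math.pi / 3))  # close to [1/4, -3/4]
g3 = gradient(F3, (1.0, 0.0, 1 / 7))           # close to [0, -1, 1]
print(g2)
print(g3)
```

The length of the returned list matches the number of variables, which mirrors the fact that the gradient of a function of $n$ variables lives in $\R ^n$.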

This is just your first taste of the gradient vector. Much more will be coming soon.
