We investigate the chain rule for functions of several variables.
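The chain rule states that
\[
\dd {x} f\big (g(x)\big ) = f'\big (g(x)\big )g'(x).
\]
If \(t=g(x)\), we can express the chain rule as
\[
\dd [f]{x} = \dd [f]{t}\dd [t]{x}.
\]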
In this section we extend the chain rule to functions of more than one variable.
It helps to understand what the combination of \(F(x,y)\) and \(\vec {x}(t) = \vector {x(t),y(t)}\) describes. We know that \(F(x,y)\) describes a surface; we also recognize that \(\vec {x}(t)\) describes a curve in the \((x,y)\)-plane. Combining these, we are describing a curve that lies on the surface described by \(F\). The parametric equations for this curve are \(x=x(t)\), \(y=y(t)\), and \(z=F\big (x(t),y(t)\big )\). Consider:
Here a surface is drawn, along with a dashed curve in the \((x,y)\)-plane. Restricting \(F\) to just the points on this dashed curve gives the curve shown on the surface. The derivative \(\dd [F]{t}\) gives the instantaneous rate of change of \(F\) with respect to \(t\).
Now try your hand at the chain rule.
The previous example can make us wonder: if we substituted for \(x\) and \(y\) at the end to show that \(\dd [F]{t}\) is really just a function of \(t\), why not substitute before differentiating, showing clearly that \(F\) is a function of \(t\)?
That is, \(z = x^2y+x = (\sin t)^2e^{5t}+\sin t.\) Applying the chain and product rules, we have
\[
\dd [z]{t} = 2\sin t\cos t\, e^{5t} + 5\sin ^2 t\, e^{5t} + \cos t,
\]
which matches the result from the example.
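The agreement between the two approaches can also be checked numerically. The following is a minimal sketch, assuming \(F(x,y)=x^2y+x\), \(x=\sin t\), and \(y=e^{5t}\) as in the substitution above; the function names are illustrative only. It compares the chain rule expression, the derivative found after substituting, and a central finite difference.

```python
import math

def x(t):
    return math.sin(t)

def y(t):
    return math.exp(5 * t)

def F(x_val, y_val):
    return x_val**2 * y_val + x_val

def dF_dt_chain(t):
    """Multivariable chain rule: dF/dt = F_x * x'(t) + F_y * y'(t)."""
    F_x = 2 * x(t) * y(t) + 1   # partial derivative of x^2 y + x in x
    F_y = x(t) ** 2             # partial derivative of x^2 y + x in y
    dx_dt = math.cos(t)
    dy_dt = 5 * math.exp(5 * t)
    return F_x * dx_dt + F_y * dy_dt

def dF_dt_direct(t):
    """Derivative of (sin t)^2 e^{5t} + sin t after substituting first."""
    s, c, e = math.sin(t), math.cos(t), math.exp(5 * t)
    return 2 * s * c * e + 5 * s**2 * e + c

def dF_dt_numeric(t, h=1e-6):
    """Central difference approximation, used as an independent check."""
    return (F(x(t + h), y(t + h)) - F(x(t - h), y(t - h))) / (2 * h)

if __name__ == "__main__":
    for t in (0.0, 0.3, 1.0):
        print(t, dF_dt_chain(t), dF_dt_direct(t), dF_dt_numeric(t))
```

The three columns should agree up to the error of the finite difference approximation.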
This may now make one wonder “What’s the point? If we could already find the derivative, why learn another way of finding it?” In some cases, applying this rule makes differentiation simpler, but this is hardly the power of the chain rule. Rather, the chain rule is extremely powerful when we do not know what \(F\), \(x\) and/or \(y\) are. It may be hard to believe, but often in “the real world” we know rate-of-change information (information about derivatives) without explicitly knowing the underlying functions. The chain rule allows us to combine several rates of change to find another rate of change.
The chain rule also tells us something about the meaning of the gradient. As we will see, the gradient vector is always orthogonal to level curves and surfaces.
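To see this, suppose a level curve \(F(x,y)=c\) is parametrized by \(\vec {x}(t) = \vector {x(t),y(t)}\). Since \(F\) is constant along this curve, the chain rule gives
\[
0 = \dd {t} F\big (\vec {x}(t)\big ) = \nabla F\big (\vec {x}(t)\big )\cdot \vec {x}'(t);
\]
this tells us that the gradient is orthogonal to the tangent vectors of our level curve. This means that the gradient is orthogonal to level curves.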
Note that this argument works in any dimension. The upshot?
Gradient vectors are orthogonal to level sets.
This is a key concept concerning the gradient.
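A small numerical illustration may help here. This sketch assumes a hypothetical function \(F(x,y)=x^2+4y^2\), which does not appear in the text, and its level curve \(F=4\), parametrized by \(x=2\cos t\), \(y=\sin t\). It computes the dot product of the gradient with the tangent vector at several points of the curve.

```python
import math

def grad_F(x, y):
    """Gradient of the assumed example F(x, y) = x^2 + 4 y^2."""
    return (2 * x, 8 * y)

def curve(t):
    """Parametrization of the level curve F = 4 (an ellipse)."""
    return (2 * math.cos(t), math.sin(t))

def tangent(t):
    """Derivative of the parametrization with respect to t."""
    return (-2 * math.sin(t), math.cos(t))

def grad_dot_tangent(t):
    """Dot product of the gradient with the tangent vector at parameter t."""
    gx, gy = grad_F(*curve(t))
    tx, ty = tangent(t)
    return gx * tx + gy * ty

if __name__ == "__main__":
    for t in (0.0, 0.7, 2.0, 4.5):
        print(t, grad_dot_tangent(t))
```

Each dot product should be zero up to rounding, which is what orthogonality of the gradient to the level curve predicts.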
1 New solutions for old problems
We can also use our new chain rule to revisit problems from our previous studies of calculus. Our new tools allow for simpler solutions to these problems.
1.1 Differentiating integrals
Recall the following form of the Fundamental Theorem of Calculus:
\[
\dd {x} \int _a^x f(t)\,dt = f(x).
\]
It is easy to use the Fundamental Theorem of Calculus to differentiate integrals, when the limits of integration are a constant and a variable. However, when the limits are functions, things get more complicated. The multivariable chain rule helps out in these situations.
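Now, write \(G(a,b) = \int _a^b f(t)\,dt\), where \(a=a(x)\) and \(b=b(x)\) are the limits of integration. By the Fundamental Theorem of Calculus,
\[
\frac {\partial G}{\partial b} = f(b).
\]
And, since swapping the limits of integration changes the sign of the integral,
\[
\frac {\partial G}{\partial a} = -f(a).
\]
So the chain rule gives
\[
\dd {x}\int _{a(x)}^{b(x)} f(t)\,dt = f\big (b(x)\big )b'(x) - f\big (a(x)\big )a'(x).
\]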
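As a sanity check on this use of the chain rule, here is a minimal sketch with assumed, purely illustrative choices: integrand \(f(t)=e^{-t^2}\), lower limit \(a(x)=x\), and upper limit \(b(x)=x^2\). It compares \(f\big (b(x)\big )b'(x)-f\big (a(x)\big )a'(x)\) with a finite difference of a numerically computed integral (Simpson's rule).

```python
import math

def f(t):
    """Assumed integrand for illustration."""
    return math.exp(-t * t)

def a(x):
    """Assumed lower limit of integration."""
    return x

def b(x):
    """Assumed upper limit of integration."""
    return x * x

def integrate(func, lo, hi, n=1000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3

def G(x):
    """Integral of f from a(x) to b(x)."""
    return integrate(f, a(x), b(x))

def dG_dx_chain(x):
    """Chain rule result, using b'(x) = 2x and a'(x) = 1."""
    return f(b(x)) * 2 * x - f(a(x)) * 1

def dG_dx_numeric(x, h=1e-5):
    """Central difference of the numerically integrated G."""
    return (G(x + h) - G(x - h)) / (2 * h)

if __name__ == "__main__":
    for x in (0.5, 1.5, 2.0):
        print(x, dG_dx_chain(x), dG_dx_numeric(x))
```

The two columns should agree to several decimal places; the remaining gap comes from the numerical integration and the finite difference.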
1.2 Implicit differentiation
We’ve used implicit differentiation to compute \(\dd [y]{x}\) when \(y\) is given as an implicit function of \(x\). Now we’ll revisit this with the chain rule and give a new, simpler method of finding \(\dd [y]{x}\).
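For instance, consider the implicit function \(x^2y-xy^3=3\). We learned to use the following steps to find \(\dd [y]{x}\):
\begin{align*}
\dd {x}\big (x^2y-xy^3\big ) &= \dd {x}(3)\\
2xy + x^2\dd [y]{x} - y^3 - 3xy^2\dd [y]{x} &= 0\\
\dd [y]{x} &= -\frac {2xy-y^3}{x^2-3xy^2}.
\end{align*}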
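Instead of using this method, consider \(z=x^2y-xy^3\). The implicit function above describes the level curve \(z=3\). Considering \(x\) and \(y\) as functions of \(x\), the chain rule states that
\[
\dd [z]{x} = \frac {\partial z}{\partial x}\dd [x]{x} + \frac {\partial z}{\partial y}\dd [y]{x}.
\]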
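Since \(z\) is constant (in our example, \(z=3\)), \(\dd [z]{x} = 0\). We also know \(\dd [x]{x} = 1\). Write with me,
\begin{align*}
0 &= \frac {\partial z}{\partial x}(1) + \frac {\partial z}{\partial y}\dd [y]{x}\\
\dd [y]{x} &= -\frac {\partial z/\partial x}{\partial z/\partial y} = -\frac {2xy-y^3}{x^2-3xy^2}.
\end{align*}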
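Note how our solution for \(\dd [y]{x}\) above is just the partial derivative of \(z\) with respect to \(x\), divided by the partial derivative of \(z\) with respect to \(y\), all multiplied by \(-1\). We state the above as a theorem: if \(F(x,y)=c\) defines \(y\) implicitly as a differentiable function of \(x\) and \(\frac {\partial F}{\partial y}\ne 0\), then
\[
\dd [y]{x} = -\frac {\partial F/\partial x}{\partial F/\partial y}.
\]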
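The quotient of partial derivatives can be tested numerically on the curve \(x^2y-xy^3=3\). This is a rough sketch under stated assumptions: the point near \(x=2\), \(y\approx -1.7\), the Newton solver `solve_y`, and all function names are illustrative choices, not part of the text. It solves for \(y\) at nearby values of \(x\) and compares the resulting slope with \(-\frac {\partial z/\partial x}{\partial z/\partial y}\).

```python
def F(x, y):
    return x**2 * y - x * y**3

def F_x(x, y):
    """Partial derivative of F with respect to x."""
    return 2 * x * y - y**3

def F_y(x, y):
    """Partial derivative of F with respect to y."""
    return x**2 - 3 * x * y**2

def solve_y(x, y_guess, c=3.0, steps=50):
    """Newton's method in y for F(x, y) = c, with x held fixed."""
    y = y_guess
    for _ in range(steps):
        y -= (F(x, y) - c) / F_y(x, y)
    return y

def dy_dx_formula(x, y):
    """Slope from the quotient of partial derivatives."""
    return -F_x(x, y) / F_y(x, y)

def dy_dx_numeric(x, y_guess, h=1e-6):
    """Central difference of y(x), re-solving the level curve at x +/- h."""
    return (solve_y(x + h, y_guess) - solve_y(x - h, y_guess)) / (2 * h)

if __name__ == "__main__":
    x0 = 2.0
    y0 = solve_y(x0, -1.7)  # a point on the level curve near the guess
    print(y0, dy_dx_formula(x0, y0), dy_dx_numeric(x0, -1.7))
```

The starting guess matters: Newton's method only finds a nearby point on the curve, and the comparison is meaningful only where \(\frac {\partial z}{\partial y}\ne 0\).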
Try your hand at this.