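We’ll begin by recalling the chain rule from single variable calculus. If we have differentiable functions $f:\mathbb{R}\to\mathbb{R}$ and $g:\mathbb{R}\to\mathbb{R}$, then we can compute the derivative of the composition $f\circ g$ as $(f\circ g)'(x) = f'(g(x))\,g'(x)$.
We can also use the chain rule to differentiate the composition $g\circ f$, which gives $(g\circ f)'(x) = g'(f(x))\,f'(x)$.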
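As a quick sanity check of the single variable rule, the sketch below compares the chain rule formula with a finite-difference estimate. The functions `f(u) = sin(u)` and `g(x) = x**2`, the test point `x0`, and the helper `central_difference` are illustrative assumptions, not taken from the text.

```python
import math

# Hypothetical example functions, chosen only for illustration.
def f(u):
    return math.sin(u)

def g(x):
    return x ** 2

def f_prime(u):
    return math.cos(u)

def g_prime(x):
    return 2 * x

def central_difference(func, x, step=1e-6):
    """Approximate func'(x) with a symmetric difference quotient."""
    return (func(x + step) - func(x - step)) / (2 * step)

def composition(x):
    """The composition (f o g)(x) = f(g(x))."""
    return f(g(x))

x0 = 1.3  # arbitrary test point
chain_rule_value = f_prime(g(x0)) * g_prime(x0)      # f'(g(x0)) * g'(x0)
numeric_value = central_difference(composition, x0)  # direct numerical estimate

print(chain_rule_value, numeric_value)
```

The two printed values should agree to several decimal places; the small gap between them comes from the finite-difference approximation, not from the chain rule.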
The Chain Rule
The multi-variable chain rule is similar, with the derivative matrix taking the place of the single variable derivative, so that the chain rule will involve matrix multiplication. We also need to pay extra attention to whether the composition of functions is even defined.
In particular, suppose we have functions $f:\mathbb{R}^n\to\mathbb{R}^m$ and $g:\mathbb{R}^k\to\mathbb{R}^p$. In order for the composition $g\circ f$ to be defined, the outputs of $f$ need to be sensible inputs for $g$. This means that we would need $k=m$, so that $g:\mathbb{R}^m\to\mathbb{R}^p$.
If $g$ isn’t defined on all of $\mathbb{R}^m$, so that $g:Y\subset\mathbb{R}^m\to\mathbb{R}^p$, then the range of $f$ would need to be contained in $Y$ in order for the composition $g\circ f$ to be defined. Alternatively, we could restrict the domain of $f$ to ensure that the range of $f$ is contained in the domain of $g$.
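Suppose $f:X\subset\mathbb{R}^n\to\mathbb{R}^m$ is differentiable at a point $\mathbf{a}\in X$, that $g:Y\subset\mathbb{R}^m\to\mathbb{R}^p$ is differentiable at $f(\mathbf{a})$, and that the range of $f$ is contained in $Y$. Then the composition $g\circ f$ is differentiable at $\mathbf{a}$, and
$$D(g\circ f)(\mathbf{a}) = Dg(f(\mathbf{a}))\,Df(\mathbf{a}).$$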
Although the conditions sound complicated, essentially they’re just requiring that all of the derivatives mentioned actually exist. Note the similarities to the single variable chain rule.
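To see the matrix form of the chain rule, $D(g\circ f)(\mathbf{a}) = Dg(f(\mathbf{a}))\,Df(\mathbf{a})$, in action, here is a minimal numerical sketch. The maps `f` (from $\mathbb{R}^2$ to $\mathbb{R}^3$) and `g` (from $\mathbb{R}^3$ to $\mathbb{R}^2$), the point `a`, and the helpers `matmul` and `numeric_jacobian` are all assumptions made up for this illustration.

```python
# Hypothetical maps f: R^2 -> R^3 and g: R^3 -> R^2, for illustration only.
def f(p):
    x, y = p
    return [x * y, x + y, x ** 2]

def g(q):
    u, v, w = q
    return [u + v * w, u * v]

def Df(p):
    """3x2 derivative matrix of f, computed by hand."""
    x, y = p
    return [[y, x], [1.0, 1.0], [2 * x, 0.0]]

def Dg(q):
    """2x3 derivative matrix of g, computed by hand."""
    u, v, w = q
    return [[1.0, w, v], [v, u, 0.0]]

def matmul(A, B):
    """Multiply matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, p, step=1e-6):
    """Approximate the derivative matrix of func at p by central differences."""
    n_out = len(func(p))
    jac = [[0.0] * len(p) for _ in range(n_out)]
    for j in range(len(p)):
        plus, minus = list(p), list(p)
        plus[j] += step
        minus[j] -= step
        f_plus, f_minus = func(plus), func(minus)
        for i in range(n_out):
            jac[i][j] = (f_plus[i] - f_minus[i]) / (2 * step)
    return jac

a = [0.7, -1.2]  # arbitrary test point
chain_rule_matrix = matmul(Dg(f(a)), Df(a))               # Dg(f(a)) Df(a)
numeric_matrix = numeric_jacobian(lambda p: g(f(p)), a)   # D(g o f)(a), estimated

print(chain_rule_matrix)
print(numeric_matrix)
```

Note that the product is a $2\times 3$ matrix times a $3\times 2$ matrix, so the shapes only line up because the outputs of `f` are valid inputs for `g`.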
A Special Case
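We’ll now consider a special case of the chain rule, when we have a composition of functions $f:\mathbb{R}^n\to\mathbb{R}$ and $\mathbf{x}:\mathbb{R}\to\mathbb{R}^n$. Note that $f$ is a scalar function, and we can think of $\mathbf{x}$ as a curve in $\mathbb{R}^n$.
Let’s look at what the chain rule tells us in this case. For any $t\in\mathbb{R}$, we have
$$D(f\circ\mathbf{x})(t) = Df(\mathbf{x}(t))\,D\mathbf{x}(t).$$
Writing $\mathbf{x}$ in terms of its components, $\mathbf{x}(t) = (x_1(t),\dots,x_n(t))$, we have
$$D\mathbf{x}(t) = \begin{pmatrix} \dfrac{\partial x_1}{\partial t}(t) \\ \vdots \\ \dfrac{\partial x_n}{\partial t}(t) \end{pmatrix}.$$
Since $\mathbf{x}$ only has one input variable, we can rewrite this as
$$D\mathbf{x}(t) = \begin{pmatrix} x_1'(t) \\ \vdots \\ x_n'(t) \end{pmatrix}.$$
Now that we’ve sorted out $D\mathbf{x}(t)$, let’s consider $Df(\mathbf{x}(t))$. Since $f$ is a scalar-valued function, $Df$ will consist of only one row,
$$Df(x_1,\dots,x_n) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1} & \cdots & \dfrac{\partial f}{\partial x_n} \end{pmatrix}.$$
For $Df(\mathbf{x}(t))$, we would evaluate these partial derivatives at $\mathbf{x}(t)$:
$$Df(\mathbf{x}(t)) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{x}(t)) & \cdots & \dfrac{\partial f}{\partial x_n}(\mathbf{x}(t)) \end{pmatrix}.$$
Now let’s turn our attention back to the composition $f\circ\mathbf{x}$. Putting together our results from above, we have
$$D(f\circ\mathbf{x})(t) = \begin{pmatrix} \dfrac{\partial f}{\partial x_1}(\mathbf{x}(t)) & \cdots & \dfrac{\partial f}{\partial x_n}(\mathbf{x}(t)) \end{pmatrix} \begin{pmatrix} x_1'(t) \\ \vdots \\ x_n'(t) \end{pmatrix}.$$
Since $f\circ\mathbf{x}$ is a single variable function, its derivative matrix at $t$ only has one entry, which is $\dfrac{d}{dt}f(\mathbf{x}(t))$. So, we can rewrite the above as
$$\frac{d}{dt}f(\mathbf{x}(t)) = \frac{\partial f}{\partial x_1}(\mathbf{x}(t))\,x_1'(t) + \cdots + \frac{\partial f}{\partial x_n}(\mathbf{x}(t))\,x_n'(t).$$
This gives us a special case of the Chain Rule, which can be useful when we have a composition of functions $f\circ\mathbf{x}$ with $\mathbf{x}:\mathbb{R}\to\mathbb{R}^n$ and $f:\mathbb{R}^n\to\mathbb{R}$.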
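The special case for a scalar function along a curve can be checked numerically as well. In the sketch below, the scalar function `f(x, y) = x**2 * y` and the curve `(cos t, sin t)` are hypothetical choices made for this illustration; the derivative along the curve is computed once as a sum of partial derivatives times component derivatives, and once by a finite difference.

```python
import math

# Hypothetical scalar function f: R^2 -> R, for illustration only.
def f(x, y):
    return x ** 2 * y

def gradient_f(x, y):
    """Partial derivatives (df/dx, df/dy), computed by hand."""
    return (2 * x * y, x ** 2)

# Hypothetical curve x: R -> R^2 (the unit circle), for illustration only.
def curve(t):
    return (math.cos(t), math.sin(t))

def curve_velocity(t):
    """Componentwise derivative (x'(t), y'(t)) of the curve."""
    return (-math.sin(t), math.cos(t))

def chain_rule_derivative(t):
    """d/dt f(x(t)) as the sum of partials times component derivatives."""
    x, y = curve(t)
    fx, fy = gradient_f(x, y)
    dx, dy = curve_velocity(t)
    return fx * dx + fy * dy

def numeric_derivative(t, step=1e-6):
    """Central-difference estimate of d/dt f(x(t))."""
    return (f(*curve(t + step)) - f(*curve(t - step))) / (2 * step)

t0 = 0.9  # arbitrary test value
print(chain_rule_derivative(t0), numeric_derivative(t0))
```

Here the row matrix of partial derivatives times the column of component derivatives collapses to an ordinary dot product, which is why the code never needs a general matrix multiplication.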
Examples
Which composition(s) exist?
We’ll compute the derivative matrix in two ways: using the chain rule, and directly.
Let’s begin by using the chain rule. We’ll have so we’ll start by computing the derivative matrices and .
Now, for , we need to input into .
To compute , we multiply matrices, and obtain
Let’s verify our answer by computing the derivative matrix directly, without using the chain rule. We’ll begin by finding a formula for the composition.
This simplifies to . We can then compute the derivative matrix. We see that this gives the same result as using the chain rule.
Then, from the chain rule, we have
Since we then have that
and