In this section, we will study optimization subject to constraints. For example, suppose we would like to find the maximum value of subject to the constraint . We can think of this as restricting ourselves to the line , and finding the maximum value of over this line.
These types of problems often arise in real-world applications: you have some limitation on resources, which provides the constraint, and you want to optimize some measure of performance. A classical problem from geometry and single variable calculus is to maximize the area of a rectangle, subject to the constraint that the perimeter is . Setting up the problem algebraically, this is equivalent to finding the maximum value of subject to the constraint . In single variable calculus, we could solve for in the second equation, and use this to reduce the objective function, , to a single variable function, which we would then optimize. That is, from , we would have
Differentiating , we get So, we have when . The second derivative of is , so has a local maximum at . Since is continuous on all of and has only one critical point, this means that has an absolute maximum at . So, the maximum value of is
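To make this concrete computationally, here is a small SymPy sketch of the same substitute-and-differentiate approach. Because the specific perimeter value above is not reproduced here, the sketch assumes a perimeter of 20 purely for illustration; the variable names are likewise only illustrative.

```python
import sympy as sp

# Single-variable version: maximize the area A = x*y of a rectangle whose
# perimeter is fixed. The perimeter value 20 is an assumption chosen only
# for this sketch; substitute whatever value your problem specifies.
x = sp.symbols('x', positive=True)
P = sp.Integer(20)
y = P/2 - x                    # solve the constraint 2x + 2y = P for y
A = x * y                      # area as a single-variable function of x

dA = sp.diff(A, x)
critical = sp.solve(dA, x)     # one critical point, x = P/4
print(critical)                # [5]
print(sp.diff(A, x, 2))        # -2 < 0, so this is a local (and absolute) maximum
print(A.subs(x, critical[0]))  # maximum area: 25 when P = 20
```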
For multivariable functions, optimization subject to a constraint will follow a similar process: use the constraint to make a substitution, and then optimize the resulting function. However, we’ll soon see that this can be much more complicated than it was in the single variable case, and it requires paying careful attention to constraints and domains.
Optimization with Constraints
Let’s consider a higher dimensional version of the problem from the introduction. That is, let’s try to find the maximum value of subject to the constraint that . Taking the constraint and solving for , we have . Using substitution, we can reduce the function to a function of two variables,
So, we’d like to optimize . In order to do this, we’ll find the critical points of . For this, we first find the gradient of . Solving , we find two critical points: and . Let’s use the Hessian matrix of to determine the behavior of at these critical points. At , the Hessian matrix is This corresponds to the quadratic form , which is indefinite. So, has a saddle point at .
At , the Hessian matrix is Using Sylvester’s Theorem, this matrix is negative definite. So, has a local maximum at . But does this mean that has an absolute maximum at ? Let’s look at the graph of to investigate.
We see that actually doesn’t have any absolute maximum (or minimum): its values get arbitrarily large, and arbitrarily small. So, here we see an example where the multivariable situation is more nuanced than the single variable situation. Often, it will be difficult to determine whether multivariable functions have absolute extrema using the methods that we’ve covered so far. However, in the case where we’re optimizing a continuous function over a compact region, we know that absolute extrema exist. Because of this fact, we will focus most of our attention on this special situation.
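To make the classification step concrete, the following SymPy sketch runs the same workflow (gradient, critical points, Hessian, definiteness) on an illustrative stand-in. The function 3x - x^3 - y^2 is an assumption chosen only because, like the reduced function above, it has exactly two critical points, one a saddle and one a local maximum, and no absolute extrema; it is not the function from this problem.

```python
import sympy as sp

# Stand-in example (assumed, not the text's reduced function): one saddle,
# one local maximum, and no absolute extrema.
x, y = sp.symbols('x y')
g = 3*x - x**3 - y**2

grad = [sp.diff(g, v) for v in (x, y)]
crit = sp.solve(grad, [x, y], dict=True)   # [{x: -1, y: 0}, {x: 1, y: 0}]

H = sp.hessian(g, (x, y))
for pt in crit:
    eigs = H.subs(pt).eigenvals()          # eigenvalue signs classify the point
    print(pt, eigs)
# {x: -1, y: 0}: eigenvalues 6, -2  -> indefinite, saddle point
# {x:  1, y: 0}: eigenvalues -6, -2 -> negative definite, local maximum
```

Equivalently, one can apply Sylvester's Theorem here by checking the signs of the leading principal minors of the Hessian rather than computing eigenvalues.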
With that in mind, let’s consider a revised version of our previous problem.
If we consider the set of such that , and , we see that this is a triangle in , and that it is a compact region. Since is a continuous function, the Extreme Value Theorem guarantees that it will achieve an absolute maximum and minimum. We could optimize this function using our earlier methods for compact regions, but we’ll instead optimize using substitution.
Substituting , we reduce to a function of two variables,
Keeping our previous constraints in mind, we want to optimize over the region . Graphing this region in , we see that is a compact region, hence has an absolute maximum and absolute minimum over .
We find the gradient of to find the critical points in , and solving , we find two critical points: and . Evaluating at these critical points, we find
Now, let’s examine on the boundary of . The boundary of consists of three line segments, along which we have , , or . In each of these cases, , so is zero along the entire boundary of .
Comparing the values that we’ve found, we see that has an absolute maximum of at , and an absolute minimum of which occurs along the line segments , , and .
Translating this back to our original function , the absolute maximum of subject to the constraint for nonnegative is , which occurs at the point . The absolute minimum is , which occurs when , , or .
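As a rough computational sketch of this interior-versus-boundary strategy, the following uses an assumed stand-in rather than the problem's own data: g(x, y) = xy(1 - x - y) on the triangle x >= 0, y >= 0, x + y <= 1, which, like the example above, vanishes on all three boundary segments.

```python
import sympy as sp

# Hypothetical stand-in for the compact-region strategy: compare values at
# interior critical points with values on the boundary of the triangle.
x, y = sp.symbols('x y')
g = x*y*(1 - x - y)

# Interior critical points of g.
crit = sp.solve([sp.diff(g, x), sp.diff(g, y)], [x, y], dict=True)
interior = [p for p in crit if p[x] > 0 and p[y] > 0 and p[x] + p[y] < 1]
print([(p, g.subs(p)) for p in interior])   # [({x: 1/3, y: 1/3}, 1/27)]

# Boundary segments x = 0, y = 0, and x + y = 1 each force g to vanish.
print(g.subs(x, 0), g.subs(y, 0), sp.simplify(g.subs(y, 1 - x)))   # 0 0 0
```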
We’ll now look at another example where optimizing a multivariable function subject to a constraint using substitution is deceptively difficult.
Since is a compact region, and is a continuous function, the Extreme Value Theorem tells us that will have an absolute maximum and minimum.
Let’s rewrite the constraint as , and substitute this into to reduce to a function of a single variable:
Differentiating , we obtain This has one critical point, . Taking the second derivative of , we have so has a local maximum at . Since only has one critical point and it’s a local maximum, has an absolute maximum at . Furthermore, has no absolute minimum.
However, we said that must have an absolute minimum, so something must be wrong! Here, we’ve lost track of some of the information contained in our constraint. Consider again Since for all , we must have . So, we should really be optimizing over the closed interval . We’ve already found the critical point, , of . Comparing the value of at this point and the endpoints of the interval, we have
So, we see that has an absolute maximum of at , and an absolute minimum of at and .
Translating this back to our original function, has an absolute maximum of at , and an absolute minimum of at .
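The pitfall above can also be checked symbolically: after substituting, the reduced single-variable function must be optimized over the interval that the constraint actually allows, with the endpoints included as candidates. The specific choices below, f(x, y) = y^2 - x^2 with the constraint x^2 + y^2 = 1, are assumptions made only to illustrate the pattern.

```python
import sympy as sp

# Assumed example: substituting y**2 = 1 - x**2 into f(x, y) = y**2 - x**2
# gives h(x) = 1 - 2*x**2, but the constraint forces 1 - x**2 >= 0, so we
# must optimize h over the closed interval [-1, 1], not over all of R.
x = sp.symbols('x', real=True)
h = (1 - x**2) - x**2

crit = sp.solve(sp.diff(h, x), x)   # [0], a local maximum of h
candidates = crit + [-1, 1]         # critical point plus the interval's endpoints
print([(c, h.subs(x, c)) for c in candidates])
# [(0, 1), (-1, -1), (1, -1)]: absolute max 1 at x = 0, absolute min -1 at x = ±1
```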
We can see this reflected in the graph of over : the constrained values of lie along the edge of the saddle shape graphed below.
The moral of this section is: weird things can happen with multivariable functions, and you need to be very careful with constraints.