LU Factorization
Consider the equation:
This equation is unusual in that it involves two matrices on the left-hand side. If we
multiply the matrices together, we get
Gaussian elimination yields:
We conclude that
Now that we know the answer, we will turn to the original question and consider the
advantages of the format of the original problem. Observe that the two matrices have
a distinct pattern. The matrix on the left has zeros above the main diagonal, while
the matrix on the right has zeros below the main diagonal. Matrices that follow
such a pattern are called lower triangular and upper triangular matrices,
respectively.
Let $\vec{y} = U\vec{x}$, where $U$ is the upper triangular matrix. Then we can write the original equation as
The equation
corresponds to the system
This system can be easily solved by forward-substitution, giving us
So,
Recall that we let $\vec{y} = U\vec{x}$. The equation
corresponds to the system
which can be easily solved by back-substitution:
Thus, . Observe that this is the same answer that we obtained at the beginning of the problem using Gaussian elimination.
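The pattern behind these two steps is worth recording in general. For a generic $3 \times 3$ lower triangular system with ones on the diagonal (the entries $\ell_{ij}$ and $b_{i}$ here are placeholders, not the numbers from this exploration), forward-substitution reads
\begin{equation*} \begin{bmatrix} 1 & 0 & 0 \\ \ell_{21} & 1 & 0 \\ \ell_{31} & \ell_{32} & 1 \end{bmatrix} \begin{bmatrix} y_{1} \\ y_{2} \\ y_{3} \end{bmatrix} = \begin{bmatrix} b_{1} \\ b_{2} \\ b_{3} \end{bmatrix} \quad\Longrightarrow\quad y_{1}=b_{1},\quad y_{2}=b_{2}-\ell_{21}y_{1},\quad y_{3}=b_{3}-\ell_{31}y_{1}-\ell_{32}y_{2}, \end{equation*}
and the upper triangular system for $\vec{x}$ is solved the same way, working from the bottom row up.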
Given a matrix equation such as the one of Exploration init:LUfactorization, in practice we often express the matrix on the left as a product of a lower triangular matrix and an upper triangular matrix, and use substitution to solve the equation instead of using Gaussian elimination to find the solution. In Exploration init:LUfactorization,
The process of taking a matrix and expressing it as a product of a lower triangular matrix and an upper triangular matrix is called $LU$ factorization. We also adopt the (common) convention that $L$ is a matrix with a diagonal consisting entirely of $1$'s, which is often called a lower unit triangular matrix.
Not every matrix has an $LU$ factorization, but we will see that we can correct for that.
Consider the system
We can express this system as the matrix equation $A\vec{x} = \vec{b}$, and we get
Unfortunately, for reasons we will explain below, it is not possible to find an $LU$ factorization of this coefficient matrix. However, if we simply swap the first two equations:
we are able to find an $LU$ factorization of the coefficient matrix. (How we actually come up with an $LU$ factorization of a matrix will be discussed after this example.)
To understand how an $LU$ factorization can be used, it is helpful to think of this system of equations as a matrix equation.
So if we let $\vec{y} = U\vec{x}$ and also write $L\vec{y} = \vec{b}$, then we are able to solve for $\vec{y}$ using forward-substitution.
See if you can do it!
Once we have the values of $\vec{y}$, since $\vec{y} = U\vec{x}$, all that remains is to solve the following matrix equation using back-substitution.
By now you are quite used to solving this kind of problem...
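To make the two passes concrete, here is a minimal Python sketch of forward-substitution followed by back-substitution. The matrices L and U and the vector b below are hypothetical placeholders, not the ones from this example.
\begin{verbatim}
import numpy as np

def forward_substitution(L, b):
    # Solve L y = b, where L is lower triangular with ones on the diagonal.
    n = len(b)
    y = np.zeros(n)
    for i in range(n):
        # Subtract the contributions of the entries of y already computed.
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_substitution(U, y):
    # Solve U x = y, where U is upper triangular with nonzero diagonal.
    n = len(y)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

# Hypothetical data (not the system from the text).
L = np.array([[ 1.0, 0.0, 0.0],
              [ 2.0, 1.0, 0.0],
              [-1.0, 3.0, 1.0]])
U = np.array([[2.0, 1.0, -1.0],
              [0.0, 4.0,  2.0],
              [0.0, 0.0,  3.0]])
b = np.array([1.0, 2.0, 3.0])

y = forward_substitution(L, b)    # first solve L y = b
x = back_substitution(U, y)       # then solve U x = y
print(np.allclose(L @ U @ x, b))  # x solves the original system
\end{verbatim}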
Finding an $LU$ factorization by Inspection
Find an $LU$ factorization of
One way to find the $LU$ factorization is to simply look for it directly. We need
By multiplying the latter matrices, we get
and from this, it is a simple exercise to determine each of the unknowns. For example, from the first column, we get
and then
See if you can continue to determine the entire $LU$ factorization of the matrix.
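To see the idea in the smallest possible case, consider a generic $2 \times 2$ matrix (a hypothetical illustration, not the matrix of this problem):
\begin{equation*} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ \ell_{21} & 1 \end{bmatrix} \begin{bmatrix} u_{11} & u_{12} \\ 0 & u_{22} \end{bmatrix} = \begin{bmatrix} u_{11} & u_{12} \\ \ell_{21}u_{11} & \ell_{21}u_{12}+u_{22} \end{bmatrix}. \end{equation*}
Comparing entries gives $u_{11}=a$ and $u_{12}=b$, then $\ell_{21}=c/a$ (provided $a \neq 0$), and finally $u_{22}=d-\ell_{21}b$.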
Finding an $LU$ factorization by the Multiplier Method
In the following example we demonstrate a method for computing an $LU$ factorization known as the multiplier method.
Find an $LU$ factorization for
Write the matrix as the following product.
In the matrix on the right, we begin with the left column and zero out the entries below the top entry using the row operation which involves adding a multiple of one row to another row. As we do this, we also update the matrix on the left so that the product remains unchanged.
Here is the first step. We take times the top row and add it to the second row in the matrix on the right. Then we place a in the second row, first column of the matrix on the left, so that the product is unchanged. This gives us
The next step is to take times the top row and add it to the bottom row in the matrix on the right. To ensure that the product is unchanged, we place a in the bottom left corner of the matrix on the left. Thus the second step yields
For our final step, we take times the middle row of the matrix on the right and add it to the bottom row. Updating the matrix on the left in the manner we did earlier, we get
At this point, we can stop: we have an $LU$ factorization of the original matrix. We can always multiply the matrices to check, if we wish.
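The bookkeeping in this example is mechanical enough to automate. The following is a minimal Python sketch of the multiplier method, under the assumption that no row interchanges are needed (every pivot it meets is nonzero); the matrix A at the bottom is a hypothetical example, not the one from the text.
\begin{verbatim}
import numpy as np

def lu_multiplier_method(A):
    # Factor A = L U by the multiplier method (no row interchanges).
    # Assumes every pivot encountered is nonzero.
    A = A.astype(float)
    n = A.shape[0]
    L = np.eye(n)      # the matrix on the left starts as the identity
    U = A.copy()       # the matrix on the right becomes upper triangular
    for j in range(n - 1):
        for i in range(j + 1, n):
            m = -U[i, j] / U[j, j]   # the multiplier for entry (i, j)
            U[i, :] += m * U[j, :]   # add m times row j to row i
            L[i, j] = -m             # record -1 times the multiplier on the left
    return L, U

# Hypothetical example matrix.
A = np.array([[ 2.0, 1.0, 1.0],
              [ 4.0, 3.0, 1.0],
              [-2.0, 1.0, 5.0]])
L, U = lu_multiplier_method(A)
print(np.allclose(L @ U, A))   # multiply to check, as suggested above
\end{verbatim}
Note that the signs match the discussion that follows: the multiplier $m$ is used to zero out an entry on the right, and $-m$ is what gets stored on the left.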
Notice that when we perform the multiplier method, we are making repeated use of a certain type of elementary row operation, namely, adding a scalar multiple of one row to a row below it. The reason this helps to create an $LU$ factorization depends upon the fact that the elementary matrices corresponding to such operations are lower triangular. To understand how this works, we begin with a lemma.
Let $L$ be a lower (upper) triangular matrix which has ones down the main diagonal. Then $L^{-1}$ is also a lower (upper) triangular matrix which has ones down the main diagonal. In the case that $L$ is of the form \begin{equation} \label{4nove1h} L= \begin{bmatrix} 1 & & & \\ a_{1} & 1 & & \\ \vdots & & \ddots & \\ a_{n} & & & 1 \end{bmatrix}, \end{equation}
where all entries are zero except for the left column and main diagonal, it is also the case that $L^{-1}$ is obtained from $L$ by simply multiplying each entry below the main diagonal in $L$ by $-1$. The same is true if the single nonzero column is in another position.
Proof
Consider the usual setup for finding the inverse, $\left[ \begin{array}{c|c} L & I \end{array} \right]$. Each row operation used on $L$ to transform this matrix to reduced row echelon form changes only the entries in $I$ below the main diagonal. Whether we have the special case of $L$ given in 4nove1h, where the nonzero nondiagonal entries are in the left column, or the case where the single nonzero column is in another position, it is clear that multiplication by $-1$ as described in the lemma gives us $L^{-1}$.
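For example, in the $3 \times 3$ version of the special case (with placeholder entries $a_{1}, a_{2}$):
\begin{equation*} \begin{bmatrix} 1 & 0 & 0 \\ a_{1} & 1 & 0 \\ a_{2} & 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ -a_{1} & 1 & 0 \\ -a_{2} & 0 & 1 \end{bmatrix}, \end{equation*}
as direct multiplication of the two matrices confirms.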
Now let $A$ be an $m \times n$ matrix, say \begin{equation*} A= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}, \end{equation*}
and assume $A$ can be row reduced to an upper triangular form using only row operation 3. Thus, in particular, $a_{11} \neq 0$. Multiply $A$ on the left by \begin{equation*} E_{1}= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ - \frac{a_{21}}{a_{11}} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a_{m1}}{a_{11}} & 0 & \cdots & 1 \end{bmatrix} \end{equation*}
This is the product of elementary matrices which make modifications in the first column only. It is equivalent to taking $-\frac{a_{21}}{a_{11}}$ times the first row and adding it to the second, then taking $-\frac{a_{31}}{a_{11}}$ times the first row and adding it to the third, and so forth. The quotients in the first column of the above matrix are the multipliers. Thus the result is of the
form \begin{equation*} E_{1}A= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n}^{\prime } \\ 0 & a_{22}^{\prime } & \cdots & a_{2n}^{\prime } \\ \vdots & \vdots & & \vdots \\ 0 & a_{m2}^{\prime } & \cdots & a_{mn}^{\prime } \end{bmatrix} \end{equation*}
By assumption, $a_{22}^{\prime } \neq 0$, and so it is possible to use this entry to zero out all the entries
below it in the matrix on the right by multiplication by a matrix of the form
\begin{equation*} E_{2}= \begin{bmatrix} 1 & \vec{0} \\ \vec{0} & E \end{bmatrix} \end{equation*}
where $E$ is an $(m-1) \times (m-1)$ matrix of the form \begin{equation*} E=\begin{bmatrix} 1 & 0 & \cdots & 0 \\ -\frac{a_{32}^{\prime }}{a_{22}^{\prime }} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -\frac{a_{m2}^{\prime }}{a_{22}^{\prime }} & 0 & \cdots & 1 \end{bmatrix} \end{equation*}
Again, the entries in the first column below the 1 are the multipliers. Continuing this
way, zeroing out the entries below the diagonal entries, finally leads us to
\begin{equation*} E_{m-1}E_{m-2}\cdots E_{1}A=U \end{equation*}
where $U$ is upper triangular. Each $E_{i}$ has all ones down the main diagonal and is lower triangular. Now we can multiply both sides by the inverses of the $E_{i}$ in the reverse order. This yields \begin{equation*} A=E_{1}^{-1}E_{2}^{-1}\cdots E_{m-1}^{-1}U \end{equation*}
By Lemma lem:multipliermethodtriangularmatrices, this implies that the product of those $E_{i}^{-1}$ is a lower triangular matrix having all ones down the main diagonal.
The above discussion and lemma explain how the multiplier method works. The
expressions \begin{equation*} -\frac{a_{21}}{a_{11}},-\frac{a_{31}}{a_{11}},\cdots , -\frac{a_{m1}}{a_{11}}, \end{equation*}
which we obtained in building $E_{1}$, and which we denote respectively by $M_{21}, M_{31}, \cdots , M_{m1}$ to save notation, are the multipliers. Therefore, according to the lemma, to find $E_{1}^{-1}$ we simply
write \begin{equation*} \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{bmatrix} \end{equation*}
Similar considerations apply to the other $E_{i}^{-1}$. Thus $L$ is a product of the form
\begin{equation*} \begin{bmatrix} 1 & 0 & \cdots & 0 \\ -M_{21} & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ -M_{m1} & 0 & \cdots & 1 \end{bmatrix} \cdots \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & 0 & \ddots & \vdots \\ 0 & \cdots & -M_{m(m-1)} & 1 \end{bmatrix}, \end{equation*}
where each factor has at most one nonzero column, the position of which
moves from left to right as we scan the above product of matrices from left
to right. It follows from what we know about the effect of multiplying on
the left by an elementary matrix that the above product is of the form \begin{equation*} \begin{bmatrix} 1 & 0 & \cdots & 0 & 0 \\ -M_{21} & 1 & \cdots & 0 & 0 \\ \vdots & -M_{32} & \ddots & \vdots & \vdots \\ -M_{(m-1)1} & \vdots & \cdots & 1 & 0 \\ -M_{m1} & -M_{m2} & \cdots & -M_{m(m-1)} & 1 \end{bmatrix} \end{equation*}
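In the $3 \times 3$ case, for instance, this is the identity
\begin{equation*} \begin{bmatrix} 1 & 0 & 0 \\ -M_{21} & 1 & 0 \\ -M_{31} & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -M_{32} & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ -M_{21} & 1 & 0 \\ -M_{31} & -M_{32} & 1 \end{bmatrix}, \end{equation*}
which can be checked by direct multiplication: each factor simply contributes its own column of multipliers.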
To sum up the procedure, beginning at the left column and moving toward the right, you simply insert, into the corresponding position in the identity matrix, $-1$ times the multiplier which was used to zero out an entry in that position below the main diagonal in $A$, while retaining the main diagonal, which consists entirely of ones. This will give us $L$ as we create $U$ using row operation 3.
We now return to Example ex:usingLU to understand why we could not use the multiplier method to find an $LU$ factorization of the coefficient matrix. Suppose we had written
Our first step of Gaussian elimination would require a row swap to get a
nonzero entry into the top left corner (we swapped rows 1 and 2 in Example
ex:usingLU). Unfortunately, the elementary matrix that accomplishes this is NOT lower triangular. So we cannot apply Lemma lem:multipliermethodtriangularmatrices to generate a lower triangular $L$.
If the elementary matrices used to reduce our matrix to row-echelon form are all lower triangular, then we can find an $LU$ factorization. But what about the general case?
Suppose an $m \times n$ matrix $A$ is transformed to a row-echelon matrix $U$ using Gaussian elimination. Let $P_{1}, P_{2}, \ldots , P_{s}$ be the elementary matrices corresponding (in order) to the row interchanges used, and write $P=P_{s}\cdots P_{2}P_{1}$. (If no interchanges are used, take $P=I$.) Then:
(a)
$PA$ is the matrix obtained from $A$ by doing these interchanges (in order).
(b)
$PA$ has an $LU$ factorization.
For a proof of the above theorem, the reader is referred to [Nicholson].
The matrix $P$ in the above theorem is called a permutation matrix. These matrices
have other important applications, as we will see later.
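In computational practice, library routines carry out the factorization with the row interchanges handled automatically. As a brief sketch, SciPy's scipy.linalg.lu returns matrices $P$, $L$, $U$ with $A = PLU$, so that $P^{T}A = LU$ and $P^{T}$ plays the role of the permutation matrix in the theorem above; the matrix A below is a hypothetical example chosen so that a row swap is required.
\begin{verbatim}
import numpy as np
from scipy.linalg import lu

# Hypothetical matrix: the zero in the top left corner forces a row swap.
A = np.array([[0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0],
              [3.0, 0.0, 1.0]])

P, L, U = lu(A)                     # factors A as A = P @ L @ U
print(np.allclose(A, P @ L @ U))    # True
print(np.allclose(P.T @ A, L @ U))  # P.T @ A has an LU factorization
\end{verbatim}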
Practice Problems
Use the given $LU$ factorization of the coefficient matrix to solve the system of
equations.
Observe that an $LU$ factorization of the coefficient matrix is
Find the $LU$ factorization of the coefficient matrix and use it to solve the system of
equations.
Is there only one $LU$ factorization for a given matrix?
Consider the equation
Look for all possible $LU$ factorizations.
Can you show that every permutation matrix is invertible? If so, what does the inverse of a permutation matrix look like? (Recall that a permutation matrix is the matrix $P$ of Theorem th:LUPA.)