Diagonalizable Matrices and Multiplicity
Recall that a diagonal matrix is a matrix containing a zero in every entry except
those on the main diagonal. More precisely, if $d_{ij}$ is the $ij$th entry of a diagonal matrix $D$,
then $d_{ij}=0$ unless $i=j$. Such matrices look like the following. \begin{equation*} D = \begin{bmatrix} * & & 0 \\ & \ddots & \\ 0 & & * \end{bmatrix} \end{equation*}
where $*$ is a number which might not be zero.
Diagonal matrices have some nice properties, as we demonstrate below.
Let $M=\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$ and let $D=\begin{bmatrix} 2 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & 10 \end{bmatrix}$. Compute $MD$ and $DM$ using Octave.
% Define M
M=[1 2 3; 4 5 6; 7 8 9]
% The function diag(vector) creates a square matrix with diagonal
% vector and the remaining entries zero.
D=diag([2 -5 10])
% The function diag() can also be used to return the diagonal of a matrix.
diag_M=diag(M)
% Understanding multiplication by a diagonal matrix
product_MD=M*D
product_DM=D*M
Notice the patterns present in the product matrices.
Each row of $DM$ is the same as its corresponding row of $M$ multiplied by the
scalar which is the corresponding diagonal element of $D$.
In the product $MD$, it is the columns of $M$ that have been multiplied by the
diagonal elements of $D$.
Raising $D$ to the $k$th power amounts to raising each diagonal element to the $k$th
power.
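As a quick numerical check of the last pattern, the following Octave snippet (a sketch of our own, reusing the matrix $D$ defined above and choosing the exponent 5 arbitrarily) compares $D^5$ with the diagonal matrix built from the fifth powers of the diagonal entries.
% Diagonal matrix from the exploration above
D=diag([2 -5 10])
% Raising D to the 5th power
D_to_the_5=D^5
% The same matrix obtained by raising each diagonal entry to the 5th power
diag([2^5 (-5)^5 10^5])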
These patterns hold in general for any diagonal matrix, and they are fundamental
to understanding diagonalization and its uses. We discuss diagonalization
next.
Let $A$ be an $n \times n$ matrix. Then $A$ is said to be diagonalizable if there exists an invertible
matrix $P$ such that \begin{equation*} P^{-1}AP=D \end{equation*}
where $D$ is a diagonal matrix. In other words, a matrix $A$ is diagonalizable if it is similar
to a diagonal matrix, $D$.
If we are given a matrix $A$ that is diagonalizable, then we can write $P^{-1}AP=D$ for some matrix $P$,
or, equivalently, \begin{equation} \label{eq:understand_diag} AP=PD \end{equation}
If we pause to examine Equation (eq:understand_diag), the work that we did in Exploration init:multiplydiag can help us
to understand how to find $P$ that will diagonalize $A$. The product $PD$ is formed by
multiplying each column of $P$ by a scalar which is the corresponding element on the
diagonal of $D$. To restate this, if $\vec{x}_i$ is column $i$ in our matrix $P$, then Equation (eq:understand_diag) tells us
that \begin{equation} \label{eq:ev_ew_diag} A \vec{x}_i = \lambda _i \vec{x}_i, \end{equation}
where $\lambda_i$ is the $i$th diagonal element of $D$.
Of course, Equation (eq:ev_ew_diag) is very familiar! We see that if we are able to diagonalize a
matrix $A$, the columns of matrix $P$ will be the eigenvectors of $A$, and the corresponding
diagonal entries of $D$ will be the corresponding eigenvalues of $A$. This is summed up in
the following theorem.
An $n \times n$ matrix $A$ is diagonalizable if and only if there is an invertible matrix $P$ given
by \begin{equation*} P=\begin{bmatrix} | & | & & | \\ \vec{x}_1 & \vec{x}_2 & \cdots & \vec{x}_n \\ | & | & & | \end{bmatrix} \end{equation*}
where the columns $\vec{x}_i$ are eigenvectors of $A$.
Moreover, if $A$ is diagonalizable, the corresponding eigenvalues of $A$ are the diagonal
entries of the diagonal matrix $D$.
Proof
Suppose $P$ is given as above as an invertible matrix whose columns are
eigenvectors of $A$. To show that $A$ is diagonalizable, we will show $AP=PD$,
which is equivalent to $P^{-1}AP=D$. We have \begin{equation*} AP=\begin{bmatrix} | & | & & | \\ A\vec{x}_1 & A\vec{x}_2 & \cdots & A\vec{x}_n \\ | & | & & | \end{bmatrix} \end{equation*}
while \begin{equation*} PD=\begin{bmatrix} | & | & & | \\ \vec{x}_1 & \vec{x}_2 & \cdots & \vec{x}_n \\ | & | & & | \end{bmatrix} \begin{bmatrix} \lambda _{1} & & 0 \\ & \ddots & \\ 0 & & \lambda _{n} \end{bmatrix}=\begin{bmatrix} | & | & & | \\ \lambda _{1}\vec{x}_1 & \lambda _{2}\vec{x}_2 & \cdots & \lambda _{n}\vec{x}_n \\ | & | & & | \end{bmatrix} \end{equation*}
We can complete this half of the proof by comparing columns, and noting that
\begin{equation} A \vec{x}_i = \lambda _i \vec{x}_i \end{equation}
for $i=1,\dots ,n$, since the $\vec{x}_i$ are eigenvectors of $A$ and the $\lambda_i$ are corresponding eigenvalues of
$A$.
Conversely, suppose $A$ is diagonalizable so that $P^{-1}AP=D$. Let \begin{equation*} P=\begin{bmatrix} | & | & & | \\ \vec{x}_1 & \vec{x}_2 & \cdots & \vec{x}_n \\ | & | & & | \end{bmatrix} \end{equation*}
where the columns are the vectors $\vec{x}_1, \vec{x}_2, \dots , \vec{x}_n$ and \begin{equation*} D=\begin{bmatrix} \lambda _{1} & & 0 \\ & \ddots & \\ 0 & & \lambda _{n} \end{bmatrix} \end{equation*}
Then \begin{equation*} AP=PD=\begin{bmatrix} | & | & & | \\ \vec{x}_1 & \vec{x}_2 & \cdots & \vec{x}_n \\ | & | & & | \end{bmatrix} \begin{bmatrix} \lambda _{1} & & 0 \\ & \ddots & \\ 0 & & \lambda _{n} \end{bmatrix} \end{equation*}
and so \begin{equation*} \begin{bmatrix} | & | & & | \\ A\vec{x}_1 & A\vec{x}_2 & \cdots & A\vec{x}_n \\ | & | & & | \end{bmatrix} =\begin{bmatrix} | & | & & | \\ \lambda _{1}\vec{x}_1 & \lambda _{2}\vec{x}_2 & \cdots & \lambda _{n}\vec{x}_n \\ | & | & & | \end{bmatrix} \end{equation*}
showing that the $\vec{x}_i$ are eigenvectors of $A$ and the $\lambda_i$ are the corresponding eigenvalues.
Notice that because the matrix $P$ defined above is invertible, it follows that the set of
eigenvectors of $A$, $\{\vec{x}_1, \vec{x}_2, \dots , \vec{x}_n\}$, is a basis of $\mathbb{R}^n$.
We demonstrate the concept given in the above theorem in the next example. Note
that not only are the columns of the matrix $P$ formed by eigenvectors, but $P$ must
be invertible, and therefore must consist of a linearly independent set of
eigenvectors.
Let \begin{equation*} A=\begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \end{equation*}
Find an invertible matrix $P$ and a diagonal matrix $D$ such that $P^{-1}AP=D$.
We will use eigenvectors of $A$ as the columns of $P$, and the corresponding eigenvalues of $A$
as the diagonal entries of $D$. The eigenvalues of $A$ are $\lambda _1=2$, $\lambda _2=2$, and $\lambda _3=6$. We leave these
computations as exercises, as well as the computations to find a basis for each
eigenspace.
One possible basis for the eigenspace corresponding to $\lambda =2$ is $\left \{ \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \right \}$, while a basis for the eigenspace corresponding to $\lambda =6$ is
given by $\left \{ \begin{bmatrix} 0 \\ 1 \\ -2 \end{bmatrix} \right \}$.
We construct the matrix $P$ by using these basis elements as columns. \begin{equation*} P=\begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} \end{equation*}
You can verify (See Practice Problem ) that \begin{equation*} P^{-1}=\begin{bmatrix} -1/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & -1/4 \end{bmatrix} \end{equation*}
Thus, \begin{equation*} P^{-1}AP=\begin{bmatrix} -1/4 & 1/2 & 1/4 \\ 1/2 & 1 & 1/2 \\ 1/4 & 1/2 & -1/4 \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix} \begin{bmatrix} -2 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & -2 \end{bmatrix} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix} \end{equation*}
You can see that the result here is a diagonal matrix $D$ where the entries on the
main diagonal are the eigenvalues of $A$. Notice that the eigenvalues on the main
diagonal must be in the same order as the corresponding eigenvectors in
$P$.
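This verification can also be carried out in Octave; the snippet below is a sketch of our own, with variable names chosen to mirror the example.
% Matrices from the example
A=[2 0 0; 1 4 -1; -2 -4 4]
P=[-2 1 0; 1 0 1; 0 1 -2]
% P^{-1}AP should be the diagonal matrix with diagonal entries 2, 2, 6
D=inv(P)*A*P
% Octave can also find eigenvectors and eigenvalues directly; the columns of V
% may be scaled and ordered differently than our P
[V,E]=eig(A)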
We will continue to use the matrices from Example ex:diagonalizematrix to illustrate why diagonalizable
matrices are easier to work with. Let $A=\begin{bmatrix} 2 & 0 & 0 \\ 1 & 4 & -1 \\ -2 & -4 & 4 \end{bmatrix}$ and let $D=\begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 6 \end{bmatrix}$. Would it be easier to compute $A^5$ or $D^5$ if
you had to do so by hand, without a computer? Certainly $D^5$ is easier, due to the
number of zero entries!
Raising a diagonal matrix to a power amounts to simply raising each entry to that
same power, whereas computing $A^5$ requires many more calculations. However, we
learned in Example ex:diagonalizematrix that $A$ is similar to $D$, with $A=PDP^{-1}$. We will use this to make our computation
easier as follows \begin{align*} A^5&=\left (PDP^{-1}\right )^5 \\ &=(PDP^{-1})(PDP^{-1})(PDP^{-1})(PDP^{-1})(PDP^{-1}) \\ &=PD(P^{-1}P)D(P^{-1}P)D(P^{-1}P)D(P^{-1}P)DP^{-1} \\ &=PD(I)D(I)D(I)D(I)DP^{-1} \\ &=PD^5P^{-1} \end{align*}
With this in mind, it is not as daunting to calculate $A^5=PD^5P^{-1}$ by hand.
The savings in work would certainly be more pronounced for larger matrices or for
powers larger than 5.
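The following Octave sketch (our own, reusing $A$, $P$, and $D$ from Example ex:diagonalizematrix) confirms that computing $PD^5P^{-1}$ gives the same result as computing $A^5$ directly.
% Computing A^5 directly and via the diagonalization A = P D P^{-1}
A=[2 0 0; 1 4 -1; -2 -4 4]
P=[-2 1 0; 1 0 1; 0 1 -2]
D=diag([2 2 6])
A_to_the_5_direct=A^5
A_to_the_5_via_D=P*D^5*inv(P)
% The two results agree, up to rounding error from inv(P)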
In Exploration exp:motivate_diagonalization, because matrix $A$ was diagonalizable, we were able to cut
down on computations. When we chose to work with $D$ and $P$ instead of $A$ we
worked with the eigenvalues and eigenvectors of $A$. Each column of $P$ is an
eigenvector of $A$, and so we repeatedly made use of the following theorem (with
$m=5$).
Let $A$ be an $n \times n$ matrix and suppose $A\vec{x}=\lambda \vec{x}$. Then \begin{equation*} A^m\vec{x}=\lambda ^m\vec{x} \end{equation*} for every positive integer $m$.
Proof
We prove this theorem by induction on $m$. Clearly $A^m \vec{x} = \lambda ^m \vec{x}$ holds when $m=1$,
as that was given. For the inductive step, suppose that we know $A^{m-1} \vec{x} = \lambda ^{m-1} \vec{x}$. Then \begin{align*} A^m \vec{x} &= (A A^{m-1}) \vec{x} \\ &= A (A^{m-1} \vec{x}) \\ &= A (\lambda ^{m-1} \vec{x}) \\ &= \lambda ^{m-1} A\vec{x} \\ &= \lambda ^{m-1} \lambda \vec{x} \\ &= \lambda ^m \vec{x} \end{align*}
as desired.
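As a concrete illustration of this theorem, the short Octave sketch below (our own check) uses the matrix $A$ and the $\lambda =2$ eigenvector $\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}$ from Example ex:diagonalizematrix, with $m=3$ chosen arbitrarily.
% A and one of its eigenvectors, with eigenvalue 2
A=[2 0 0; 1 4 -1; -2 -4 4]
x=[-2; 1; 0]
lambda=2
% A^3*x and lambda^3*x should be the same vector
A^3*x
lambda^3*x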
Matrix $A$ from Example ex:diagonalizematrix and Exploration exp:motivate_diagonalization had a repeated eigenvalue of 2. The
next theorem and corollary show that matrices which have distinct eigenvalues
(where none are repeated) have desirable properties.
Let $A$ be an $n \times n$ matrix, and suppose that $A$ has distinct eigenvalues $\lambda _1, \lambda _2, \dots , \lambda _m$. For each $i$, let $\vec{x}_i$ be a
$\lambda _i$-eigenvector of $A$. Then $\{\vec{x}_1, \vec{x}_2, \dots , \vec{x}_m\}$ is linearly independent.
Proof
We prove this by induction on $m$, the number of vectors in the set. If $m=1$,
then $\{\vec{x}_1\}$ is a linearly independent set because $\vec{x}_1 \neq \vec{0}$. In general, suppose we have
established that the theorem is true for some $m \geq 1$. Given eigenvectors $\vec{x}_1, \vec{x}_2, \dots , \vec{x}_{m+1}$, suppose
\begin{equation} \label{eq:thm_proof_5_5_4_1} c_1\vec{x}_1 + c_2\vec{x}_2 + \dots + c_{m+1}\vec{x}_{m+1} = \vec{0}. \end{equation}
We must show that each $c_i=0$. Multiply both sides of (eq:thm_proof_5_5_4_1) on the left by $A$ and use the fact
that $A\vec{x}_i = \lambda _i\vec{x}_i$ to get \begin{equation} \label{eq:thm_proof_5_5_4_2} c_1\lambda _1\vec{x}_1 + c_2\lambda _2\vec{x}_2 + \dots + c_{m+1}\lambda _{m+1}\vec{x}_{m+1} = \vec{0}. \end{equation}
If we multiply (eq:thm_proof_5_5_4_1) by $\lambda _1$ and subtract the result from (eq:thm_proof_5_5_4_2), the first terms cancel and we
obtain \begin{equation*} c_2(\lambda _2 - \lambda _1)\vec{x}_2 + c_3(\lambda _3 - \lambda _1)\vec{x}_3 + \dots + c_{m+1}(\lambda _{m+1} - \lambda _1)\vec{x}_{m+1} = \vec{0}. \end{equation*}
Since $\vec{x}_2, \vec{x}_3, \dots , \vec{x}_{m+1}$ correspond to distinct eigenvalues $\lambda _2, \lambda _3, \dots , \lambda _{m+1}$, the set $\{\vec{x}_2, \vec{x}_3, \dots , \vec{x}_{m+1}\}$ is linearly independent by the
induction hypothesis. Hence, \begin{equation*} c_2(\lambda _2 - \lambda _1) = 0, \quad c_3(\lambda _3 - \lambda _1) = 0, \quad \dots , \quad c_{m+1}(\lambda _{m+1} - \lambda _1) = 0 \end{equation*}
and so $c_2 = c_3 = \dots = c_{m+1} = 0$ because the $\lambda _i$ are distinct. It follows that (eq:thm_proof_5_5_4_1) becomes $c_1\vec{x}_1 = \vec{0}$, which implies that
$c_1 = 0$ because $\vec{x}_1 \neq \vec{0}$, and the proof is complete.
The corollary that follows from this theorem gives a useful tool in determining if $A$ is
diagonalizable.
Let $A$ be an $n \times n$ matrix and suppose it has $n$ distinct eigenvalues. Then it follows that $A$ is
diagonalizable.
Note that Corollary th:distincteigenvalues is NOT an “if and only if” statement. This means that if $A$ has
repeated eigenvalues it is still sometimes possible to diagonalize $A$, as seen in Example
ex:diagonalizematrix.
If we are able to diagonalize $A$, say $A=PDP^{-1}$, we say that $PDP^{-1}$ is an eigenvalue decomposition of
$A$.
Not every matrix has an eigenvalue decomposition. Sometimes we cannot find an
invertible matrix $P$ such that $P^{-1}AP=D$. Consider the following example.
Let \begin{equation*} A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \end{equation*}
If possible, find an invertible matrix $P$ and a diagonal matrix $D$ so that $P^{-1}AP=D$.
We see immediately (how?) that the eigenvalues of $A$ are $\lambda _1=1$ and $\lambda _2=1$. To find $P$, the next step
would be to find a basis for the corresponding eigenspace. We solve the equation $(I-A)\vec{x}=\vec{0}$.
Writing this equation as an augmented matrix, we already have a matrix in row
echelon form: \begin{equation*} \left [\begin{array}{cc|c} 0 & -1 & 0 \\ 0 & 0 & 0 \end{array}\right ] \end{equation*}
We see that the eigenvectors in the eigenspace are of the form \begin{equation*} \begin{bmatrix} t \\ 0 \end{bmatrix} = t\begin{bmatrix} 1 \\ 0 \end{bmatrix} \end{equation*}
so a basis for the eigenspace is given by $\left \{ \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right \}$. It is easy to see that we cannot form an
invertible matrix $P$, because any two eigenvectors will be of the form $\begin{bmatrix} t \\ 0 \end{bmatrix}$, and so the
second row of $P$ would be a row of zeros, and $P$ could not be invertible. Hence $A$ cannot
be diagonalized.
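Numerically, this failure shows up as an eigenspace that is too small. The Octave sketch below (our own) uses null() to compute a basis for the eigenspace corresponding to $\lambda =1$.
% The matrix from this example
A=[1 1; 0 1]
% A basis for the eigenspace: the null space of A - I
basis=null(A-eye(2))
% null() returns a single column, so the eigenspace is one-dimensional and
% we cannot assemble an invertible matrix P from eigenvectors of A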
We saw earlier in Corollary th:distincteigenvalues that an $n \times n$ matrix with $n$ distinct eigenvalues
is diagonalizable. It turns out that there are other useful diagonalizability
tests.
Recall that the algebraic multiplicity of an eigenvalue $\lambda$ is the number of times that it
occurs as a root of the characteristic polynomial.
The geometric multiplicity of an eigenvalue $\lambda$ is the dimension of the corresponding
eigenspace.
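Both multiplicities can be estimated in Octave. The sketch below (our own, with variable names of our choosing) examines the eigenvalue $\lambda =2$ of the matrix $A$ from Example ex:diagonalizematrix, which has algebraic multiplicity 2.
% Matrix from Example ex:diagonalizematrix
A=[2 0 0; 1 4 -1; -2 -4 4]
% Algebraic multiplicity: count how many computed eigenvalues are (numerically) 2
evals=eig(A)
alg_mult=sum(abs(evals-2)<1e-8)
% Geometric multiplicity: dimension of the eigenspace, i.e. of null(A - 2I)
geo_mult=size(null(A-2*eye(3)),2)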
Consider now the following lemma.
Let $A$ be an $n \times n$ matrix, and let $\mathcal{E}$ be the eigenspace corresponding to the eigenvalue $\lambda _1$, which
has algebraic multiplicity $m$. Then \begin{equation*} \dim (\mathcal{E}) \leq m. \end{equation*}
In other words, the geometric multiplicity of an eigenvalue is less than or equal to the
algebraic multiplicity of that same eigenvalue.
Proof
Let $k$ be the geometric multiplicity of $\lambda _1$, i.e., $k=\dim (\mathcal{E})$. Suppose $\{\vec{x}_1, \vec{x}_2, \dots , \vec{x}_k\}$ is a basis for the
eigenspace $\mathcal{E}$. Let $P$ be any invertible matrix having $\vec{x}_1, \vec{x}_2, \dots , \vec{x}_k$ as its first $k$ columns, say
\begin{equation*} P=\begin{bmatrix} | & & | & | & & | \\ \vec{x}_1 & \cdots & \vec{x}_k & \vec{x}_{k+1} & \cdots & \vec{x}_n \\ | & & | & | & & | \end{bmatrix} \end{equation*}
In block form we may write \begin{equation*} P=\begin{bmatrix} B & C \end{bmatrix} \quad \text{and} \quad P^{-1}=\begin{bmatrix} D \\ E \end{bmatrix} \end{equation*}
where $B$ is $n \times k$, $C$ is $n \times (n-k)$, $D$ is $k \times n$, and $E$ is $(n-k) \times n$. We observe that $AB=\lambda _1 B$, since each
column of $B$ is an eigenvector corresponding to $\lambda _1$, and that $I_n=P^{-1}P=\begin{bmatrix} DB & DC \\ EB & EC \end{bmatrix}$. This implies $DB=I_k$ and $EB=O$.
Therefore, \begin{equation*} P^{-1}AP=\begin{bmatrix} D \\ E \end{bmatrix} A \begin{bmatrix} B&C \end{bmatrix} = \left [\begin{array}{c|c} DAB & DAC \\ \hline EAB & EAC \end{array}\right ] \end{equation*}
\begin{equation*} = \left [\begin{array}{c|c} \lambda _1 DB & DAC \\ \hline \lambda _1 EB & EAC \end{array}\right ] = \left [\begin{array}{c|c} \lambda _1 I_k & DAC \\ \hline O & EAC \end{array}\right ] \end{equation*}
We finish the proof by comparing the characteristic polynomials on both sides of this
equation, and making use of the fact that similar matrices have the same
characteristic polynomials.
We see that the characteristic polynomial of $A$ has $(\lambda - \lambda _1)^k$ as a factor. This tells us that the
algebraic multiplicity of $\lambda _1$ is at least $k$, proving the desired inequality.
This result tells us that if $\lambda$ is an eigenvalue of $A$, then the number of linearly
independent $\lambda$-eigenvectors is never more than the algebraic multiplicity of $\lambda$. We now use this
fact to provide a useful diagonalizability condition.
Let $A$ be an $n \times n$ matrix. Then $A$ is diagonalizable if and only if for each eigenvalue $\lambda$
of $A$, the algebraic multiplicity of $\lambda$ is equal to the geometric multiplicity of
$\lambda$.
Proof
Suppose $A$ is diagonalizable and let $\lambda _1, \lambda _2, \dots , \lambda _t$ be the distinct eigenvalues of $A$, with
algebraic multiplicities $m_1, m_2, \dots , m_t$, respectively, and geometric multiplicities $k_1, k_2, \dots , k_t$, respectively.
Since $A$ is diagonalizable, Theorem th:eigenvectorsanddiagonalizable implies that $\mathbb{R}^n$ has a basis of eigenvectors of $A$, so that $k_1+k_2+\dots +k_t=n$. By applying Lemma lemma:dimeigenspace $t$ times,
we have \begin{equation*} n = k_1+k_2+\dots +k_t \leq m_1+m_2+\dots +m_t \leq n \end{equation*}
(the sum of the algebraic multiplicities cannot exceed $n$, the degree of the characteristic polynomial), which is only possible if $k_i=m_i$ for $i=1,2,\dots ,t$.
Conversely, if the geometric multiplicity equals the algebraic multiplicity of
each eigenvalue, then obtaining a basis for each eigenspace yields $n$ eigenvectors.
Applying Theorem th:linindepeigenvectors, we know that these $n$ eigenvectors are linearly independent,
so Theorem th:eigenvectorsanddiagonalizable implies that $A$ is diagonalizable.
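As an informal numerical check of this criterion, the Octave sketch below (our own helper code) compares the two multiplicities for each distinct eigenvalue of the diagonalizable matrix from Example ex:diagonalizematrix and of the non-diagonalizable matrix $\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ from the example above.
% A1 is diagonalizable, A2 is not
A1=[2 0 0; 1 4 -1; -2 -4 4];
A2=[1 1; 0 1];
for C={A1, A2}
  A=C{1};
  evals=eig(A);
  distinct=unique(round(evals*1e6)/1e6);   % distinct eigenvalues, rounded
  for j=1:length(distinct)
    lam=distinct(j);
    alg=sum(abs(evals-lam)<1e-6);                 % algebraic multiplicity
    geo=size(null(A-lam*eye(size(A))),2);         % geometric multiplicity
    printf("lambda = %g: algebraic %d, geometric %d\n", lam, alg, geo)
  end
end
% For A1 every eigenvalue has equal multiplicities; for A2 the eigenvalue 1
% has algebraic multiplicity 2 but geometric multiplicity only 1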