SVD Decomposition
We begin this section with an important definition: the singular values of an $m\times n$ matrix $A$ are the square roots of the nonzero eigenvalues of $A^TA$.
Singular Value Decomposition (SVD) can be thought of as a generalization of orthogonal diagonalization of a symmetric matrix to an arbitrary matrix. This decomposition is the focus of this section.
The following is a useful result that will help when computing the SVD of matrices.
Let $A$ be an $m\times n$ matrix. Then $A^TA$ and $AA^T$ have the same nonzero eigenvalues.
- Proof
- Suppose $A$ is an $m\times n$ matrix, and suppose that $\lambda$ is a nonzero eigenvalue of $A^TA$. Then
there exists a nonzero vector $\vec{x}$ in $\mathbb{R}^{n}$ such that \begin{equation} \label{nonzero} (A^TA)\vec{x}=\lambda \vec{x} \end{equation}
Multiplying both sides of this equation by $A$ yields: \begin{eqnarray*} A(A^TA)\vec{x} & = & A\lambda \vec{x}\\ (AA^T)(A\vec{x}) & = & \lambda (A\vec{x}) \end{eqnarray*}Since $\lambda \neq 0$ and $\vec{x}\neq \vec{0}$, $\lambda \vec{x}\neq \vec{0}$, and thus by equation (\ref{nonzero}), $(A^TA)\vec{x}\neq \vec{0}$; thus $A^T(A\vec{x})\neq \vec{0}$, implying that $A\vec{x}\neq \vec{0}$.
Therefore $A\vec{x}$ is an eigenvector of $AA^T$ corresponding to eigenvalue $\lambda$. An analogous argument can be used to show that every nonzero eigenvalue of $AA^T$ is an eigenvalue of $A^TA$, thus completing the proof.
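This lemma is easy to check numerically. The following sketch (assuming NumPy is available; the random matrix is purely illustrative) compares the nonzero eigenvalues of $A^TA$ and $AA^T$:
\begin{verbatim}
import numpy as np

# A random 4 x 6 matrix; the lemma holds for any m x n matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))

# A^T A is 6 x 6 and A A^T is 4 x 4, so they have different
# numbers of eigenvalues, but the nonzero ones agree.
eig_AtA = np.linalg.eigvalsh(A.T @ A)
eig_AAt = np.linalg.eigvalsh(A @ A.T)

nonzero_AtA = np.sort(eig_AtA[eig_AtA > 1e-10])
nonzero_AAt = np.sort(eig_AAt[eig_AAt > 1e-10])

print(np.allclose(nonzero_AtA, nonzero_AAt))  # True
\end{verbatim}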
Given an $m\times n$ matrix $A$, we will see how to express $A$ as a product $A=U\Sigma V^T$, where
- $U$ is an $m\times m$ orthogonal matrix whose columns are eigenvectors of $AA^T$.
- $V$ is an $n\times n$ orthogonal matrix whose columns are eigenvectors of $A^TA$.
- $\Sigma$ is an $m\times n$ matrix whose only nonzero values lie on its main diagonal, and are the singular values of $A$.
How can we find such a decomposition? We are aiming to decompose $A$ in the following form: \begin{equation*} A=U\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ] V^T \end{equation*} where $\sigma$ is a block matrix of the form \begin{equation*} \sigma =\left [ \begin{array}{ccc} \sigma _{1} & & 0 \\ & \ddots & \\ 0 & & \sigma _{k} \end{array} \right ] \end{equation*} Thus $A^T=V\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ] U^T$ and it follows that \begin{equation*} A^TA=V\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ] U^TU\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ] V^T=V\left [ \begin{array}{cc} \sigma ^{2} & 0 \\ 0 & 0 \end{array} \right ] V^T \end{equation*} and so $A^TAV=V\left [ \begin{array}{cc} \sigma ^{2} & 0 \\ 0 & 0 \end{array} \right ]$. Similarly, $AA^TU=U\left [ \begin{array}{cc} \sigma ^{2} & 0 \\ 0 & 0 \end{array} \right ]$. Therefore, you would find an orthonormal basis of eigenvectors for $A^TA$ and make them the columns of a matrix, ordered so that the corresponding eigenvalues are decreasing. This gives $V$. You could then do the same for $AA^T$ to get $U$.
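As a numerical illustration of these relations (using NumPy; the matrix below is arbitrary and not one of the examples that follow), the columns of $V$ and $U$ produced by a standard SVD routine do diagonalize $A^TA$ and $AA^T$, with the squared singular values on the diagonal:
\begin{verbatim}
import numpy as np

A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

# numpy returns A = U @ diag(s) @ Vt with s in decreasing order.
U, s, Vt = np.linalg.svd(A)
V = Vt.T

# V^T (A^T A) V and U^T (A A^T) U are diagonal, with the squared
# singular values (padded by zeros) on the diagonal.
print(np.round(V.T @ (A.T @ A) @ V, 8))
print(np.round(U.T @ (A @ A.T) @ U, 8))
print(s**2)
\end{verbatim}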
We formalize this discussion in the following theorem.
Let $A$ be an $m\times n$ matrix. Then there exist an $m\times m$ orthogonal matrix $U$, an $n\times n$ orthogonal matrix $V$, and an $m\times n$ matrix $\Sigma =\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ]$, where $\sigma =\left [ \begin{array}{ccc} \sigma _{1} & & 0 \\ & \ddots & \\ 0 & & \sigma _{k} \end{array} \right ]$ and $\sigma _{1}\geq \sigma _{2}\geq \cdots \geq \sigma _{k}>0$ are the singular values of $A$, such that $A=U\Sigma V^T$.
- Proof
- There exists an orthonormal basis $\{\vec{v}_{1},\ldots ,\vec{v}_{n}\}$ of $\mathbb{R}^{n}$ such that $A^TA\vec{v}_{i}=\sigma _{i}^{2}\vec{v}_{i}$, where $\sigma _{i}^{2}>0$ for $i=1,\ldots ,k$ and $\sigma _{i}$ equals zero if $i>k$.
Thus $A\vec{v}_{i}=\vec{0}$ for $i>k$ because \begin{equation*} A\vec{v}_{i}\dotp A\vec{v}_{i} = (A\vec{v}_i)^T(A\vec{v}_i)=\vec{v}_i^T(A^TA\vec{v}_i)=\vec{v}_i^T\vec{0}=0 \end{equation*}
For $i=1,\ldots ,k$, define $\vec{u}_{i}\in \mathbb{R}^{m}$ by \begin{equation*} \vec{u}_{i}= \sigma _{i}^{-1}A\vec{v}_{i} \end{equation*}
Thus $A\vec{v}_{i}=\sigma _{i}\vec{u}_{i}$. Now \begin{eqnarray*} \vec{u}_{i} \dotp \vec{u}_{j} &=& \sigma _{i}^{-1}A \vec{v}_{i} \dotp \sigma _{j}^{-1}A\vec{v}_{j} = \sigma _{i}^{-1}\vec{v}_{i}^T \sigma _{j}^{-1}A^TA\vec{v}_{j} \\ &=& \sigma _{i}^{-1}\vec{v}_{i} \dotp \sigma _{j}^{-1}\sigma _{j}^{2} \vec{v}_{j} = \frac{\sigma _{j}}{\sigma _{i}}\left ( \vec{v}_{i} \dotp \vec{v}_{j}\right ) \end{eqnarray*}This means that $\vec{u}_{i}\dotp \vec{u}_{j}=0$ when $i\neq j$ and $\vec{u}_{i}\dotp \vec{u}_{j}=1$ when $i=j$. Thus $\{\vec{u}_{1},\ldots ,\vec{u}_{k}\}$ is an orthonormal set of vectors in $\mathbb{R}^{m}$. Also, \begin{equation*} AA^T\vec{u}_{i}=AA^T\sigma _{i}^{-1}A\vec{v}_{i}=\sigma _{i}^{-1}AA^TA\vec{v}_{i}=\sigma _{i}^{-1}A\sigma _{i}^{2}\vec{v} _{i}=\sigma _{i}^{2}\vec{u}_{i} \end{equation*} Now, using Gram-Schmidt, extend $\{\vec{u}_{1},\ldots ,\vec{u}_{k}\}$ to an orthonormal basis $\{\vec{u}_{1},\ldots ,\vec{u}_{m}\}$ for all of $\mathbb{R}^{m}$ and let \begin{equation*} U= \left [ \begin{array}{ccc} \vec{u}_{1} & \cdots & \vec{u}_{m} \end{array} \right ] \end{equation*} while $V=\left [ \begin{array}{ccc} \vec{v}_{1} & \cdots & \vec{v}_{n} \end{array} \right ]$. Thus $U$ is the matrix which has the $\vec{u}_{i}$ as columns and $V$ is defined as the matrix which has the $\vec{v}_{i}$ as columns. Then \begin{equation*} U^TAV=\left [ \begin{array}{c} \vec{u}_{1}^T \\ \vdots \\ \vec{u}_{k}^T \\ \vdots \\ \vec{u}_{m}^T \end{array} \right ] A \left [ \begin{array}{ccc} \vec{v}_{1} & \cdots & \vec{v}_{n}\end{array}\right ] \end{equation*} \begin{equation*} =\left [ \begin{array}{c} \vec{u}_{1}^T \\ \vdots \\ \vec{u}_{k}^T \\ \vdots \\ \vec{u}_{m}^T \end{array} \right ] \left [ \begin{array}{cccccc} \sigma _{1}\vec{u}_{1} & \cdots & \sigma _{k}\vec{u}_{k} & \vec{0} & \cdots & \vec{0} \end{array} \right ] =\left [ \begin{array}{cc} \sigma & 0 \\ 0 & 0 \end{array} \right ] \end{equation*} where $\sigma$ is given in the statement of the theorem.
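The two key facts established in this proof, that the vectors $\vec{u}_i = \sigma_i^{-1}A\vec{v}_i$ form an orthonormal set and are eigenvectors of $AA^T$ with eigenvalue $\sigma_i^2$, can be confirmed numerically. A minimal NumPy sketch (the random matrix is illustrative only):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 5))

# Orthonormal eigenvectors v_i of A^T A with nonzero eigenvalues sigma_i^2.
lam, V = np.linalg.eigh(A.T @ A)
keep = lam > 1e-10
lam, V = lam[keep], V[:, keep]
sigma = np.sqrt(lam)

# u_i = sigma_i^{-1} A v_i, as in the proof.
U = A @ V / sigma

# {u_1, ..., u_k} is an orthonormal set ...
print(np.allclose(U.T @ U, np.eye(U.shape[1])))      # True
# ... and each u_i is an eigenvector of A A^T with eigenvalue sigma_i^2.
print(np.allclose((A @ A.T) @ U, U * sigma**2))      # True
\end{verbatim}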
The SVD has as an immediate corollary the following interesting result.
Let’s compute the SVD of a simple matrix.
Since $A^TA$ is $3\times 3$ while $AA^T$ is only $2\times 2$, and $A^TA$ and $AA^T$ have the same nonzero eigenvalues (by the lemma above), we compute the characteristic polynomial of $AA^T$ (because it is easier to compute than that of $A^TA$).
The eigenvalues of $A^TA$ are $\lambda _{1}$, $\lambda _{2}$, and $\lambda _{3}=0$, and the singular values of $A$ are $\sigma _{1}=\sqrt{\lambda _{1}}$ and $\sigma _{2}=\sqrt{\lambda _{2}}$. By convention, we list the eigenvalues (and corresponding singular values) in non-increasing order (i.e., from largest to smallest).
To find the matrix $V$:
To construct the matrix $V$ we need to find eigenvectors for $A^TA$. Since the eigenvalues of $A^TA$ are distinct, the corresponding eigenvectors are orthogonal, and we need only normalize them.
$\lambda _{1}$: solve $(\lambda _{1}I-A^TA)\vec{x}=\vec{0}$.
$\lambda _{2}$: solve $(\lambda _{2}I-A^TA)\vec{x}=\vec{0}$.
$\lambda _{3}$: solve $(\lambda _{3}I-A^TA)\vec{x}=\vec{0}$. Let $\vec{v}_{1}$, $\vec{v}_{2}$, and $\vec{v}_{3}$ denote the resulting normalized eigenvectors. Then $V=\left [ \begin{array}{ccc} \vec{v}_{1} & \vec{v}_{2} & \vec{v}_{3} \end{array} \right ]$ is orthogonal. Also, $\Sigma =\left [ \begin{array}{ccc} \sigma _{1} & 0 & 0 \\ 0 & \sigma _{2} & 0 \end{array} \right ]$, and we use $A$, $V$, and $\Sigma$ to find $U$. Since $V$ is orthogonal and $A=U\Sigma V^T$, it follows that $AV=U\Sigma$. Let $V=\left [ \begin{array}{ccc} \vec{v}_{1} & \vec{v}_{2} & \vec{v}_{3} \end{array} \right ]$, and let $U=\left [ \begin{array}{cc} \vec{u}_{1} & \vec{u}_{2} \end{array} \right ]$, where $\vec{u}_{1}$ and $\vec{u}_{2}$ are the two columns of $U$. Then we have \begin{equation*} A\left [ \begin{array}{ccc} \vec{v}_{1} & \vec{v}_{2} & \vec{v}_{3} \end{array} \right ] =\left [ \begin{array}{cc} \vec{u}_{1} & \vec{u}_{2} \end{array} \right ] \left [ \begin{array}{ccc} \sigma _{1} & 0 & 0 \\ 0 & \sigma _{2} & 0 \end{array} \right ] \end{equation*} that is, $A\vec{v}_{1}=\sigma _{1}\vec{u}_{1}$, $A\vec{v}_{2}=\sigma _{2}\vec{u}_{2}$, and $A\vec{v}_{3}=\vec{0}$, so that $\vec{u}_{1}=\sigma _{1}^{-1}A\vec{v}_{1}$ and $\vec{u}_{2}=\sigma _{2}^{-1}A\vec{v}_{2}$.
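Because the particular matrix used in this example is not reproduced above, here is a sketch of the same procedure in NumPy for an illustrative $2\times 3$ matrix (the matrix is my own choice, not taken from the text): diagonalize $A^TA$ to get $V$ and the singular values, then use $AV=U\Sigma$ to get $U$.
\begin{verbatim}
import numpy as np

# An illustrative 2 x 3 matrix (not necessarily the one from the example).
A = np.array([[1.0, 0.0, 1.0],
              [-1.0, 1.0, 0.0]])

# Eigenvalues/eigenvectors of A^T A, sorted so the eigenvalues decrease.
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]
lam, V = lam[order], V[:, order]

sigma = np.sqrt(lam[lam > 1e-10])          # singular values
Sigma = np.zeros(A.shape)
Sigma[:len(sigma), :len(sigma)] = np.diag(sigma)

# AV = U Sigma, so the columns of U are u_i = sigma_i^{-1} A v_i.
U = A @ V[:, :len(sigma)] / sigma

print(np.allclose(U @ Sigma @ V.T, A))     # True
\end{verbatim}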
Here is another example.
Suppose now that $A$ is a $1\times 3$ matrix. Then $AA^T$ is $1\times 1$, so it has a single eigenvalue $\lambda _{1}$, and the eigenvalues of $A^TA$ are $\lambda _{1}$, $0$, and $0$. Furthermore, $A$ has only one singular value, $\sigma _{1}=\sqrt{\lambda _{1}}$.
To find the matrix $U$: To do so we find an eigenvector for $AA^T$ and normalize it. In this case, finding a unit eigenvector is trivial: since $AA^T$ is $1\times 1$, we may take $\vec{u}_{1}=\left [ 1 \right ]$, and so $U=\left [ 1 \right ]$. Also, $\Sigma =\left [ \begin{array}{ccc} \sigma _{1} & 0 & 0 \end{array} \right ]$, and we use $A$, $U$, and $\Sigma$ to find $V$.
Now $A=U\Sigma V^T$, with $\Sigma =\left [ \begin{array}{ccc} \sigma _{1} & 0 & 0 \end{array} \right ]$ and $V=\left [ \begin{array}{ccc} \vec{v}_{1} & \vec{v}_{2} & \vec{v}_{3} \end{array} \right ]$, where $\vec{v}_{1}$, $\vec{v}_{2}$, and $\vec{v}_{3}$ are the columns of $V$. Thus $A=\Sigma V^T=\sigma _{1}\vec{v}_{1}^T$, so that $\vec{v}_{1}=\sigma _{1}^{-1}A^T$.
The vectors $\vec{v}_{2}$ and $\vec{v}_{3}$ are eigenvectors of $A^TA$ corresponding to the eigenvalue $0$. Instead of solving the system $(A^TA)\vec{x}=\vec{0}$ and then using the Gram-Schmidt process on the resulting set of two basic eigenvectors, the following approach may be used (a numerical sketch of this approach appears after the example).
Find vectors $\vec{v}_{2}$ and $\vec{v}_{3}$ by first extending $\{\vec{v}_{1}\}$ to a basis of $\mathbb{R}^{3}$, then using the Gram-Schmidt algorithm to orthogonalize the basis, and finally normalizing the vectors.
Starting with a convenient scalar multiple of $\vec{v}_{1}$ instead of $\vec{v}_{1}$ itself makes the arithmetic a bit easier. It is easy to verify that the extended set of three vectors is a basis of $\mathbb{R}^{3}$. We apply the Gram-Schmidt algorithm to this basis and normalize the results; the first vector is again $\vec{v}_{1}$, and the other two are the desired $\vec{v}_{2}$ and $\vec{v}_{3}$. Finally, \begin{equation*} V=\left [ \begin{array}{ccc} \vec{v}_{1} & \vec{v}_{2} & \vec{v}_{3} \end{array} \right ] \end{equation*} and $A=U\Sigma V^T$ is a singular value decomposition of $A$.
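The extension idea used above can be sketched in code as follows (NumPy; the $1\times 3$ matrix and the gram_schmidt helper are illustrative assumptions, not the data of this example): extend $\{\vec{v}_1\}$ by standard basis vectors, orthonormalize with Gram-Schmidt, and assemble $V$.
\begin{verbatim}
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize a list of vectors, dropping near-zero remainders."""
    basis = []
    for v in vectors:
        w = v - sum(np.dot(v, b) * b for b in basis)
        if np.linalg.norm(w) > tol:
            basis.append(w / np.linalg.norm(w))
    return basis

# An illustrative 1 x 3 matrix (not necessarily the one from the example).
A = np.array([[2.0, -1.0, 2.0]])

sigma1 = np.linalg.norm(A)                 # the only singular value
v1 = (A.T / sigma1).flatten()              # v_1 = sigma_1^{-1} A^T

# Extend {v_1} by the standard basis, then orthonormalize.
candidates = [v1,
              np.array([1.0, 0.0, 0.0]),
              np.array([0.0, 1.0, 0.0]),
              np.array([0.0, 0.0, 1.0])]
v1, v2, v3 = gram_schmidt(candidates)

V = np.column_stack([v1, v2, v3])
U = np.array([[1.0]])
Sigma = np.array([[sigma1, 0.0, 0.0]])

print(np.allclose(U @ Sigma @ V.T, A))     # True
\end{verbatim}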
Consider another example.
Let \begin{equation*} A=\left [ \begin{array}{ccc} \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \end{array} \right ] \end{equation*} Then \begin{equation*} A^TA=\frac{1}{5}\left [ \begin{array}{ccc} 16 & 32 & 0 \\ 32 & 64 & 0 \\ 0 & 0 & 0 \end{array} \right ] \end{equation*} which has eigenvalues $16$, $0$, and $0$, with eigenspaces \begin{equation*} \mathcal{S}_0=\mbox{span}\left ( \left [ \begin{array}{c} 0 \\ 0 \\ 1 \end{array} \right ] ,\left [ \begin{array}{c} -\frac{2}{5}\sqrt{5} \\ \frac{1}{5}\sqrt{5} \\ 0 \end{array} \right ] \right ), \quad \mathcal{S}_{16}=\mbox{span}\left ( \left [ \begin{array}{c} \frac{1}{5}\sqrt{5} \\ \frac{2}{5}\sqrt{5} \\ 0 \end{array} \right ] \right ) \end{equation*} Thus the matrix $V$ is given by \begin{equation*} V=\left [ \begin{array}{ccc} \frac{1}{5}\sqrt{5} & -\frac{2}{5}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{5} & \frac{1}{5}\sqrt{5} & 0 \\ 0 & 0 & 1 \end{array} \right ] \end{equation*} Next consider \begin{equation*} AA^T=\left [ \begin{array}{cc} 8 & 8 \\ 8 & 8 \end{array} \right ] \end{equation*} Its eigenvalues are $16$ and $0$, and its eigenspaces are
\begin{equation*} \mathcal{S}_0=\mbox{span}\left (\left [ \begin{array}{c} -\frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} \end{array} \right ] \right ),\quad \mathcal{S}_{16}=\mbox{span}\left ( \left [ \begin{array}{c} \frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} \end{array} \right ] \right ) \end{equation*} Thus you can let $U$ be given by \begin{equation*} U=\left [ \begin{array}{cc} \frac{1}{2}\sqrt{2} & -\frac{1}{2}\sqrt{2} \\ \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{array} \right ] \end{equation*} Let's check this: $U^TAV$ should equal $\Sigma$. \begin{equation*} \left [ \begin{array}{cc} \frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \\ -\frac{1}{2}\sqrt{2} & \frac{1}{2}\sqrt{2} \end{array} \right ] \left [ \begin{array}{ccc} \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{2}\sqrt{5} & \frac{4}{5}\sqrt{2}\sqrt{5} & 0 \end{array} \right ] \left [ \begin{array}{ccc} \frac{1}{5}\sqrt{5} & -\frac{2}{5}\sqrt{5} & 0 \\ \frac{2}{5}\sqrt{5} & \frac{1}{5}\sqrt{5} & 0 \\ 0 & 0 & 1 \end{array} \right ] \end{equation*} \begin{equation*} =\left [ \begin{array}{ccc} 4 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right ] \end{equation*}
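This check can also be done numerically; the following NumPy snippet rebuilds the matrices above and verifies both $U^TAV=\Sigma$ and $A=U\Sigma V^T$.
\begin{verbatim}
import numpy as np

s2, s5 = np.sqrt(2), np.sqrt(5)

A = np.array([[2 * s2 * s5 / 5, 4 * s2 * s5 / 5, 0.0],
              [2 * s2 * s5 / 5, 4 * s2 * s5 / 5, 0.0]])
V = np.array([[s5 / 5, -2 * s5 / 5, 0.0],
              [2 * s5 / 5, s5 / 5, 0.0],
              [0.0, 0.0, 1.0]])
U = np.array([[s2 / 2, -s2 / 2],
              [s2 / 2, s2 / 2]])
Sigma = np.array([[4.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0]])

print(np.allclose(U.T @ A @ V, Sigma))     # True
print(np.allclose(U @ Sigma @ V.T, A))     # True
\end{verbatim}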
This illustrates that if you have a good way to find the eigenvectors and eigenvalues for a Hermitian matrix which has nonnegative eigenvalues, then you also have a good way to find the SVD of an arbitrary matrix.
Practice Problems
Text Source
This section was adapted from Section 8.11 of Keith Nicholson’s Linear Algebra with Applications. (CC-BY-NC-SA)
W. Keith Nicholson, Linear Algebra with Applications, Lyryx 2018, Open Edition.