QR Factorization

One of the main virtues of orthogonal matrices is that they can be easily inverted—the transpose is the inverse. This fact, combined with the factorization theorem in this section, provides a useful way to simplify many matrix calculations (for example, in least squares approximation).

The importance of the factorization lies in the fact that there are computer algorithms that accomplish it with good control over round-off error, making it particularly useful in matrix calculations. The factorization is a matrix version of the Gram-Schmidt process.

Suppose $A$ is an $m \times n$ matrix with linearly independent columns $\vec{c}_{1}, \vec{c}_{2}, \dots, \vec{c}_{n}$. The Gram-Schmidt algorithm can be applied to these columns to provide orthogonal columns $\vec{f}_{1}, \vec{f}_{2}, \dots, \vec{f}_{n}$ where $\vec{f}_{1} = \vec{c}_{1}$ and \begin{equation*} \vec{f}_{k} = \vec{c}_{k} - \frac{\vec{c}_{k} \dotp \vec{f}_{1}}{\norm{ \vec{f}_{1} }^2}\vec{f}_{1} - \frac{\vec{c}_{k} \dotp \vec{f}_{2}}{\norm{ \vec{f}_{2} }^2}\vec{f}_{2} - \dots - \frac{\vec{c}_{k} \dotp \vec{f}_{k-1}}{\norm{ \vec{f}_{k-1} }^2}\vec{f}_{k-1} \end{equation*} for each $k = 2, 3, \dots, n$. Now write $\vec{q}_{k} = \frac{1}{\norm{ \vec{f}_{k} }}\vec{f}_{k}$ for each $k$. Then $\vec{q}_{1}, \vec{q}_{2}, \dots, \vec{q}_{n}$ are orthonormal columns, and the above equation becomes \begin{equation*} \norm{ \vec{f}_{k} } \vec{q}_{k} = \vec{c}_{k} - (\vec{c}_{k} \dotp \vec{q}_{1})\vec{q}_{1} - (\vec{c}_{k} \dotp \vec{q}_{2})\vec{q}_{2} - \dots - (\vec{c}_{k} \dotp \vec{q}_{k-1})\vec{q}_{k-1} \end{equation*} Using these equations, express each $\vec{c}_{k}$ as a linear combination of the $\vec{q}_{j}$: \begin{equation*} \begin{array}{ccl} \vec{c}_{1} &=& \norm{ \vec{f}_{1} } \vec{q}_{1} \\ \vec{c}_{2} &=& (\vec{c}_{2} \dotp \vec{q}_{1})\vec{q}_{1} + \norm{ \vec{f}_{2} } \vec{q}_{2} \\ \vec{c}_{3} &=& (\vec{c}_{3} \dotp \vec{q}_{1})\vec{q}_{1} + (\vec{c}_{3} \dotp \vec{q}_{2})\vec{q}_{2} + \norm{ \vec{f}_{3} } \vec{q}_{3} \\ \vdots && \vdots \\ \vec{c}_{n} &=& (\vec{c}_{n} \dotp \vec{q}_{1})\vec{q}_{1} + (\vec{c}_{n} \dotp \vec{q}_{2})\vec{q}_{2} + (\vec{c}_{n} \dotp \vec{q}_{3})\vec{q}_{3} + \dots + \norm{ \vec{f}_{n} } \vec{q}_{n} \end{array} \end{equation*} These equations have a matrix form that gives the required factorization: \begin{align} A & = \left [ \begin{array}{ccccc} |&|&|& &| \\ \vec{c}_{1} & \vec{c}_{2} & \vec{c}_{3} &\cdots & \vec{c}_{n}\\ |&|&|& &| \end{array}\right ] \nonumber \\ &= \left [ \begin{array}{ccccc} |&|&|& &| \\ \vec{q}_{1} & \vec{q}_{2} & \vec{q}_{3} & \cdots & \vec{q}_{n}\\ |&|&|& &| \end{array}\right ] \left [ \begin{array}{ccccc} \norm{ \vec{f}_{1} } & \vec{c}_{2} \dotp \vec{q}_{1} & \vec{c}_{3} \dotp \vec{q}_{1} & \cdots & \vec{c}_{n} \dotp \vec{q}_{1} \\ 0 & \norm{ \vec{f}_{2} } & \vec{c}_{3} \dotp \vec{q}_{2} & \cdots & \vec{c}_{n} \dotp \vec{q}_{2} \\ 0 & 0 & \norm{ \vec{f}_{3} } & \cdots & \vec{c}_{n} \dotp \vec{q}_{3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \norm{ \vec{f}_{n} } \end{array} \right ] \label{matrixFactEq} \end{align}

Here the first factor has orthonormal columns, and the second factor is an upper triangular matrix with positive diagonal entries (and so is invertible). We record this in the following theorem.

The matrices $Q$ and $R$ in Theorem th:QR-025133 are uniquely determined by $A$; we return to this below.
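To make the construction concrete, here is a minimal computational sketch of the process above in Python with NumPy; the function name `qr_factor` is our own illustrative choice, not part of the text:

```python
import numpy as np

def qr_factor(A):
    """QR-factorization via the Gram-Schmidt process described above.

    A must have linearly independent columns. Returns Q with orthonormal
    columns and R upper triangular with positive diagonal, so that A = QR.
    """
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(n):
        f = A[:, k].copy()                # start from column c_k
        for j in range(k):
            R[j, k] = Q[:, j] @ A[:, k]   # entry c_k . q_j of R
            f -= R[j, k] * Q[:, j]        # subtract the projection onto q_j
        R[k, k] = np.linalg.norm(f)       # diagonal entry ||f_k||
        Q[:, k] = f / R[k, k]             # q_k = f_k / ||f_k||
    return Q, R
```

For instance, `Q, R = qr_factor(np.array([[1., 1.], [0., 1.], [1., 0.]]))` yields `Q` with orthonormal columns and `R` upper triangular with positive diagonal entries, and `Q @ R` reproduces the input; NumPy's built-in `np.linalg.qr` computes the same factorization up to signs (see the uniqueness discussion below).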

If a matrix $A$ has independent rows and we apply QR-factorization to $A^{T}$, the result is:

Since a square matrix with orthonormal columns is orthogonal, we have

We now take the time to prove the uniqueness of the QR-factorization.

Proof
Write $Q = \left [ \begin{array}{cccc} \vec{c}_{1} & \vec{c}_{2} & \cdots & \vec{c}_{n} \end{array}\right ]$ and $Q_{1} = \left [ \begin{array}{cccc} \vec{d}_{1} & \vec{d}_{2} & \cdots & \vec{d}_{n} \end{array}\right ]$ in terms of their columns, and observe first that $Q^{T}Q = I_{n} = Q_{1}^{T}Q_{1}$ because $Q$ and $Q_{1}$ have orthonormal columns. Hence it suffices to show that $Q_{1} = Q$ (then $R_{1} = Q_{1}^{T}A = Q^{T}A = R$). Since $Q_{1}^{T}Q_{1} = I_{n}$, the equation $QR = Q_{1}R_{1}$ gives $Q_{1}^{T}Q = R_{1}R^{-1}$; for convenience we write this matrix as \begin{equation*} Q_{1}^TQ = R_{1}R^{-1} = \left [ \begin{array}{c} t_{ij} \end{array}\right ] \end{equation*} This matrix is upper triangular with positive diagonal elements (since this is true for $R_{1}$ and $R^{-1}$), so $t_{ii} > 0$ for each $i$ and $t_{ij} = 0$ if $i > j$. On the other hand, the $(i, j)$-entry of $Q_{1}^{T}Q$ is $\vec{d}_{i}^{T}\vec{c}_{j} = \vec{d}_{i} \dotp \vec{c}_{j}$, so we have $\vec{d}_{i} \dotp \vec{c}_{j} = t_{ij}$ for all $i$ and $j$. But each $\vec{c}_{j}$ lies in the span of $\{\vec{d}_{1}, \vec{d}_{2}, \dots, \vec{d}_{n}\}$ because $Q = Q_{1}(R_{1}R^{-1})$ expresses each column of $Q$ as a linear combination of the columns of $Q_{1}$. We know how to write a vector as a linear combination of an orthonormal basis (using Corollary cor:orthonormal from Orthogonality and Projections): \begin{equation*} \vec{c}_{j} = (\vec{d}_{1} \dotp \vec{c}_{j})\vec{d}_{1} + (\vec{d}_{2} \dotp \vec{c}_{j})\vec{d}_{2} + \dots + (\vec{d}_{n} \dotp \vec{c}_{j})\vec{d}_{n} = t_{1j}\vec{d}_{1} + t_{2j}\vec{d}_{2} + \dots + t_{jj}\vec{d}_{j} \end{equation*} because $t_{ij} = 0$ if $i > j$. The first few equations here are \begin{equation*} \begin{array}{ccl} \vec{c}_{1} &=& t_{11}\vec{d}_{1} \\ \vec{c}_{2} &=& t_{12}\vec{d}_{1} + t_{22}\vec{d}_{2} \\ \vec{c}_{3} &=& t_{13}\vec{d}_{1} + t_{23}\vec{d}_{2} + t_{33}\vec{d}_{3} \\ \vec{c}_{4} &=& t_{14}\vec{d}_{1} + t_{24}\vec{d}_{2} + t_{34}\vec{d}_{3} + t_{44}\vec{d}_{4} \\ \vdots && \vdots \end{array} \end{equation*} The first of these equations gives $1 = \norm{ \vec{c}_{1} } = \norm{ t_{11}\vec{d}_{1} } = t_{11}$, whence $\vec{c}_{1} = \vec{d}_{1}$. But then we have $t_{12} = \vec{d}_{1} \dotp \vec{c}_{2} = \vec{c}_{1} \dotp \vec{c}_{2} = 0$, so the second equation becomes $\vec{c}_{2} = t_{22}\vec{d}_{2}$. Now a similar argument gives $t_{22} = 1$, and then $\vec{c}_{2} = \vec{d}_{2}$ and $t_{13} = 0 = t_{23}$ follows in the same way. Hence $\vec{c}_{3} = t_{33}\vec{d}_{3}$ and $\vec{c}_{3} = \vec{d}_{3}$. Continue in this way to get $\vec{c}_{i} = \vec{d}_{i}$ for all $i$. This proves that $Q_{1} = Q$.
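Uniqueness can also be observed numerically: a library routine may return an $R$ with negative diagonal entries, but once the signs are fixed the result must agree with the Gram-Schmidt factorization. A small sketch, assuming NumPy and the illustrative `qr_factor` function from the sketch above:

```python
import numpy as np

A = np.array([[1., 1.], [0., 1.], [1., 0.]])
Q1, R1 = np.linalg.qr(A)            # library QR; diag(R1) may be negative
D = np.diag(np.sign(np.diag(R1)))   # diagonal matrix with entries +-1
Q1, R1 = Q1 @ D, D @ R1             # now R1 has positive diagonal entries
Q2, R2 = qr_factor(A)               # Gram-Schmidt construction from above
# By the uniqueness just proved, the two factorizations coincide:
assert np.allclose(Q1, Q2) and np.allclose(R1, R2)
```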

QR-Algorithm for approximating eigenvalues

In The Power Method and the Dominant Eigenvalue, we learned about an iterative method for computing eigenvalues. We also mentioned that a better method for approximating the eigenvalues of an invertible matrix $A$ depends on the QR-factorization of $A$. While it is beyond the scope of this book to pursue a detailed discussion of this method, we give an example and conclude with some remarks on the QR-algorithm. The interested reader is referred to J. H. Wilkinson, The Algebraic Eigenvalue Problem (Oxford, England: Oxford University Press, 1965) or G. W. Stewart, Introduction to Matrix Computations (New York: Academic Press, 1973).

The QR-algorithm uses QR-factorization repeatedly to create a sequence of matrices $A_{1} = A, A_{2}, A_{3}, \dots$ as follows: factor $A_{1} = Q_{1}R_{1}$ and define $A_{2} = R_{1}Q_{1}$; factor $A_{2} = Q_{2}R_{2}$ and define $A_{3} = R_{2}Q_{2}$; and, in general, factor $A_{k} = Q_{k}R_{k}$ and define $A_{k+1} = R_{k}Q_{k}$. Since $A_{k+1} = R_{k}Q_{k} = Q_{k}^{-1}A_{k}Q_{k}$, each $A_{k+1}$ is similar to $A_{k}$, and hence every $A_{k}$ has the same eigenvalues as $A$. If the eigenvalues of $A$ are real and have distinct absolute values, the sequence $A_{1}, A_{2}, A_{3}, \dots$ converges to an upper triangular matrix with these eigenvalues on the main diagonal.
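As a rough illustration of this iteration (our own sketch, not the text's example), the loop can be written in a few lines of Python with NumPy:

```python
import numpy as np

def qr_algorithm(A, steps=100):
    """Basic (unshifted) QR-algorithm: factor A_k = Q_k R_k, set A_{k+1} = R_k Q_k.

    Each A_k is similar to A, so the eigenvalues are preserved; when the
    eigenvalues are real with distinct absolute values, A_k approaches an
    upper triangular matrix with the eigenvalues on its diagonal.
    """
    Ak = np.asarray(A, dtype=float)
    for _ in range(steps):
        Q, R = np.linalg.qr(Ak)  # factor A_k = Q_k R_k
        Ak = R @ Q               # define A_{k+1} = R_k Q_k
    return Ak
```

For example, `np.diag(qr_algorithm(np.array([[2., 1.], [1., 2.]])))` approximates the eigenvalues $3$ and $1$.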

Shifting.

We first learned about the concept of shifting in The Power Method and the Dominant Eigenvalue. Convergence is accelerated if, at stage $k$ of the algorithm, a number $\tau_{k}$ is chosen and $A_{k} - \tau_{k}I$ is factored in the form $Q_{k}R_{k}$ rather than $A_{k}$ itself. Then \begin{equation*} Q_{k}^{-1}A_{k}Q_{k} = Q_{k}^{-1}(Q_{k}R_{k} + \tau _{k}I)Q_{k} = R_{k}Q_{k} + \tau _{k}I \end{equation*} so we take $A_{k+1} = R_{k}Q_{k} + \tau_{k}I$. If the shifts $\tau_{k}$ are carefully chosen, convergence can be greatly improved.
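A single shifted step, under the same assumptions as the sketches above, might look as follows; the shift $\tau_{k} = (A_{k})_{nn}$ (the bottom-right entry) used here is one common heuristic of our choosing, not a prescription of the text:

```python
import numpy as np

def shifted_qr_step(Ak):
    """One shifted QR step: factor A_k - tau_k I = Q_k R_k, return R_k Q_k + tau_k I."""
    n = Ak.shape[0]
    tau = Ak[-1, -1]                      # illustrative shift choice (assumption)
    Q, R = np.linalg.qr(Ak - tau * np.eye(n))
    return R @ Q + tau * np.eye(n)        # equals Q_k^{-1} A_k Q_k, as shown above
```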

Preliminary Preparation.

A matrix such as \begin{equation*} \left [ \begin{array}{rrrrr} \ast & \ast & \ast & \ast & \ast \\ \ast & \ast & \ast & \ast & \ast \\ 0 & \ast & \ast & \ast & \ast \\ 0 & 0 & \ast & \ast & \ast \\ 0 & 0 & 0 & \ast & \ast \\ \end{array}\right ] \end{equation*} is said to be in upper Hessenberg form, and the QR-factorizations of such matrices are greatly simplified. Given an $n \times n$ matrix $A$, a series of orthogonal matrices $H_{1}, H_{2}, \dots, H_{m}$ (called Householder matrices) can be easily constructed such that \begin{equation*} B = H_{m}^T \cdots H_{1}^TAH_{1} \cdots H_{m} \end{equation*} is in upper Hessenberg form. Then the QR-algorithm can be efficiently applied to $B$ and, because $B$ is similar to $A$, it produces the eigenvalues of $A$.
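The following is a minimal sketch of such a reduction built from Householder matrices, assuming NumPy; the function name `hessenberg` is our own, and in practice a library routine such as `scipy.linalg.hessenberg` would be used instead:

```python
import numpy as np

def hessenberg(A):
    """Reduce A to upper Hessenberg form B = H_m^T ... H_1^T A H_1 ... H_m,
    where each H_j = I - 2 v v^T (v a unit vector) is a Householder matrix."""
    B = np.asarray(A, dtype=float).copy()
    n = B.shape[0]
    for j in range(n - 2):
        # Choose v so the reflection zeroes the entries below the
        # subdiagonal in column j.
        x = B[j + 1:, j].copy()
        v = x.copy()
        v[0] += np.sign(x[0] if x[0] != 0 else 1.0) * np.linalg.norm(x)
        norm_v = np.linalg.norm(v)
        if norm_v == 0:
            continue  # column already in the required form
        v /= norm_v
        H = np.eye(n)
        H[j + 1:, j + 1:] -= 2.0 * np.outer(v, v)
        B = H.T @ B @ H  # H is symmetric and orthogonal, so this is a similarity
    return B
```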

Complex Eigenvalues.

If some of the eigenvalues of a real matrix $A$ are not real, the QR-algorithm converges to a block upper triangular matrix where the diagonal blocks are either $1 \times 1$ (the real eigenvalues) or $2 \times 2$ (each providing a pair of conjugate complex eigenvalues of $A$).
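For example, if a $2 \times 2$ diagonal block turns out to be $\left [ \begin{array}{rr} 0 & -1 \\ 1 & 0 \end{array}\right ]$ (an illustrative block of our choosing), its characteristic polynomial is $x^{2} + 1$, so it contributes the conjugate pair $\pm i$ of eigenvalues of $A$.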

Practice Problems

In each case find the QR-factorization of $A$.
(a)
(b)
(c)
(d)

Click the arrow to see answer.

(a)
,
(b)
,
If $R$ is upper triangular and invertible, show that there exists a diagonal matrix $D$ with diagonal entries $\pm 1$ such that $R_{1} = DR$ is invertible, upper triangular, and has positive diagonal entries.
If $A$ has independent columns, let $A = QR$ where $Q$ has orthonormal columns and $R$ is invertible and upper triangular. (Some authors do not require a QR-factorization to have positive diagonal entries.) Show that there is a diagonal matrix $D$ with diagonal entries $\pm 1$ such that $A = (QD)(DR)$ is the QR-factorization of $A$.
See Practice Problem prob:take_diag_positive
In each case, find the exact eigenvalues and then approximate them using the QR-algorithm.
(a)
(b)

Click the arrow to see answer.

(a)
Eigenvalues ,

, ,
,
,


If $A$ is symmetric, show that each matrix $A_{k}$ in the QR-algorithm is also symmetric. Deduce that they converge to a diagonal matrix.

Click the arrow to see answer.

Use induction on $k$. If $k = 1$, $A_{1} = A$ is symmetric. In general $A_{k+1} = Q_{k}^{-1}A_{k}Q_{k} = Q_{k}^{T}A_{k}Q_{k}$, so the fact that $A_{k}^{T} = A_{k}$ implies $A_{k+1}^{T} = A_{k+1}$. The eigenvalues of $A$ are all real (Theorem cor:ews_symmetric_real), so the $A_{k}$ converge to an upper triangular matrix $T$. But $T$ must also be symmetric (it is the limit of symmetric matrices), so it is diagonal.

Apply the QR-algorithm to . Explain.
Given a matrix $A$, let $A_{k}$, $Q_{k}$, and $R_{k}$, $k \geq 1$, denote the matrices constructed in the QR-algorithm. Show that $A^{k} = (Q_{1}Q_{2} \cdots Q_{k})(R_{k} \cdots R_{2}R_{1})$ for each $k \geq 1$ and hence that this is a QR-factorization of $A^{k}$.
Show that $Q_{k}R_{k} = R_{k-1}Q_{k-1}$ for each $k \geq 2$, and use this equality to compute $(Q_{1}Q_{2} \cdots Q_{k})(R_{k} \cdots R_{2}R_{1})$ “from the centre out.” Use the fact that $(AB)^{n+1} = A(BA)^{n}B$ for any square matrices $A$ and $B$.

Text Source

This section was adapted from Sections 8.4 and 8.5 of Keith Nicholson’s Linear Algebra with Applications. (CC-BY-NC-SA)

W. Keith Nicholson, Linear Algebra with Applications, Lyryx 2018, Open Edition, pp. 437–445.
