Algorithms: Krylov subspaces, Arnoldi iteration and extreme eigenvalues

The most powerful iterative solvers for linear equations and eigenvalues are based in part on the types of operations performed in the Lanczos algorithm. Once a matrix has been transformed by similarity transformations into a triangular form

    \[O\cdot M\cdot O^{-1}=T\]

the whole game is over; eigenvalues and inverses can be extracted easily (we did so in an earlier post). Unfortunately it is not possible in general to bring an arbitrary M into triangular form T by a finite sequence of similarity transformations, and not every matrix can even be diagonalized; this is discussed in our section on Jordan normal forms. However, any matrix M can be put into Hessenberg form H by an orthogonal similarity transformation.

H is lower-Hessenberg if H_{ij}=0 for all j>i+1, and upper-Hessenberg if H_{ij}=0 for all i>j+1. In another post we will discuss Householder transformations, which provide a way of constructing the orthogonal transformations that put M into Hessenberg form.
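
As a quick illustration, here is a sketch using the stock hessenberg routine in Julia's LinearAlgebra (which performs the reduction with Householder reflectors); the matrix A and its size are arbitrary choices for the demo:

using LinearAlgebra

A = randn(6, 6)
F = hessenberg(A)            # orthogonal similarity reduction of A
H = Matrix(F.H)              # the upper-Hessenberg factor, densified
Q = Matrix(F.Q)              # the accumulated orthogonal transformation
istriu(H, -1)                # true: H vanishes below the first subdiagonal
Q*H*Q' ≈ A                   # true: H is orthogonally similar to A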

Arnoldi iteration uses Krylov subspaces to transform M into an upper-Hessenberg form H. Given an N\times N matrix M, start with a random vector \mathbf{v}_1 and create the vectors \{\mathbf{v}_1, \, M\cdot \mathbf{v}_1, \, M^2\cdot \mathbf{v}_1, \, \cdots, \, M^{m}\cdot \mathbf{v}_1\}. If this set of vectors is subjected to a Gram-Schmidt orthogonalization, the resulting unit vectors form an orthonormal basis for the Krylov subspace K_m(M, \mathbf{v}_1)=\mathrm{span}\{\mathbf{v}_1, \, M\cdot\mathbf{v}_1, \, \cdots, \, M^{m}\cdot\mathbf{v}_1\}, which is in general a proper subspace of \mathbb{R}^N since typically m\ll N.
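
For concreteness, here is a naive way to build such a basis; this is only a sketch, with a random matrix A standing in for M and arbitrary sizes N and m:

using LinearAlgebra

N, m = 100, 5
A  = randn(N, N)             # stand-in for M
v1 = randn(N)                # random starting vector

K = zeros(N, m+1)            # columns: v1, A*v1, A^2*v1, ..., A^m*v1
K[:,1] = v1
for k in 1:m
    K[:,k+1] = A*K[:,k]
end

Q = Matrix(qr(K).Q)          # Gram-Schmidt via QR: orthonormal basis of the Krylov subspace

Forming the powers explicitly like this is numerically fragile because the columns of K rapidly become nearly parallel; the Arnoldi iteration described below avoids this by interleaving the orthogonalization with the matrix-vector products.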

Upper-Hessenberg matrices have properties that make inversion and eigenvalue extraction simple. Given an arbitrary, possibly complex N\times N matrix M with N potentially very large, the direct approach of finding a unitary O such that M=O\cdot H\cdot O^\dagger with H upper-Hessenberg (the full factorization M\cdot O=O\cdot H) might be out of the question, so instead one works with only the leftmost m columns, where m\ll N. Then in the partial factorization

    \[M\cdot O_m=O_{m+1}\cdot H_m\]

the O_m matrix is N\times m, of the form O_m=(\mathbf{v}_1, \, \mathbf{v}_2, \, \cdots, \, \mathbf{v}_m) where each vector \mathbf{v}_i is an N-vector (after all, on the left M acts on N-vectors as a transformation). On the right-hand side H_m is also non-square: it is the (m+1)\times m upper-left corner of the full H matrix. The whole equation looks like this:

    \[M\cdot (\mathbf{v}_1, \, \mathbf{v}_2, \, \cdots, \, \mathbf{v}_m)=(\mathbf{v}_1, \, \mathbf{v}_2, \, \cdots, \, \mathbf{v}_m, \, \mathbf{v}_{m+1})\cdot \left(\begin{array}{cccc} H_{1,1} & H_{1,2} & \cdots & H_{1,m}\\  H_{2,1} & H_{2,2} & \cdots & H_{2,m}\\ 0 & H_{3,2} & & \vdots\\ \vdots & & \ddots & \\ 0 & \cdots & 0 & H_{m+1,m}\end{array}\right)\]

Solve this for the vector \mathbf{v}_{m+1}:

    \[\mathbf{v}_{m+1}=\Big(M\cdot \mathbf{v}_{m}-H_{1,m}\, \mathbf{v}_{1}-\cdots-H_{m,m}\, \mathbf{v}_{m}\Big)/H_{m+1,m}\]

Does this look familiar? It says that \mathbf{v}_{m+1} is M\cdot\mathbf{v}_{m} subjected to Gram-Schmidt: made perpendicular to all \mathbf{v}_{k} for k<m+1 and then normalized, because these are columns of an orthogonal or unitary matrix O. This is the Arnoldi iteration: we simply Gram-Schmidt the Krylov vectors, starting from some initial \mathbf{v}_1 which we can choose arbitrarily. The whole process is inexpensive because we are working with vectors, and a relatively small collection of them since m\ll N. It looks very much like QR decomposition, with Q\rightleftharpoons O and R\rightleftharpoons H, except that the overall cost is lower.

Let’s do this in julia…

M=[1 2 3 4 5;2 3 4 5 6;3 4 5 6 7;4 5 6 7 8;5 6 7 8 9]
5×5 Matrix{Int64}:
 1  2  3  4  5
 2  3  4  5  6
 3  4  5  6  7
 4  5  6  7  8
 5  6  7  8  9


m=3                      # build a 3-dimensional Krylov subspace
N=size(M,1)
O=zeros(N,m+1)           # columns will hold the orthogonalized Krylov vectors
H=zeros(m+1,m)           # the (m+1) x m upper-Hessenberg matrix
O[:,1]=ones(N)           # starting vector v1 (this unnormalized choice reproduces the H shown below)

for j in 1:m
    tmp=M*O[:,j]                 # next Krylov vector
    for k in 1:j
        H[k,j]=(O[:,k])'*tmp     # Gram-Schmidt coefficient
        tmp=tmp-O[:,k]*H[k,j]    # remove the component along O[:,k]
    end
    H[j+1,j]=sqrt(tmp'*tmp)      # length of what is left over
    O[:,j+1]=tmp/H[j+1,j]        # normalize to get the next basis vector
end
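
As a quick sanity check (with the variables from the loop above still in scope), the partial factorization M\cdot O_m=O_{m+1}\cdot H_m can be verified directly; up to roundoff this should print true:

M*O[:,1:m] ≈ O*H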

Arnoldi iteration is the process of computing M^k\cdot\mathbf{v} for some random vector \mathbf{v}, for k=0,1,2,\cdots, m, and orthogonalizing these vectors via Gram-Schmidt (in practice the orthogonalization is interleaved with the matrix-vector products, as in the loop above). So what does this get us?
Note that if we remove the last row of H_m we get a square m\times m Hessenberg matrix, which represents M compressed onto the Krylov subspace.

The eigenvalues of H_m sans row m+1 are excellent approximations to the extremal (largest-magnitude) eigenvalues of M.

Because H_m is quite small compared to M and has the advantage of being Hessenberg, application of the QR algorithm to it converges rapidly. For example, the eigenvalues of the matrix M used in the code above are 26.861, -1.8614, 0, 0, 0. Trim away the last row of H for our m=3 example:

HM=H[1:3,:]
3×3 Matrix{Float64}:
 125.0    -54.6472    -19.715
 224.165  -98.01      -35.2146
   0.0      0.140718   -1.99005

using LinearAlgebra
eigvals(HM)
3-element Vector{Float64}:
 -1.8614066163451404
  1.171336260510704e-13
 26.86140661634499

eigvals(M)
5-element Vector{Float64}:
 -1.8614066163450715
 -1.4886697450382702e-15
  9.132579802232772e-18
  2.1028713993215595e-16
 26.86140661634507

and we have excellent agreement with the two largest-magnitude eigenvalues of M. These estimates are called the Ritz values. The Ritz values converge rather quickly (as m increases) to the extremal eigenvalues of M, provided that these eigenvalues are isolated, in the sense that they are not too close to the rest of the spectrum. Hessenberg matrices, and especially tridiagonal matrices, are very easy to do eigenvalue extractions on with the QR algorithm; only a few cycles are needed. A standard preliminary step in the eigenvalue business is to reduce a matrix to Hessenberg or tridiagonal form and then turn it over to QR. The Arnoldi iteration performs this preliminary step very efficiently if one is only interested in extremal eigenvalues.
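
To see this convergence on a less degenerate example, the loop above can be packaged into a function and applied to a larger matrix. The sketch below is only illustrative: the helper name arnoldi, the size 500, and m=30 are arbitrary choices, and here the starting vector is normalized. The extreme Ritz values should track the extreme eigenvalues of the random symmetric test matrix, with the agreement improving as m grows and being best when the extreme eigenvalues are well separated from the rest of the spectrum.

using LinearAlgebra

# Helper packaging the loop above: m Arnoldi steps starting from v1
# (normalized here), returning O (N x (m+1)) and the (m+1) x m Hessenberg H.
function arnoldi(M, v1, m)
    N = length(v1)
    O = zeros(N, m+1)
    H = zeros(m+1, m)
    O[:,1] = v1/sqrt(v1'*v1)
    for j in 1:m
        tmp = M*O[:,j]
        for k in 1:j
            H[k,j] = O[:,k]'*tmp
            tmp -= O[:,k]*H[k,j]
        end
        H[j+1,j] = sqrt(tmp'*tmp)
        O[:,j+1] = tmp/H[j+1,j]
    end
    return O, H
end

Nbig = 500
A = randn(Nbig, Nbig); A = A + A'          # random symmetric test matrix
_, Ha = arnoldi(A, randn(Nbig), 30)        # m = 30, far smaller than Nbig
ritz = eigvals(Ha[1:30, 1:30])             # Ritz values from the square part of Ha
extrema(real(ritz)), extrema(eigvals(Symmetric(A)))   # compare extreme Ritz values with true extremes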
