The most powerful iterative solvers for linear equations and eigenvalues are based in part on the types of operations performed in the Lanczos algorithm. Once a matrix $A$ has been transformed by similarity transformations into a triangular form, the whole game is over; eigenvalues and inverses can be extracted easily (we did so in an earlier post). Unfortunately it is often not possible to get an arbitrary matrix $A$ into triangular form by similarity transformations, even if many sequential transformations are used. This is discussed in our section on Jordan normal forms. However, any matrix can be put into Hessenberg form by something much like a similarity transformation.
A matrix $A$ is lower-Hessenberg if $a_{ij}=0$ for all $j>i+1$, and upper-Hessenberg if $a_{ij}=0$ for all $i>j+1$; an upper-Hessenberg matrix is zero below its first subdiagonal. We will discuss Householder transformations, which provide you with a way of constructing orthogonal transformations that put $A$ into Hessenberg form, in another post.
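(In fact Julia's LinearAlgebra standard library already provides a Householder-based reduction, so as a quick preview, and quite apart from the Arnoldi construction developed below, one can do the following; the matrix A here is just a random example.)

using LinearAlgebra
A = rand(5, 5)
F = hessenberg(A)       # Householder reduction A = Q*H*Q'
H = Matrix(F.H)         # upper-Hessenberg: zeros below the first subdiagonal
Q = Matrix(F.Q)
norm(Q*H*Q' - A)        # ≈ 0, so H is unitarily similar to A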
Arnoldi iteration uses Krylov subspaces to transform $A$ into an upper-Hessenberg form $H$. Given an $n\times n$ matrix $A$, start with a random vector $\mathbf{b}$, and create the vectors $\mathbf{b}, A\mathbf{b}, A^{2}\mathbf{b}, \ldots, A^{m-1}\mathbf{b}$. If this set of vectors is subjected to a Gram-Schmidt orthogonalization, the resulting unit vectors $\mathbf{q}_1,\mathbf{q}_2,\ldots,\mathbf{q}_m$ form a basis for the Krylov subspace $\mathcal{K}_m=\operatorname{span}\{\mathbf{b},A\mathbf{b},\ldots,A^{m-1}\mathbf{b}\}$, which may be a proper subspace of $\mathbb{C}^{n}$ because typically $m\ll n$.
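To make the Krylov picture concrete, here is a tiny sketch (the names A, b, and K are arbitrary choices) that builds the Krylov matrix for $m=3$ and orthonormalizes its columns with a $QR$ factorization, which amounts to Gram-Schmidt. Forming the powers $A^k\mathbf{b}$ explicitly like this is numerically poor, which is precisely why Arnoldi orthogonalizes as it goes:

using LinearAlgebra
A = randn(5, 5)
b = randn(5)
K = hcat(b, A*b, A*(A*b))   # Krylov matrix (b | Ab | A²b), m = 3
Q = Matrix(qr(K).Q)         # 5×3 orthonormal basis for the Krylov subspace K_3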
Upper-Hessenberg matrices have properties that make inversion and eigenvalue extraction simple. Given an arbitrary, possibly complex $A$ ($n\times n$) with $n$ potentially very large, the direct approach of finding a unitary $Q$ such that $AQ=QH$ where $H$ is upper-Hessenberg (the full factorization $A=QHQ^{*}$) might be out of the question, so instead one tries to work with the leftmost $m$ columns $Q_m$ of $Q$, where $m\ll n$. Then in the factorization

$$AQ_m=Q_{m+1}\tilde{H}_m$$

the matrix $Q_{m+1}$ will be $n\times(m+1)$, being of the form

$$Q_{m+1}=\big(\mathbf{q}_1\,\big|\,\mathbf{q}_2\,\big|\,\cdots\,\big|\,\mathbf{q}_{m+1}\big)$$

where each vector $\mathbf{q}_j$ is an $n$-vector (after all, $A$ on the left acts on $n$-vectors as a transformation). On the right-hand side $\tilde{H}_m$ is also non-square; it is the $(m+1)\times m$ upper-left corner of the full matrix $H$. The whole equation looks like this;

$$A\big(\mathbf{q}_1\,\big|\,\cdots\,\big|\,\mathbf{q}_m\big)=\big(\mathbf{q}_1\,\big|\,\cdots\,\big|\,\mathbf{q}_{m+1}\big)\begin{pmatrix}h_{11}&h_{12}&\cdots&h_{1m}\\h_{21}&h_{22}&\cdots&h_{2m}\\0&h_{32}&\cdots&h_{3m}\\\vdots&&\ddots&\vdots\\0&\cdots&0&h_{m+1,m}\end{pmatrix}$$

The last column says that $A\mathbf{q}_m=h_{1m}\mathbf{q}_1+h_{2m}\mathbf{q}_2+\cdots+h_{mm}\mathbf{q}_m+h_{m+1,m}\mathbf{q}_{m+1}$. Solve this for the vector $\mathbf{q}_{m+1}$;

$$\mathbf{q}_{m+1}=\frac{A\mathbf{q}_m-\sum_{k=1}^{m}h_{km}\mathbf{q}_k}{h_{m+1,m}}$$

Does this look familiar? It says that $A\mathbf{q}_m$ is subjected to Gram-Schmidt: made perpendicular to all $\mathbf{q}_k$ for $k\le m$, then normalized, because these are columns of orthogonal or unitary matrices. This is Arnoldi iteration; we simply Gram-Schmidt the Krylov vectors, starting with some initial $\mathbf{q}_1$ which we can choose arbitrarily. This whole process is inexpensive because we are working with $n$-vectors, and a relatively small collection of them since $m\ll n$. The entire factorization looks very much like a $QR$ decomposition, with $Q_{m+1}$ in the role of $Q$ and $\tilde{H}_m$ in the role of $R$, except that the overall cost is lower.
Let’s do this in Julia…
M=[1 2 3 4 5;2 3 4 5 6;3 4 5 6 7;4 5 6 7 8;5 6 7 8 9]
5×5 Matrix{Int64}:
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
m = 3                     # dimension of the Krylov subspace, m ≪ n
O = zeros(5, m+1)         # columns will hold q_1, …, q_{m+1}
H = zeros(m+1, m)         # the (m+1)×m upper-Hessenberg matrix
O[:,1] = ones(5)          # start vector; the H printed below follows from this choice
for j in 1:m
    tmp = M*O[:,j]                # next Krylov direction, M*q_j
    for k in 1:j                  # Gram-Schmidt against q_1, …, q_j
        H[k,j] = O[:,k]'*tmp
        tmp = tmp - O[:,k]*H[k,j]
    end
    H[j+1,j] = sqrt(tmp'*tmp)     # length of what is left over
    O[:,j+1] = tmp/H[j+1,j]       # normalize to get q_{j+1}
end
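As a quick sanity check (this line is an addition, reusing the O and H just computed), the factorization $MQ_m=Q_{m+1}\tilde{H}_m$ holds to machine precision by construction:

M*O[:,1:m] ≈ O*H    # true: each column M*q_j is rebuilt from q_1, …, q_{j+1}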
Arnoldi iteration is the process of computing the vectors $A^{k}\mathbf{b}$, $k=0,1,\ldots,m-1$, for some random starting vector $\mathbf{b}$, and orthogonalizing these vectors via Gram-Schmidt. So what does this get us?
Note that if we remove the last row of $\tilde{H}_m$ we get a square $m\times m$ matrix $H_m$ which is a Hessenberg matrix similar to a "chunk" of $A$; indeed, when the $\mathbf{q}_k$ are orthonormal, $H_m=Q_m^{*}AQ_m$. The eigenvalues of $\tilde{H}_m$ sans its last row are excellent approximations to the largest eigenvalues of $A$.
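To check this relation numerically, we can re-run the loop above with a unit-norm start vector, so that the $\mathbf{q}_k$ really are orthonormal (a small sketch; the names Q and Ht and the start vector $\mathbf{e}_1$ are arbitrary choices):

Q = zeros(5, 4); Ht = zeros(4, 3)
Q[:,1] = [1.0, 0, 0, 0, 0]        # unit-norm start vector
for j in 1:3
    t = M*Q[:,j]
    for k in 1:j
        Ht[k,j] = Q[:,k]'*t
        t = t - Q[:,k]*Ht[k,j]
    end
    Ht[j+1,j] = sqrt(t'*t)
    Q[:,j+1] = t/Ht[j+1,j]
end
Ht[1:3,:] ≈ Q[:,1:3]'*M*Q[:,1:3]  # true: removing the last row leaves Q_m'*M*Q_m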
Because $H_m$ is quite small compared to $A$ and has the advantage of being Hessenberg, application of the $QR$ algorithm to $H_m$ will converge rapidly. For example, the eigenvalues of the matrix $M$ used in the code above are $26.8614\ldots$, $-1.8614\ldots$, and $0$ (with multiplicity three). Trim away the last row of $H$ for our example;
HM=H[1:3,:]
3×3 Matrix{Float64}:
125.0 -54.6472 -19.715
224.165 -98.01 -35.2146
0.0 0.140718 -1.99005
using LinearAlgebra
eigvals(HM)
3-element Vector{Float64}:
-1.8614066163451404
1.171336260510704e-13
26.86140661634499
eigvals(M)
5-element Vector{Float64}:
-1.8614066163450715
-1.4886697450382702e-15
9.132579802232772e-18
2.1028713993215595e-16
26.86140661634507
and we have excellent agreement with the two largest (in magnitude) eigenvalues of $M$. These estimates are called the Ritz values. The Ritz values converge rather quickly (as $m$ increases) on the extremal eigenvalues of $A$, provided that these eigenvalues are isolated, in the sense that the extremal eigenvalues are not too close together. Hessenberg matrices, and especially tridiagonal matrices, are very easy to do eigenvalue extractions on using the QR algorithm; only a few cycles are needed. A preliminary step in the eigenvalue business is therefore to reduce a matrix to Hessenberg or tridiagonal form, then turn it over to QR. The Arnoldi iteration performs this preliminary step very efficiently if one is only interested in extremal eigenvalues.
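To see how little work the QR algorithm needs here, we can run a bare-bones, unshifted QR iteration on HM (a sketch; the name qr_iterate is mine, and a production code would add shifts and deflation). Because the eigenvalues of HM are well separated in magnitude, the diagonal converges rapidly:

function qr_iterate(A, steps)
    T = copy(A)
    for _ in 1:steps
        F = qr(T)             # factor T = QR,
        T = F.R*Matrix(F.Q)   # then recombine as RQ = Q'TQ, a similarity transform
    end
    return T
end

diag(qr_iterate(HM, 50))  # ≈ [26.8614…, -1.8614…, ≈0]: the Ritz values reappear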