Lanczos algorithm
The Lanczos algorithm is a popular method to find a zero vector in the process of the quadratic sieve. It is supposed to be one of the most efficient ways of finding a zero vector, which is a crucial part of the Quadratic Sieve and Continued Fraction factoring algorithms.
Another Lanczos algorithm is an iterative algorithm that is an adaptation of power methods to find eigenvalues and eigenvector of a square matrix or the singular value decomposition of a rectangular matrix. It is particularly useful for finding decompositions of very large sparse matrices. In Latent Semantic Indexing, for instance, matrices relating millions of documents to hundreds of thousands of terms must be reduced to singular value form.
The power method for finding the largest eigenvalue of a matrix can be summarized by noting that if is a random vector and , then .
If is the eigenvalue decomposition of , then . As gets very large, the diagonal matrix of eigenvalues will be dominated by whichever eigenvalue is largest (neglecting the case where there are large equal eigenvalues, of course). As this happens, will converge to the largest eigenvalue and to the associated eigenvector. If the largest eigenvalue is multiple, then will converge to a vector in the subspace spanned by the eigenvectors associated with those largest eigenvalues. After you get the first eigenvector/value, you can successively restrict the algorithm to the null space of the known eigenvectors to get the other eigenvector/values.
In practice, this simple algorithm does not work very well for computing very many of the eigenvectors because any round-off error will tend to introduce slight components of the more significant eigenvectors back into the computation, degrading the accuracy of the computation. Pure power methods also can converge slowly, even for the first eigenvector.
Lanczos algorithms are modifications of the basic power algorithm in which each is restricted to be orthogonal to all the previous values. In the course of constructing these vectors, the normalizing constants used are assembled into a tridiagonal matrix whose most significant eigenvalues quickly converge to the eigenvalues of the original system.
Variations on the Lanczos algorithm exist where the vectors involved are tall, narrow matrices instead of vectors and the normalizing constants are small square matrices. These are called "block" Lanczos algorithms and can be much faster on computers with large numbers of registers and long memory fetch times.
Lanczos algorithms are very attractive because the multiplication by is the only large scale linear operation. Since weighted-term text retrieval engines implement just this operation, the Lanczos algorithm can be applied efficiently to text documents (see Latent Semantic Indexing). Eigenvectors are also important for the analysis of other large scale text retrieval methods such as the HITS algorithm from IBM, or the PageRank algorithm used at one time by Google.
See also
External links
- Golub and van Loan give very good descriptions of the various forms of Lanczos algorithms in their book Matrix Computations
- Andrew Ng et al., an analysis of PageRank
- Lanczos and conjugate gradient methods B. A. LaMacchia and A. M. Odlyzko, Solving Large Sparse Linear Systems Over Finite Fields