Basic Fix Point Iteration

This exercise is about using Basic Fix Point Iteration and its variants on a Linear Eigenvalue Problem \[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x} \] and to find the highest eigenvalue, the lowest eigenvalue and a specific eigenvalue closest to a constant \(\lambda \).

Problem

Consider the matrix \(\mathbf{A}\)

\[ \mathbf{A} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 5 & 6 & 7 \\ 1 & 1 & 8 & 9 \\ 1 & 1 & 1 & 0 \end{pmatrix} \] with approximate eigenvalues \(\lambda \) \[\lambda_1 \approx 11.8789 \] \[\lambda_2 \approx 2.8673 \] \[\lambda_3 \approx - 1.4668 \] \[\lambda_4 \approx 0.7206 \] Implemen Fix Point Iteration (Power Integration), Inverse iteration and Shift-and-invert to find the largest, smallest and the negative eigenvalue.

Using one of these methods find an eigenvalue of the matrix \(\mathbf{B}\) \[ \mathbf{B} = \frac{1}{5} \left(\begin{matrix} 3 & 4 & 0 \\ -4 & 3 & 0 \\ 0 & 0 & 5 \end{matrix} \right) \]

Theory

Fix Point Iteration (Power Iteration)

If we apply the matrix \(\mathbf{A} \) to an arbitrary vector \(\mathbf{y}\), what happens?

First we can expand the vector in eigenbasis \[ \mathbf{y} = \sum_{n} \alpha_n \mathbf{x}_n \] and if applying to the matrix \(\mathbf{A} \) \[ \mathbf{A} \mathbf{y} = \sum_{n} \alpha_n \mathbf{A} \mathbf{x}_n = \sum_{n} \alpha_n \lambda_n \mathbf{x}_n \] So applying a matrix to an arbitrary vector amplifies the eigenvectors with large eigenvalues that contribute to the vector.

The algorithm becomes:

Pick an arbitrary normalized vector \(\mathbf{y}\), which means \(\vert \vert \mathbf{y} \vert \vert = 1\)
Calculate \( \mathbf{x} = \mathbf{A} \mathbf{y} \) and normalize it, so \(\frac{\mathbf{x}}{\vert \vert \mathbf{x} \vert \vert} \rightarrow \mathbf{x} \)
If the change \( \vert \vert \mathbf{x} - \mathbf{y} \vert \vert \) is sufficiently small, then \(\mathbf{x}\) is an approximate eigenvector and terminate (other termination conditions possible). Otherwise set \(\mathbf{y} = \mathbf{x}\) and go back to step 2.
From the eigenvector find the corresponding eigenvalue \(\lambda \) via the Rayleigh quotient, which is the fraction of two inner products \[ \lambda = \frac{\langle \mathbf{y} , \mathbf{A} \mathbf{y} \rangle}{\langle \mathbf{y} , \mathbf{y} \rangle} \]

Fix point iteration will find the eigenvalue with the greatest norm that was contained in the original starting vector.

Inverse Power Iteration

If we wish to change the eigenvalues but not the eigenvectors of a matrix (so-called spectral transformations), we can apply an matrix inversion \[\mathbf{A} \mathbf{x} = \lambda \mathbf{x} \ \ \rightarrow \ \ \mathbf{A}^{-1} \mathbf{x} = \lambda^{-1} \mathbf{x} \] So we apply the fix point iteration to the inverse matrix \(\tilde{\mathbf{A}} = \mathbf{A}^{-1} \).

So that means, that the largest eigenvalue \(\tilde{\lambda} \) of \(\mathbf{\tilde{A}} \) is always the smallest eigenvalue

Shift-and-Invert (Shifted Inverse Iteration)

Another spectral transformation is adding a constant \(\mu \) to the diagonal: \[\mathbf{A} \mathbf{x} = \lambda \mathbf{x} \ \ \rightarrow \ \ \left(\mathbf{A} + \mu \mathbf{I} \right) \mathbf{x} = \mathbf{A} \mathbf{x} + \mu \mathbf{x} = \lambda \mathbf{x} + \mu \mathbf{x} = \left(\lambda + \mu \right) \mathbf{x} \] We can combine the spectral transformation by adding the constant \(\mu \) to the diagonal and then invert that matrix. \[ \tilde{\mathbf{A}} = \left(\mathbf{A} - \mu \mathbf{I} \right)^{-1} \] The largest eigenvalue \(\tilde{\lambda} \) of \(\mathbf{\tilde{A}} \) is the eigenvalue \(\lambda = \mu + \tilde{\lambda}^{-1} \) of \(\mathbf{A} \) that is closest to \(\mu \)

Implementation

This implementation involves three iterative methods for finding eigenvalues and eigenvectors of a matrix \(\mathbf{A}\)

Fix Point Iteration (Power Iteration): To find the highest eigenvalue
Inverse Power Iteration: To find the smallest eigenvalue
Shift-and-Invert (Shifted Inverse Iteration): To find the eigenvalue closest to a constant \(\mu \)

The theories and algorithms for these 3 methods are described in the Theory section.

The different considerations for all the algorithms are:

Matrix Initialization: e.g. for Python, we can initialize the matrix \(\mathbf{A}\) using NumPy's np.array function
Random Initial Vector: We have to initialize a random vector \(\mathbf{y}\). And it has to be normalized before starting the iterations to ensure numerical stability.
Normalization: In each iteration, the vector \(\mathbf{y}\) is normalized to prevent numerical overflow or underflow, which ensures the algorithm's stability.
Eigenvalue Calculation: To calculate the eigenvalue \(\lambda\), we calculate the Rayleigh quotient (see Theory Section), This is a fraction between two inner products. The equation can be written in a simpler way (more on that later).
Iteration and Convergence: You can decide between two approaches. Either by iterate for a fixed number and see if it converges or make a condition for a given change in the eigenvalue (for the most precise, use machine precision) and if met, then terminate.

The specific considerations for the Inverse Power Iteration and Shift-and-Invert are

Matrix Inversion (Inserve Iteration): For the inverse iteration, the inverse of the matrix \(\mathbf{A}\) is calculated using a library, e.g. in Python using np.linalg.inv. Ensure that the matrix \(\mathbf{A}\) is non-singular (i.e., it has an inverse).
Matrix Inversion (Shift-and-Invert): For the shift-and-invert iteration, the matrix \( \left(\mathbf{A} - \mu \mathbf{I}\right) \) is inverted. The shift parameter \(\mu\) should be chosen, such that \( \left(\mathbf{A} - \mu \mathbf{I}\right) \) is invertible.

What I did:

Matrix and Random Vector Inilization: For the matrix \(\mathbf{A}\) just write it (for Python: np.array() ) and for the random vector \(\mathbf{y}\) use some kind of library for the random vector (for Python: np.random.rand() ).
Normalization: Use som kind of library for normalization of vectors (for Python: np.linalg.norm() )
Eigenvalue Calculation: The Rayleigh quotient for the calculation of the eigenvalue \(\lambda \) was originally stated in the Theory Section to be: \[ \lambda = \frac{\langle \mathbf{y} , \mathbf{A} \mathbf{y} \rangle}{\langle \mathbf{y} , \mathbf{y} \rangle} \] But inner product \(\langle \mathbf{A} , \mathbf{B} \rangle \) can also be written as the transpose of the first vector/matrix dotted with the second vector/matrix \(\mathbf{A}^{\intercal} \mathbf{B}\).
So using this we can rewrite the Rayleigh quotient:
\[ \lambda = \frac{\mathbf{y}^{\intercal} \mathbf{A} \mathbf{y}}{\mathbf{y}^{\intercal} \mathbf{y}} \]
Iteration and Convergence: For this simple exercise, just use a fixed number of iterations, since it converges fast.
Matrix Inversion and the shift \(\mu\): The matrix inversion is done by using a library (for Python np.linalg.inv() ) and the shift \(\mu\) must be chosen wisely. We have to look at the eigenvalues again. The shift-and-invert has to find the negative eigenvalue, so we have to choose the shift \(\mu \) to be closest to \(\lambda_4 \approx - 1.4668 \).

Results

Implementing the three iterations methods (variations os Fix Point Iterations):

Fix Point Iteration (Power Iteration): To find the highest eigenvalue
Inverse Power Iteration: To find the smallest eigenvalue
Shift-and-Invert (Shifted Inverse Iteration): To find the eigenvalue closest to a constant \(\mu \)

For the matrix \(\mathbf{A}\) \[ \mathbf{A} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 5 & 6 & 7 \\ 1 & 1 & 8 & 9 \\ 1 & 1 & 1 & 0 \end{pmatrix} \] with approximate eigenvalues \(\lambda \) \[\lambda_1 \approx 11.8789 \] \[\lambda_2 \approx 2.8673 \] \[\lambda_3 \approx - 1.4668 \] \[\lambda_4 \approx 0.7206 \] The convergence plot (with 10 iterations) showing the three different iteration methods are shown in Figure 1.

The results from the command line are shown in Figure 2 and corresponds with the given values of \(\lambda \)

Figure 1: Plot of basic fix point iterations.

Figure 2: Results of basic fix point iterations.

For the matrix \(\mathbf{B}\) \[ \mathbf{B} = \frac{1}{5} \left(\begin{matrix} 3 & 4 & 0 \\ -4 & 3 & 0 \\ 0 & 0 & 5 \end{matrix} \right) \]

The convergence plot (with 10 iterations) showing the three different iteration methods are shown in Figure 3. The results from the command line are shown in Figure 4

Code

Show/Hide Python Code

import numpy as np
import matplotlib.pyplot as plt

# Clear all variables
A = np.array([
    [1, 2, 3, 4],
    [1, 5, 6, 7],
    [1, 1, 8, 9],
    [1, 1, 1, 0]
])
# A = np.array([[3, 4, 0], [-4, 3, 0], [0, 0, 1]]) / 5

iter = 10

N = A.shape[0]

# Fix point iteration:
y = np.random.rand(N)
lA = []
change = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = A @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lA.append(lambda_)
    change.append(np.linalg.norm(y - x / np.linalg.norm(x)))
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Inverse iteration:
B = np.linalg.inv(A)
y = np.random.rand(N)
lB = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = B @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lB.append(lambda_)
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Shift-and-invert:
mu = -1.4
# mu = 1j
C = np.linalg.inv(A - mu * np.eye(N))
y = np.random.rand(N)
lC = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = C @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lC.append(lambda_)
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Plotting
plt.figure()
# plt.plot(change)
plt.plot(lA, 'ko-', label='Fix Point Iteration')
plt.plot(lB, 'bo-', label='Inverse Iteration')
plt.plot(lC, 'ro-', label='Shift-and-Invert Iteration')
plt.legend()
plt.show()

Show/Hide MATLAB Code

clear all;
close all;

A = [1 2 3 4; 1 5 6 7; 1 1 8 9; 1 1 1 0];
%A = [3 4 0; -4 3 0; 0 0 1] / 5;
iter = 100;

N = size(A, 1);

% fix point iteration:
y = rand(N, 1);
lA = [];
change = [];
for ii = 1:iter
  y = y / norm(y);
  x = A * y;
  lambda = y' * (A * y) / (y' * y);
  lA = [lA; lambda];
  change = [change, norm(y - x/norm(x))];
  y = x;
end
y' * (A * y) / (y' * y)


% inverse iteration:
B = inv(A);
y = rand(N, 1);
lB = [];
for ii = 1:iter
  y = y / norm(y);
  x = B * y;
  lambda = y' * (A * y) / (y' * y);
  lB = [lB; lambda];
  y = x;
end
y' * (A * y) / (y' * y)


% shift-and-invert:
mu = -1.4;
%mu = 1i;
C = inv(A - mu * eye(N));
y = rand(N, 1);
lC = [];
for ii = 1:iter
  y = y / norm(y);
  x = C * y;
  lambda = y' * (A * y) / (y' * y);
  lC = [lC; lambda];
  y = x;
end
y' * (A * y) / (y' * y)

hold on;
%plot(change);
plot(lA, 'ko-')
plot(lB, 'bo-')
plot(lC, 'ro-')

Show/Hide C++ Code

#include < iostream >
#include < Eigen/Dense >
#include < vector >
#include "matplotlibcpp.h"

namespace plt = matplotlibcpp;
using namespace Eigen;
using namespace std;

int main() {
    // Define the matrix A
    MatrixXd A(4, 4);
    A << 1, 2, 3, 4,
         1, 5, 6, 7,
         1, 1, 8, 9,
         1, 1, 1, 0;

    int iter = 100;
    int N = A.rows();

    // Fix point iteration
    VectorXd y = VectorXd::Random(N);
    vector lA;
    vector change;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = A * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lA.push_back(lambda);
        change.push_back((y - x / x.norm()).norm());
        y = x;
    }
    cout << "Fix Point Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Inverse iteration
    MatrixXd B = A.inverse();
    y = VectorXd::Random(N);
    vector lB;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = B * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lB.push_back(lambda);
        y = x;
    }
    cout << "Inverse Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Shift-and-invert
    double mu = -1.4;
    MatrixXd C = (A - mu * MatrixXd::Identity(N, N)).inverse();
    y = VectorXd::Random(N);
    vector lC;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = C * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lC.push_back(lambda);
        y = x;
    }
    cout << "Shift-and-Invert Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Plotting
    plt::figure();
    plt::plot(lA, "ko-", {{"label", "Fix Point Iteration"}});
    plt::plot(lB, "bo-", {{"label", "Inverse Iteration"}});
    plt::plot(lC, "ro-", {{"label", "Shift-and-Invert Iteration"}});
    plt::legend();
    plt::show();

    return 0;
}

Basic Fix Point Iteration

Intro

Fix Point Iteration (Power Iteration)

Inverse Power Iteration

Shift-and-Invert (Shifted Inverse Iteration)

Code