Basic Fix Point Iteration

Intro

This exercise is about using Basic Fix Point Iteration and its variants on a Linear Eigenvalue Problem \[ \mathbf{A} \mathbf{x} = \lambda \mathbf{x} \] and to find the highest eigenvalue, the lowest eigenvalue and a specific eigenvalue closest to a constant \(\lambda \).

Problem

Consider the matrix \(\mathbf{A}\)

\[ \mathbf{A} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 5 & 6 & 7 \\ 1 & 1 & 8 & 9 \\ 1 & 1 & 1 & 0 \end{pmatrix} \] with approximate eigenvalues \(\lambda \) \[\lambda_1 \approx 11.8789 \] \[\lambda_2 \approx 2.8673 \] \[\lambda_3 \approx - 1.4668 \] \[\lambda_4 \approx 0.7206 \] Implemen Fix Point Iteration (Power Integration), Inverse iteration and Shift-and-invert to find the largest, smallest and the negative eigenvalue.

Using one of these methods find an eigenvalue of the matrix \(\mathbf{B}\) \[ \mathbf{B} = \frac{1}{5} \left(\begin{matrix} 3 & 4 & 0 \\ -4 & 3 & 0 \\ 0 & 0 & 5 \end{matrix} \right) \]

Theory

Fix Point Iteration (Power Iteration)

If we apply the matrix \(\mathbf{A} \) to an arbitrary vector \(\mathbf{y}\), what happens?

First we can expand the vector in eigenbasis \[ \mathbf{y} = \sum_{n} \alpha_n \mathbf{x}_n \] and if applying to the matrix \(\mathbf{A} \) \[ \mathbf{A} \mathbf{y} = \sum_{n} \alpha_n \mathbf{A} \mathbf{x}_n = \sum_{n} \alpha_n \lambda_n \mathbf{x}_n \] So applying a matrix to an arbitrary vector amplifies the eigenvectors with large eigenvalues that contribute to the vector.

The algorithm becomes:

  1. Pick an arbitrary normalized vector \(\mathbf{y}\), which means \(\vert \vert \mathbf{y} \vert \vert = 1\)
  2. Calculate \( \mathbf{x} = \mathbf{A} \mathbf{y} \) and normalize it, so \(\frac{\mathbf{x}}{\vert \vert \mathbf{x} \vert \vert} \rightarrow \mathbf{x} \)
  3. If the change \( \vert \vert \mathbf{x} - \mathbf{y} \vert \vert \) is sufficiently small, then \(\mathbf{x}\) is an approximate eigenvector and terminate (other termination conditions possible). Otherwise set \(\mathbf{y} = \mathbf{x}\) and go back to step 2.
  4. From the eigenvector find the corresponding eigenvalue \(\lambda \) via the Rayleigh quotient, which is the fraction of two inner products \[ \lambda = \frac{\langle \mathbf{y} , \mathbf{A} \mathbf{y} \rangle}{\langle \mathbf{y} , \mathbf{y} \rangle} \]
Fix point iteration will find the eigenvalue with the greatest norm that was contained in the original starting vector.

Inverse Power Iteration

If we wish to change the eigenvalues but not the eigenvectors of a matrix (so-called spectral transformations), we can apply an matrix inversion \[\mathbf{A} \mathbf{x} = \lambda \mathbf{x} \ \ \rightarrow \ \ \mathbf{A}^{-1} \mathbf{x} = \lambda^{-1} \mathbf{x} \] So we apply the fix point iteration to the inverse matrix \(\tilde{\mathbf{A}} = \mathbf{A}^{-1} \).

So that means, that the largest eigenvalue \(\tilde{\lambda} \) of \(\mathbf{\tilde{A}} \) is always the smallest eigenvalue

Shift-and-Invert (Shifted Inverse Iteration)

Another spectral transformation is adding a constant \(\mu \) to the diagonal: \[\mathbf{A} \mathbf{x} = \lambda \mathbf{x} \ \ \rightarrow \ \ \left(\mathbf{A} + \mu \mathbf{I} \right) \mathbf{x} = \mathbf{A} \mathbf{x} + \mu \mathbf{x} = \lambda \mathbf{x} + \mu \mathbf{x} = \left(\lambda + \mu \right) \mathbf{x} \] We can combine the spectral transformation by adding the constant \(\mu \) to the diagonal and then invert that matrix. \[ \tilde{\mathbf{A}} = \left(\mathbf{A} - \mu \mathbf{I} \right)^{-1} \] The largest eigenvalue \(\tilde{\lambda} \) of \(\mathbf{\tilde{A}} \) is the eigenvalue \(\lambda = \mu + \tilde{\lambda}^{-1} \) of \(\mathbf{A} \) that is closest to \(\mu \)

Implementation

This implementation involves three iterative methods for finding eigenvalues and eigenvectors of a matrix \(\mathbf{A}\)

  1. Fix Point Iteration (Power Iteration): To find the highest eigenvalue
  2. Inverse Power Iteration: To find the smallest eigenvalue
  3. Shift-and-Invert (Shifted Inverse Iteration): To find the eigenvalue closest to a constant \(\mu \)
The theories and algorithms for these 3 methods are described in the Theory section.

The different considerations for all the algorithms are:

The specific considerations for the Inverse Power Iteration and Shift-and-Invert are

What I did:

Results

Implementing the three iterations methods (variations os Fix Point Iterations):

  1. Fix Point Iteration (Power Iteration): To find the highest eigenvalue
  2. Inverse Power Iteration: To find the smallest eigenvalue
  3. Shift-and-Invert (Shifted Inverse Iteration): To find the eigenvalue closest to a constant \(\mu \)
For the matrix \(\mathbf{A}\) \[ \mathbf{A} = \begin{pmatrix} 1 & 2 & 3 & 4 \\ 1 & 5 & 6 & 7 \\ 1 & 1 & 8 & 9 \\ 1 & 1 & 1 & 0 \end{pmatrix} \] with approximate eigenvalues \(\lambda \) \[\lambda_1 \approx 11.8789 \] \[\lambda_2 \approx 2.8673 \] \[\lambda_3 \approx - 1.4668 \] \[\lambda_4 \approx 0.7206 \] The convergence plot (with 10 iterations) showing the three different iteration methods are shown in Figure 1.

The results from the command line are shown in Figure 2 and corresponds with the given values of \(\lambda \)

Convergence of Eigenvalue Iteration Methods (Matrix A)
Plot of basic fix point iterations
Figure 1: Plot of basic fix point iterations.
Calculated Eigenvalues of Matrix A
Results of basic fix point iterations
Figure 2: Results of basic fix point iterations.
For the matrix \(\mathbf{B}\) \[ \mathbf{B} = \frac{1}{5} \left(\begin{matrix} 3 & 4 & 0 \\ -4 & 3 & 0 \\ 0 & 0 & 5 \end{matrix} \right) \]

The convergence plot (with 10 iterations) showing the three different iteration methods are shown in Figure 3. The results from the command line are shown in Figure 4

Convergence of Eigenvalue Iteration Methods (Matrix B)
Plot of basic fix point iterations
Figure 3: Plot of basic fix point iterations.
Calculated Eigenvalues of Matrix B
Results of basic fix point iterations
Figure 4: Results of basic fix point iterations.

Code

Show/Hide Python Code
import numpy as np
import matplotlib.pyplot as plt

# Clear all variables
A = np.array([
    [1, 2, 3, 4],
    [1, 5, 6, 7],
    [1, 1, 8, 9],
    [1, 1, 1, 0]
])
# A = np.array([[3, 4, 0], [-4, 3, 0], [0, 0, 1]]) / 5

iter = 10

N = A.shape[0]

# Fix point iteration:
y = np.random.rand(N)
lA = []
change = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = A @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lA.append(lambda_)
    change.append(np.linalg.norm(y - x / np.linalg.norm(x)))
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Inverse iteration:
B = np.linalg.inv(A)
y = np.random.rand(N)
lB = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = B @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lB.append(lambda_)
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Shift-and-invert:
mu = -1.4
# mu = 1j
C = np.linalg.inv(A - mu * np.eye(N))
y = np.random.rand(N)
lC = []
for _ in range(iter):
    y = y / np.linalg.norm(y)
    x = C @ y
    lambda_ = (y.T @ (A @ y)) / (y.T @ y)
    lC.append(lambda_)
    y = x
print((y.T @ (A @ y)) / (y.T @ y))

# Plotting
plt.figure()
# plt.plot(change)
plt.plot(lA, 'ko-', label='Fix Point Iteration')
plt.plot(lB, 'bo-', label='Inverse Iteration')
plt.plot(lC, 'ro-', label='Shift-and-Invert Iteration')
plt.legend()
plt.show()
    
Show/Hide MATLAB Code
clear all;
close all;

A = [1 2 3 4; 1 5 6 7; 1 1 8 9; 1 1 1 0];
%A = [3 4 0; -4 3 0; 0 0 1] / 5;
iter = 100;

N = size(A, 1);

% fix point iteration:
y = rand(N, 1);
lA = [];
change = [];
for ii = 1:iter
  y = y / norm(y);
  x = A * y;
  lambda = y' * (A * y) / (y' * y);
  lA = [lA; lambda];
  change = [change, norm(y - x/norm(x))];
  y = x;
end
y' * (A * y) / (y' * y)


% inverse iteration:
B = inv(A);
y = rand(N, 1);
lB = [];
for ii = 1:iter
  y = y / norm(y);
  x = B * y;
  lambda = y' * (A * y) / (y' * y);
  lB = [lB; lambda];
  y = x;
end
y' * (A * y) / (y' * y)


% shift-and-invert:
mu = -1.4;
%mu = 1i;
C = inv(A - mu * eye(N));
y = rand(N, 1);
lC = [];
for ii = 1:iter
  y = y / norm(y);
  x = C * y;
  lambda = y' * (A * y) / (y' * y);
  lC = [lC; lambda];
  y = x;
end
y' * (A * y) / (y' * y)

hold on;
%plot(change);
plot(lA, 'ko-')
plot(lB, 'bo-')
plot(lC, 'ro-')
            
Show/Hide C++ Code
#include < iostream >
#include < Eigen/Dense >
#include < vector >
#include "matplotlibcpp.h"

namespace plt = matplotlibcpp;
using namespace Eigen;
using namespace std;

int main() {
    // Define the matrix A
    MatrixXd A(4, 4);
    A << 1, 2, 3, 4,
         1, 5, 6, 7,
         1, 1, 8, 9,
         1, 1, 1, 0;

    int iter = 100;
    int N = A.rows();

    // Fix point iteration
    VectorXd y = VectorXd::Random(N);
    vector lA;
    vector change;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = A * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lA.push_back(lambda);
        change.push_back((y - x / x.norm()).norm());
        y = x;
    }
    cout << "Fix Point Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Inverse iteration
    MatrixXd B = A.inverse();
    y = VectorXd::Random(N);
    vector lB;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = B * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lB.push_back(lambda);
        y = x;
    }
    cout << "Inverse Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Shift-and-invert
    double mu = -1.4;
    MatrixXd C = (A - mu * MatrixXd::Identity(N, N)).inverse();
    y = VectorXd::Random(N);
    vector lC;

    for (int i = 0; i < iter; ++i) {
        y.normalize();
        VectorXd x = C * y;
        double lambda = (y.transpose() * A * y) / (y.transpose() * y);
        lC.push_back(lambda);
        y = x;
    }
    cout << "Shift-and-Invert Iteration Eigenvalue: " << (y.transpose() * A * y) / (y.transpose() * y) << endl;

    // Plotting
    plt::figure();
    plt::plot(lA, "ko-", {{"label", "Fix Point Iteration"}});
    plt::plot(lB, "bo-", {{"label", "Inverse Iteration"}});
    plt::plot(lC, "ro-", {{"label", "Shift-and-Invert Iteration"}});
    plt::legend();
    plt::show();

    return 0;
}