The Inverse Matrix Isn't as Good as You Think

Written by Kalan

If you have any questions or feedback, please fill out this form.

This post is translated by ChatGPT and originally written in Mandarin, so there may be some inaccuracies or mistakes.

Today, let's talk about a term that many are familiar with: the inverse matrix. We'll discuss why the inverse matrix isn't as beneficial as it might seem and why we should avoid using it in practical calculations.

Solving Linear Equations

Ax = b

In the previous article, we mentioned that we can directly compute the inverse of matrix A and then find x by calculating A^{-1}b.

While this sounds straightforward, in practice it is rare to solve a linear system by explicitly computing the inverse. Real-world matrices are seldom made of neat integers; they are full of floating-point numbers and have far larger dimensions. This is unlike the problems encountered in university, which typically involve at most 3x3 matrices with carefully chosen numbers that rarely produce decimals.
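To see the two routes side by side, NumPy already exposes both; a minimal sketch (the matrix and right-hand side below are made up for illustration):

import numpy as np

A = np.array([[3.0, 1.0],
              [1.0, 2.0]])
b = np.array([9.0, 8.0])

# Route 1: form the inverse explicitly, then multiply.
x_inv = np.linalg.inv(A) @ b

# Route 2: let the solver factor A internally, without forming the inverse.
x_solve = np.linalg.solve(A, b)

print(x_inv, x_solve)

For a small, well-behaved matrix both give the same answer; the difference shows up in cost and numerical behaviour as matrices grow, which is what the rest of this post is about.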

Inverse Matrix

There are two classic methods for finding the inverse of a matrix:

  • Using the adjugate matrix and the determinant to find the inverse:
A^{-1} = adj(A) / det(A)
  • Using the Gauss-Jordan elimination method to compute the inverse matrix.

Intuitively, computing the inverse via the adjugate matrix and determinant is far more cumbersome than Gauss-Jordan elimination, and in practice it does indeed cause problems for computers.

Because the formula contains 1/det(A), a determinant close to zero makes the entries of the result blow up, leading to large errors or even overflow. In addition, the adjugate requires breaking A down into smaller submatrices and computing their determinants, and naive cofactor expansion has O(n!) complexity. So, despite the formula appearing simple, it is not easy to compute.
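As a small illustration of the 1/det(A) issue, here is a made-up, nearly singular 2x2 matrix, with the adjugate written out by hand for the 2x2 case:

import numpy as np

# Two nearly identical rows make the matrix almost singular.
A = np.array([[1.0, 2.0],
              [1.0, 2.0 + 1e-12]])

det = np.linalg.det(A)                      # on the order of 1e-12
adj = np.array([[ A[1, 1], -A[0, 1]],
                [-A[1, 0],  A[0, 0]]])      # adjugate of a 2x2 matrix
A_inv = adj / det                           # entries blow up to roughly 1e12

print(det)
print(A_inv)

Tiny rounding errors in the determinant get magnified by the same factor of 1/det(A), which is why this route is numerically fragile.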

A more common method for finding the inverse matrix is to use the Gauss-Jordan elimination to simplify the augmented matrix:

[ A | I ]

by performing row operations to convert it to an upper triangular matrix, then reducing all non-diagonal elements to zero. At this point, the augmented matrix will be:

[ I | B ]

where B is the inverse of A. If we were to write this in Python (assuming the matrix has an inverse), it would look like this:

import numpy as np

def gauss_jordan(m):
    n = len(m)
    identity = np.eye(n)

    # Augmented matrix
    augmented = np.hstack((m, identity))

    for i in range(n):
        # checking if diagonal element is zero
        if augmented[i][i] == 0:
            # swapping with row below
            for j in range(i+1, n):
                if augmented[j][i] != 0:
                    augmented[[i, j]] = augmented[[j, i]]
                    break
        augmented[i] = augmented[i] / augmented[i, i]
        for j in range(0, n):
            if i != j:
                augmented[j] -= augmented[j, i] * augmented[i]

    # return the right side of the augmented matrix
    return augmented[:, n:]

m = np.array([[1., 2.],
              [3., 4.]])
inv_m = gauss_jordan(m)

The np.linalg.inv function in NumPy provides a built-in way to calculate the inverse, but here we created a simple version for demonstration purposes. From the perspective of time complexity, this can be seen as O(n^3).
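To sanity-check the hand-rolled version, we can compare it against NumPy's built-in routine (reusing m and inv_m from the snippet above):

print(inv_m)
print(np.allclose(inv_m, np.linalg.inv(m)))  # True if both agree within floating-point tolerance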

LU Decomposition

In computing practice, solving linear equations by explicitly forming the inverse matrix is uncommon. This may clash with the problem-solving habits formed in university: why go to the trouble of decomposing the matrix when we could simply use the inverse?

LU decomposition breaks matrix A down into two matrices, L and U: a lower triangular matrix and an upper triangular matrix. Here, L has ones on its diagonal and zeros above it; U has zeros below its diagonal.


A = LU

Once we have calculated LU, we can proceed with:

Ax = b
LUx = b
Let y = Ux. First solve Ly = b for y (forward substitution), then solve Ux = y for x (back substitution).

In NumPy, this implementation would look something like:

import numpy as np

def lu_decomposition(A):
    n = len(A)
    L = np.zeros((n, n))
    U = np.zeros((n, n))

    for i in range(n):
        L[i][i] = 1  # Diagonal elements of L are 1

        for j in range(i, n):
            # Compute the upper triangular matrix U
            summation = sum(L[i][k] * U[k][j] for k in range(i))
            U[i][j] = A[i][j] - summation

        for j in range(i + 1, n):
            # Compute the lower triangular matrix L
            summation = sum(L[j][k] * U[k][i] for k in range(i))
            L[j][i] = (A[j][i] - summation) / U[i][i]

    return L, U

def lu_solve(L, U, b):
    n = len(L)
    y = np.zeros(n)
    x = np.zeros(n)

    # Solving L * y = b
    for i in range(n):
        summation = sum(L[i][j] * y[j] for j in range(i))
        y[i] = (b[i] - summation) / L[i][i]

    # Solving U * x = y
    for i in range(n - 1, -1, -1):
        summation = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - summation) / U[i][i]

    return x

A = np.array([[1, 1, 1], [2, 3, 1], [7,4,2]])
b = np.array([3, 6, 13])

L, U = lu_decomposition(A)
x = lu_solve(L, U, b)
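A quick way to verify the result (reusing A, b, L, U, and x from above) is to check that the factors reproduce A and that x actually satisfies the original system:

print(np.allclose(L @ U, A))  # the factorization reproduces A
print(np.allclose(A @ x, b))  # x solves the original system Ax = b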

When comparing LU decomposition to the Gauss-Jordan elimination method, the cost of the factorization is roughly

O(n^3 / 3)

which is about a third of the O(n^3) work needed to compute the inverse with Gauss-Jordan elimination.

It may seem counterintuitive that LU decomposition, which adds the extra steps of solving Ly = b and Ux = y, still beats the inverse-matrix approach. The reason lies in the triangular structure of L and U: each of those solves takes only about O(n^2) operations.
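In practice you rarely hand-roll this either; SciPy exposes the same factor-once, solve-many workflow. A minimal sketch, assuming SciPy is available, using scipy.linalg.lu_factor and lu_solve:

import numpy as np
from scipy.linalg import lu_factor, lu_solve

A = np.array([[1., 1., 1.],
              [2., 3., 1.],
              [7., 4., 2.]])
b = np.array([3., 6., 13.])

lu, piv = lu_factor(A)        # roughly n^3/3 work, done once
x = lu_solve((lu, piv), b)    # roughly n^2 work per right-hand side
print(x)

The factorization is reused for every new right-hand side, which is exactly where the decomposition approach pays off.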

Conclusion

In terms of raw operation counts, computing the inverse with Gauss-Jordan elimination and performing an LU decomposition are not worlds apart. In practical applications, however, the systems we need to solve typically involve sparse matrices, where a large fraction of the entries are zero.

For this kind of problem, explicitly computing the inverse is costly, and worse, the inverse of a sparse matrix is generally dense, full of non-zero entries. With LU decomposition, much of the sparsity of A can be preserved in the L and U factors, which keeps the computation cheap.
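A rough way to see this with SciPy's sparse tools; a sketch with a made-up random sparse matrix, where the exact numbers depend on the sparsity pattern and on fill-in during factorization:

import scipy.sparse as sp
from scipy.sparse.linalg import splu, inv

n = 1000
# Random sparse matrix plus an identity to keep it comfortably non-singular.
A = sp.random(n, n, density=0.001, format='csc') + sp.identity(n, format='csc')

factor = splu(A)          # sparse LU factorization
A_inv = inv(A)            # explicit inverse, stored in sparse format

print(factor.L.nnz + factor.U.nnz)  # non-zeros in the L and U factors
print(A_inv.nnz)                    # typically far more non-zeros in the inverse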

Postscript

Many seemingly intuitive solutions are not necessarily the best ones. Finding the optimal solution often involves "looking at the world from a different perspective." For example, methods like the Fast Fourier Transform and Discrete Cosine Transform may initially appear convoluted, yet they are clever strategies for solving problems.

If you found this article helpful, please consider buying me a coffee ☕ It'll make my ordinary day shine ✨
