N Ways to Do Linear Regression

程序员文章站 2022-05-22 13:06:14

...

1. Naive Solution

We want to find $\hat{x}$ that minimizes:

$$\| A \hat{x} - b \|_{2}$$

Another way to think about this is that we are interested in where the vector $b$ is closest to the subspace spanned by the columns of $A$ (called the range of $A$). $A \hat{x}$ is the projection of $b$ onto the range of $A$. Since $b - A \hat{x}$ must be perpendicular to the subspace spanned by $A$, we see that

$$A^{T} (b - A \hat{x}) = 0$$

(we are using $A^{T}$ because we want to multiply each column of $A$ by $b - A \hat{x}$).

This leads us to the normal equations:

$$\hat{x} = (A^{T} A)^{-1} A^{T} b$$

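However, this approach is not very practical, because it requires explicitly computing a matrix inverse, which is expensive and can be numerically unstable.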

import numpy as np

def ls_naive(A, b):
    # Not practical: this explicitly inverts A^T A, which is costly
    return np.linalg.inv(A.T @ A) @ A.T @ b

coeffs_naive = ls_naive(trn_int, y_trn)
regr_metrics(y_test, test_int @ coeffs_naive)
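The perpendicularity argument behind the normal equations can be checked numerically. The sketch below is an illustration only: it uses small synthetic data (not the article's `trn_int` / `y_trn`) and solves $A^{T} A x = A^{T} b$ with `np.linalg.solve` rather than forming the inverse.

```python
import numpy as np

# Synthetic data, assumed purely for illustration
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 3))
b = rng.standard_normal(50)

# Solve the normal equations A^T A x = A^T b without an explicit inverse
x_hat = np.linalg.solve(A.T @ A, A.T @ b)

# The residual b - A x_hat should be perpendicular to every column of A,
# so each entry of A^T (b - A x_hat) should be close to zero
residual = b - A @ x_hat
orthogonality = A.T @ residual
print(np.abs(orthogonality).max())
```

If the printed value is at the level of floating-point roundoff, the residual is orthogonal to the range of $A$, as the derivation requires.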

2. Cholesky Factorization

Normal equations:

$$A^{T} A x = A^{T} b$$

If $A$ has full column rank, $A^{T} A$ is a square, Hermitian positive definite matrix.

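[Note] Cholesky factorization requires a positive definite matrix! It is fast and works well, but the situations where it applies are limited.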

The standard way of solving such a system is Cholesky Factorization, which finds upper-triangular R such that $A^{T} A = R^{T} R$ . ( $R^{T}$ is lower-triangular )

import scipy.linalg

AtA = A.T @ A
Atb = A.T @ b

R = scipy.linalg.cholesky(AtA)

# check our factorization: the norm should be close to zero
np.linalg.norm(AtA - R.T @ R)

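Q: Why does rewriting the system as $R^{T} R x = A^{T} b$ make it simpler?

As mentioned earlier, a linear system with an upper-triangular matrix is much easier to solve (the last row has only one unknown).

So first solving $R^{T} w = A^{T} b$ and then solving $R x = w$ is easier.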

$$\begin{aligned} A^{T} A x &= A^{T} b \\ R^{T} R x &= A^{T} b \\ R^{T} w &= A^{T} b \\ R x &= w \end{aligned}$$
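To make the "last row has only one unknown" point concrete, here is a minimal back-substitution sketch. The helper `back_substitution` and the small matrix are hypothetical illustrations; in practice `scipy.linalg.solve_triangular` does this job.

```python
import numpy as np

def back_substitution(R, y):
    """Solve R x = y for an upper-triangular R, working from the last row up."""
    n = R.shape[0]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):
        # Row i involves only x[i] and the already-computed x[i+1:]
        x[i] = (y[i] - R[i, i + 1:] @ x[i + 1:]) / R[i, i]
    return x

# Hypothetical 3x3 upper-triangular system
R = np.array([[2.0, 1.0, -1.0],
              [0.0, 3.0, 2.0],
              [0.0, 0.0, 4.0]])
y = np.array([1.0, 12.0, 12.0])

x = back_substitution(R, y)
print(x)  # → [1. 2. 3.]
```

Each row costs at most $n$ multiply-adds, which is where the $n^{2}$ cost of a triangular solve comes from.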

def ls_chol(A, b):
    R = scipy.linalg.cholesky(A.T @ A)
    w = scipy.linalg.solve_triangular(R, A.T @ b, trans='T')
    return scipy.linalg.solve_triangular(R, w)

%timeit coeffs_chol = ls_chol(trn_int, y_trn)
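As a hedged variation on the same idea, SciPy also provides `cho_factor` and `cho_solve`, which bundle the factorization and the two triangular solves. The sketch below uses synthetic data (an assumption, not the article's dataset) and also shows what happens when the matrix is not positive definite, which is the limitation noted above.

```python
import numpy as np
import scipy.linalg

# Synthetic full-column-rank problem, assumed for illustration
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 4))
b = rng.standard_normal(40)

# Factor A^T A once, then solve A^T A x = A^T b with the factor
factor = scipy.linalg.cho_factor(A.T @ A)
x_chol = scipy.linalg.cho_solve(factor, A.T @ b)

# A symmetric but indefinite matrix (eigenvalues 3 and -1):
# Cholesky factorization is expected to fail here
indefinite = np.array([[1.0, 2.0],
                       [2.0, 1.0]])
try:
    scipy.linalg.cholesky(indefinite)
    cholesky_failed = False
except np.linalg.LinAlgError:
    cholesky_failed = True

print(x_chol, cholesky_failed)
```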

3. QR Factorization

Factor $A = Q R$, where $Q$ has orthonormal columns and $R$ is upper-triangular.

$$\begin{aligned} A x &= b \\ A &= Q R \\ Q R x &= b \\ Q^{T} Q &= I \\ Q^{T} Q R x &= Q^{T} b \\ R x &= Q^{T} b \end{aligned}$$

Then once again, $R x = Q^{T} b$ is easier to solve than $A x = b$.

def ls_qr(A,b):
    Q, R = scipy.linalg.qr(A, mode='economic')
    return scipy.linalg.solve_triangular(R, Q.T @ b)

coeffs_qr = ls_qr(trn_int, y_trn)
regr_metrics(y_test, test_int @ coeffs_qr)
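The properties used in the QR derivation can be checked on a small example. This is a sketch on synthetic data (an assumption for illustration), with `np.linalg.lstsq` used only as a reference solution.

```python
import numpy as np
import scipy.linalg

# Synthetic tall matrix, assumed for illustration
rng = np.random.default_rng(2)
A = rng.standard_normal((30, 4))
b = rng.standard_normal(30)

# Economy-size QR: Q is 30x4 with orthonormal columns, R is 4x4 upper-triangular
Q, R = scipy.linalg.qr(A, mode='economic')

# Q^T Q should be the identity, and R should have zeros below the diagonal
qtq_is_identity = np.allclose(Q.T @ Q, np.eye(4))
r_is_upper = np.allclose(R, np.triu(R))

# Solve R x = Q^T b and compare with a reference least-squares solution
x_qr = scipy.linalg.solve_triangular(R, Q.T @ b)
x_ref = np.linalg.lstsq(A, b, rcond=None)[0]
print(qtq_is_identity, r_is_upper, np.allclose(x_qr, x_ref))
```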

4. SVD

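The problem is the same as before: solve for $x$. $U$ has orthonormal columns, $\Sigma$ is diagonal, and $V$ has orthonormal rows.

$$\begin{aligned} A x &= b \\ A &= U \Sigma V \\ \Sigma V x &= U^{T} b \\ \Sigma w &= U^{T} b \\ x &= V^{T} w \end{aligned}$$

Solving $\Sigma w = U^{T} b$ is even simpler than solving a system with the upper-triangular matrix $R$ (each row has only one unknown). Then $V x = w$ can be solved directly using orthonormality: $x = V^{T} w$.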

SVD gives the pseudo-inverse.

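def ls_svd(A, b):
    # Reduced SVD: A = U @ diag(sigma) @ Vh
    U, sigma, Vh = scipy.linalg.svd(A, full_matrices=False, lapack_driver='gesdd')
    # Solve the diagonal system Sigma w = U^T b elementwise
    w = (U.T @ b) / sigma
    # Vh has orthonormal rows, so x = Vh^T w
    return Vh.T @ w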

coeffs_svd = ls_svd(trn_int, y_trn)
regr_metrics(y_test, test_int @ coeffs_svd)
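The statement that the SVD gives the pseudo-inverse can be illustrated by assembling it from the factors and comparing with `np.linalg.pinv`. This is a sketch on synthetic full-column-rank data (an assumption); a rank-deficient $A$ would need the tiny singular values handled separately.

```python
import numpy as np
import scipy.linalg

# Synthetic full-column-rank matrix, assumed for illustration
rng = np.random.default_rng(3)
A = rng.standard_normal((25, 3))
b = rng.standard_normal(25)

# Reduced SVD: A = U @ diag(sigma) @ Vh
U, sigma, Vh = scipy.linalg.svd(A, full_matrices=False)

# Pseudo-inverse assembled from the factors: Vh^T diag(1/sigma) U^T
A_pinv = Vh.T @ np.diag(1.0 / sigma) @ U.T

# Applying it to b gives the least-squares solution
x_svd = A_pinv @ b
print(np.allclose(A_pinv, np.linalg.pinv(A)))
```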

5. Timing Comparison

Matrix Inversion: $2 n^{3}$

Matrix Multiplication: $n^{3}$

Cholesky: $\frac{1}{3} n^{3}$

QR, Gram Schmidt: $2 m n^{2}$ , $m \geq n$ (chapter 8 of Trefethen)

QR, Householder: $2 m n^{2} - \frac{2}{3} n^{3}$ (chapter 10 of Trefethen)

Solving a triangular system: $n^{2}$
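The operation counts above can be compared against wall-clock time with a rough sketch like the following. The matrix size, the seed, and the number of repetitions are arbitrary assumptions, and measured times depend heavily on the machine and the BLAS/LAPACK build, so no particular ranking is claimed here.

```python
import timeit
import numpy as np
import scipy.linalg

def ls_naive(A, b):
    return np.linalg.inv(A.T @ A) @ A.T @ b

def ls_chol(A, b):
    R = scipy.linalg.cholesky(A.T @ A)
    w = scipy.linalg.solve_triangular(R, A.T @ b, trans='T')
    return scipy.linalg.solve_triangular(R, w)

def ls_qr(A, b):
    Q, R = scipy.linalg.qr(A, mode='economic')
    return scipy.linalg.solve_triangular(R, Q.T @ b)

def ls_svd(A, b):
    U, sigma, Vh = scipy.linalg.svd(A, full_matrices=False)
    return Vh.T @ ((U.T @ b) / sigma)

# Synthetic, well-conditioned problem (an assumption for illustration)
rng = np.random.default_rng(4)
A = rng.standard_normal((2000, 50))
b = rng.standard_normal(2000)

solvers = {"naive": ls_naive, "chol": ls_chol, "qr": ls_qr, "svd": ls_svd}
repeats = 5

# Average seconds per call for each solver, plus the solution it returns
timings = {name: timeit.timeit(lambda f=f: f(A, b), number=repeats) / repeats
           for name, f in solvers.items()}
solutions = {name: f(A, b) for name, f in solvers.items()}

for name, seconds in timings.items():
    print(f"{name:>6}: {seconds * 1e3:.3f} ms")
```

On a well-conditioned problem like this one, all four solvers should return essentially the same coefficients; the differences show up in speed and, for ill-conditioned $A$, in accuracy.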

Why Cholesky Factorization is Fast:

(source: Stanford Convex Optimization: Numerical Linear Algebra Background Slides)
