Andrew Ng Machine Learning, Exercise 1 (EX1), Part 2: Linear Regression with Multiple Variables
程序员文章站
2022-06-16 20:18:12
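2 Linear regression with multiple variables
2.1 Assignment overview
In this part, you will implement linear regression with multiple variables to predict house prices. Suppose you are selling your house and want to know what a good market price would be. One way to find out is to first collect information on recently sold houses and build a model of housing prices.
The file ex1data2.txt (please download the dataset yourself) contains a training set of housing prices in Portland, Oregon. The first column is the size of the house (in square feet), the second column is the number of bedrooms, and the third column is the price of the house.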
2.2 Importing modules
import matplotlib.pyplot as plt
import numpy as np
from featureNormalize import *  # feature normalization module
from gradientDescent import *  # batch gradient descent module
from normalEqn import *  # normal equation module
2.3 Loading the data
plt.ion()
# ===================== Part 1: Feature Normalization =====================
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size
2.4 Inspecting the first 10 training examples and target values
# Print out some data points
print('First 10 examples from the dataset: ')
for i in range(0, 10):
    print('x = {}, y = {}'.format(X[i], y[i]))
First 10 examples from the dataset:
x = [2104 3], y = 399900
x = [1600 3], y = 329900
x = [2400 3], y = 369000
x = [1416 2], y = 232000
x = [3000 4], y = 539900
x = [1985 4], y = 299900
x = [1534 3], y = 314900
x = [1427 3], y = 198999
x = [1380 3], y = 212000
x = [1494 3], y = 242500
2.5 Feature normalization function (featureNormalize.py)
import numpy as np

def feature_normalize(X):
    # Standardize each feature (column) to zero mean and unit standard deviation
    mu = np.mean(X, axis=0)  # per-feature mean, taken across the examples (axis 0)
    sigma = np.std(X, axis=0)  # per-feature standard deviation, taken across the examples
    X_norm = (X - mu) / sigma  # normalize the examples
    return X_norm, mu, sigma
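As a quick sanity check, the sketch below applies the same z-score computation to a tiny made-up matrix (the numbers are illustrative and are not taken from ex1data2.txt) and confirms that every column ends up with mean 0 and standard deviation 1. Note that np.std uses the population standard deviation (ddof=0) by default.

```python
import numpy as np

def feature_normalize(X):
    # Same computation as in featureNormalize.py above
    mu = np.mean(X, axis=0)
    sigma = np.std(X, axis=0)
    return (X - mu) / sigma, mu, sigma

# Made-up examples: [size in sq-ft, number of bedrooms]
X_demo = np.array([[1000.0, 2.0],
                   [2000.0, 3.0],
                   [3000.0, 4.0]])

X_norm, mu, sigma = feature_normalize(X_demo)
print(mu)                    # per-column means: [2000.    3.]
print(X_norm.mean(axis=0))   # approximately 0 for every column
print(X_norm.std(axis=0))    # approximately 1 for every column
```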
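2.6 Normalizing the training examples
a. Normalization does not include the bias unit; add the bias column after normalizing.
b. Only the input features are normalized; the target values (prices) are left unchanged.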
# Scale features and set them to zero mean
X, mu, sigma = feature_normalize(X)
X = np.c_[np.ones(m), X] # Add a column of ones to X
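2.7 Computing the cost and updating theta with batch gradient descent
Batch gradient descent works the same way for one variable and for multiple variables, and so does the cost function; see Part 1 of ex1 for details.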
# Choose some alpha value
alpha = 0.03
num_iters = 400
# Init theta and Run Gradient Descent
theta = np.zeros(3)
theta, J_history = gradient_descent_multi(X, y, theta, alpha, num_iters)
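The post does not show the body of gradient_descent_multi, so the following is only a minimal sketch of what such a function might look like, assuming the usual vectorized cost J(theta) = (1/2m) * sum((X theta - y)^2) and update theta := theta - (alpha/m) * X^T (X theta - y). The helper name compute_cost and the synthetic demo data are assumptions for illustration, not the author's code.

```python
import numpy as np

def compute_cost(X, y, theta):
    # Hypothetical helper: squared-error cost J(theta) = (1 / 2m) * sum(residual^2)
    m = y.size
    residual = X.dot(theta) - y
    return residual.dot(residual) / (2 * m)

def gradient_descent_multi(X, y, theta, alpha, num_iters):
    # Assumed implementation of batch gradient descent (not shown in the post)
    m = y.size
    theta = theta.astype(float)          # work on a float copy
    J_history = np.zeros(num_iters)
    for i in range(num_iters):
        gradient = X.T.dot(X.dot(theta) - y) / m
        theta = theta - alpha * gradient  # update all parameters simultaneously
        J_history[i] = compute_cost(X, y, theta)
    return theta, J_history

# Synthetic, noise-free demo data generated from known parameters [1, 2, -3]
rng = np.random.default_rng(0)
X_demo = np.c_[np.ones(50), rng.normal(size=(50, 2))]
y_demo = X_demo.dot(np.array([1.0, 2.0, -3.0]))

theta_demo, J_demo = gradient_descent_multi(X_demo, y_demo, np.zeros(3),
                                            alpha=0.1, num_iters=1000)
print(theta_demo)           # should be close to [1, 2, -3]
print(J_demo[0], J_demo[-1])  # the cost should shrink over the iterations
```

If the cost in J_history grows instead of shrinking, the learning rate alpha is probably too large for the scale of the features.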
2.8 Plotting the cost over the training iterations
# Plot the convergence graph
plt.figure()
plt.plot(np.arange(J_history.size), J_history)
plt.xlabel('Number of iterations')
plt.ylabel('Cost J')
Text(0, 0.5, 'Cost J')
2.9 Printing theta after batch gradient descent
# Display gradient descent's result
print('Theta computed from gradient descent : \n{}'.format(theta))
Theta computed from gradient descent :
[340410.91897274 109162.68848142 -6293.24735132]
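2.10 Predicting a house price with the learned theta
Normalize the query example with the mean and standard deviation computed from the training set: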
# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
# Recall that the first column of X is all-ones. Thus, it does
# not need to be normalized.
x_p = np.array([1650, 3])
x_p_nor = (x_p - mu) / sigma
Prepend the bias unit (1) to the normalized query and compute the prediction:
price = np.dot(np.r_[1, x_p_nor], theta)  # dot product of two 1-D arrays gives a scalar
Print the predicted price:
print('Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : %0.3f' % (price))
Predicted price of a 1650 sq-ft, 3 br house (using gradient descent) : 293142.433
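The prediction steps above (normalize with the training statistics, prepend the bias unit, take the dot product) can be wrapped in one small helper. This is only a sketch: the function name predict_price is hypothetical, and the values of mu, sigma and theta below are round made-up numbers chosen for easy arithmetic, not the ones computed from ex1data2.txt.

```python
import numpy as np

def predict_price(x_raw, mu, sigma, theta):
    # Hypothetical helper: apply the training-set mu and sigma to the query,
    # prepend the bias unit, and return the prediction as a plain float.
    x_norm = (np.asarray(x_raw, dtype=float) - mu) / sigma
    return float(np.r_[1.0, x_norm].dot(theta))

# Made-up statistics and parameters, for illustration only
mu = np.array([2000.0, 3.0])
sigma = np.array([800.0, 0.75])
theta = np.array([340000.0, 110000.0, -6000.0])

price = predict_price([1650, 3], mu, sigma, theta)
print(price)  # 340000 + 110000 * (-0.4375) + (-6000) * 0 = 291875.0
```

The key point is that the query must be scaled with the same mu and sigma that were used for training; recomputing them from the query itself would give a meaningless input.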
2.11 Computing theta with the normal equation
# Load data
data = np.loadtxt('ex1data2.txt', delimiter=',', dtype=np.int64)
X = data[:, 0:2]
y = data[:, 2]
m = y.size
# Add intercept term to X
X = np.c_[np.ones(m), X]
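2.12 Normal equation function
As long as the number of features is not large, the normal equation is a good alternative way to compute theta: it needs no learning rate and no iterations. As a rule of thumb, when the number of features is below about ten thousand, the normal equation is usually preferred over gradient descent.
The normal equation is:
$$\theta = (X^T X)^{-1} X^T y$$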
import numpy as np

def normal_eqn(X, y):
    # Closed-form least-squares solution: theta = (X^T X)^-1 X^T y
    theta = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)
    return theta
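The explicit inverse above fails when X^T X is singular (for example, with duplicated or linearly dependent features). As a hedged alternative, the sketch below shows two variants that tolerate this case: one based on np.linalg.pinv and one based on np.linalg.lstsq. The function names normal_eqn_pinv and normal_eqn_lstsq and the demo data are assumptions for illustration and are not part of the original assignment code.

```python
import numpy as np

def normal_eqn_pinv(X, y):
    # Same formula as normal_eqn, with the pseudo-inverse instead of the inverse
    return np.linalg.pinv(X.T.dot(X)).dot(X.T).dot(y)

def normal_eqn_lstsq(X, y):
    # Let NumPy solve the least-squares problem directly, without forming X^T X
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

# Made-up, noise-free data generated from known parameters [10, 2, -1]
features = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]])
X_demo = np.c_[np.ones(5), features]
theta_true = np.array([10.0, 2.0, -1.0])
y_demo = X_demo.dot(theta_true)

theta_pinv = normal_eqn_pinv(X_demo, y_demo)
theta_lstsq = normal_eqn_lstsq(X_demo, y_demo)
print(theta_pinv)   # should be close to [10, 2, -1]
print(theta_lstsq)  # should be close to [10, 2, -1]
```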
Compute theta with the normal equation:
theta = normal_eqn(X, y)
# Display normal equation's result
print('Theta computed from the normal equations : \n{}'.format(theta))
Theta computed from the normal equations :
[89597.9095428 139.21067402 -8738.01911233]
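Predict the house price with the theta from the normal equation. The result is close to the prediction made with the theta from batch gradient descent: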
# Estimate the price of a 1650 sq-ft, 3 br house
# ===================== Your Code Here =====================
price = np.dot(np.array([1, 1650, 3]), theta.T)
# ==========================================================
print('Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : {:0.3f}'.format(price))
Predicted price of a 1650 sq-ft, 3 br house (using normal equations) : 293081.464
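The two theta vectors printed in this post look very different because gradient descent was run on normalized features while the normal equation was run on raw features, yet both describe the same kind of linear model. The sketch below illustrates this on made-up synthetic data (not ex1data2.txt): a theta fitted on normalized features can be converted back to raw-feature units by dividing each slope by its sigma and adjusting the intercept. For brevity, both fits use np.linalg.lstsq rather than the functions from the post.

```python
import numpy as np

# Synthetic data for illustration only: size (sq-ft), bedrooms, noisy price
rng = np.random.default_rng(0)
size = rng.uniform(800, 4000, 30)
beds = rng.integers(1, 6, 30).astype(float)
X_raw = np.c_[size, beds]
y = 90000 + 140 * size - 9000 * beds + rng.normal(0, 1000, 30)

mu, sigma = X_raw.mean(axis=0), X_raw.std(axis=0)
A_norm = np.c_[np.ones(30), (X_raw - mu) / sigma]  # normalized features + bias
A_raw = np.c_[np.ones(30), X_raw]                  # raw features + bias

theta_norm = np.linalg.lstsq(A_norm, y, rcond=None)[0]
theta_raw = np.linalg.lstsq(A_raw, y, rcond=None)[0]

# Convert the normalized-feature parameters back to raw-feature units
slopes = theta_norm[1:] / sigma
intercept = theta_norm[0] - slopes.dot(mu)
theta_converted = np.r_[intercept, slopes]

x_query = np.array([1650.0, 3.0])
pred_norm = np.r_[1.0, (x_query - mu) / sigma].dot(theta_norm)
pred_raw = np.r_[1.0, x_query].dot(theta_raw)
print(theta_converted, theta_raw)  # the two vectors should agree
print(pred_norm, pred_raw)         # the two predictions should agree
```

In the post itself the two predictions differ slightly, which would be consistent with gradient descent not having fully converged after 400 iterations.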