Artificial Intelligence - AutoEncoder

Programmer Article Station (程序员文章站), 2024-03-14 21:17:35


Feel free to follow me on GitHub.

Original post: http://blog.csdn.net/caroline_wendy/article/details/77639573

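An autoencoder reconstructs its own input by recombining sparse, high-level features, so the target output is the input itself.

The implementation uses TensorFlow and follows the reference source code; the model files under autoencoder_models are copied from it.

The source code for this article is on GitHub.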


Project setup

Install the Python dependencies: scikit-learn==0.19.0, scipy==0.19.1, sklearn==0.0

scipy

If installing scipy fails with the hash-mismatch error below, add scipy==0.19.1 to requirements.txt and install from that file:

THESE PACKAGES DO NOT MATCH THE HASHES FROM THE REQUIREMENTS FILE. If you have updated the package versions, please update the hashes. Otherwise, examine the package contents carefully; someone may have tampered with them.
    scipy from http://mirrors.aliyun.com/pypi/packages/63/68/c5098f3b6034e69d187e3f2e989f462143d9f8b524f5a4f9e13c4a6f5f47/scipy-0.19.1-cp27-cp27m-macosx_10_6_intel.macosx_10_9_intel.macosx_10_9_x86_64.macosx_10_10_intel.macosx_10_10_x86_64.whl#md5=72415e8da753eea97eb9820602931cb5:
        Expected md5 72415e8da753eea97eb9820602931cb5
             Got        073584eb2c597bbfb82a5865b7055787

Alternatively, list all the dependencies in requirements.txt and install them in one step:

pip install -r requirements.txt

matplotlib

Install matplotlib:

pip install matplotlib -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com

If matplotlib then raises the following error:

RuntimeError: Python is not installed as a framework. The Mac OS X backend will not be able to function correctly if Python is not installed as a framework. See the Python documentation for more information on installing Python as a framework on Mac OS X. Please either reinstall Python as a framework, or try one of the other backends. If you are using (Ana)Conda please install python.app and replace the use of 'python' with 'pythonw'. See 'Working with Matplotlib on OSX' in the Matplotlib FAQ for more information.

run these shell commands:

cd ~/.matplotlib
touch matplotlibrc

Import matplotlib, selecting the TkAgg backend before pyplot is imported:

import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt

opencv

The import name for OpenCV is cv2, while the package to install is opencv-python:

sudo pip install opencv-python -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com

With a plain import cv2, IDE autocompletion does not work. With the opencv-python build used here, import it like this instead:

import cv2.cv2 as cv2

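Saving images

Load the MNIST image source: test is the test set, train is the training set, images holds the images, and labels holds the labels. images is an ndarray with 784 values per image; labels is also an ndarray, one-hot encoded.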

# Load the data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
images = mnist.test.images  # images
labels = mnist.test.labels  # labels

Reshape each 784-element vector into a 28x28 image, convert the one-hot label to a digit (0~9), and save the first 100 images of the test set.

# Save the images
size = min(len(labels), 100)  # only the first 100 test images
for i in range(size):
    pxl = np.array(images[i])  # pixels
    img = pxl.reshape((28, 28))  # image
    lbl = np.argmax(labels[i])  # label
    misc.imsave('./IMAGE_data/test/' + str(i) + '_' + str(lbl) + '.png', img)  # save via scipy
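The reshape and the label conversion can be checked without downloading MNIST. A minimal sketch, assuming a synthetic 784-element vector and a hand-written one-hot label (both made up for illustration):

```python
import numpy as np

# Synthetic stand-ins for one MNIST sample (assumed data, not real MNIST).
pixels = np.arange(784, dtype=np.float32) / 783.0   # flat vector, values in [0, 1]
one_hot = np.zeros(10, dtype=np.float32)
one_hot[7] = 1.0                                     # one-hot encoding of the digit 7

image = pixels.reshape((28, 28))   # row-major: the first 28 values become row 0
digit = int(np.argmax(one_hot))    # the index of the 1 is the digit

print(image.shape, digit)
```

Because reshape is row-major, image[1, 0] is the 29th value of the flat vector, which is why the saved PNG shows the digit upright.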

Merge the 100 images into a single image to make comparison easier.

# Merge the images
large_size = 28 * 10
large_img = Image.new('RGBA', (large_size, large_size))
paths_list, _, __ = listdir_files('./IMAGE_data/test/')  # project helper, not shown here
for i in range(100):
    img = Image.open(paths_list[i])
    loc = ((i // 10) * 28, (i % 10) * 28)  # (x, y) offset of tile i
    large_img.paste(img, loc)
large_img.save('./IMAGE_data/merged.png')
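The tile layout used by the paste loop can be reproduced with plain NumPy. This is only a sketch of the same offsets; merge_grid and the constant-valued tiles are hypothetical names and data introduced for illustration, not part of the article's code.

```python
import numpy as np

def merge_grid(tiles, rows=10, cols=10):
    """Place tiles[i] at column i // rows, row i % rows (column-major order)."""
    h, w = tiles[0].shape
    grid = np.zeros((rows * h, cols * w), dtype=tiles[0].dtype)
    for i, tile in enumerate(tiles[:rows * cols]):
        x, y = (i // rows) * w, (i % rows) * h   # same (x, y) offsets as the paste() loop
        grid[y:y + h, x:x + w] = tile
    return grid

# Made-up input: tile i is filled with the constant value i.
tiles = [np.full((28, 28), i, dtype=np.float32) for i in range(100)]
grid = merge_grid(tiles)
print(grid.shape)
```

Since the first component of the offset is x, consecutive images run down a column before moving one column to the right.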

Three ways to save an image: scipy, matplotlib (which includes the axes), and opencv.

# Other ways to save an image
pixel = np.array(images[0])  # 784-element vector
label = np.argmax(labels[0])  # find the label
image = pixel.reshape((28, 28))  # reshape into a 28x28 matrix

# -------------------- scipy -------------------- #
misc.imsave('./IMAGE_data/scipy.png', image)  # save via scipy
# -------------------- scipy -------------------- #

# -------------------- matplotlib -------------------- #
plt.gray()  # use the gray colormap
plt.imshow(image)
plt.savefig("./IMAGE_data/plt.png")
# plt.show()
# -------------------- matplotlib -------------------- #

# -------------------- opencv -------------------- #
image = image * 255  # the data are floats in 0~1, scale them to 0~255
cv2.imwrite("./IMAGE_data/opencv.png", image)
# cv2.imshow('hah', image)
# cv2.waitKey(0)
# -------------------- opencv -------------------- #
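The opencv branch multiplies by 255 because the pixels are floats in 0~1. A minimal sketch of making that conversion explicit before writing an 8-bit image; to_uint8 is a hypothetical helper and the sample values are made up:

```python
import numpy as np

def to_uint8(image01):
    """Scale a float image in [0, 1] to 8-bit grey levels in [0, 255]."""
    scaled = np.clip(image01, 0.0, 1.0) * 255.0   # clip guards against out-of-range values
    return np.rint(scaled).astype(np.uint8)

image = np.array([[0.0, 0.5], [1.0, 1.2]])  # made-up values; 1.2 is deliberately out of range
print(to_uint8(image))
```

The resulting array could then be passed to cv2.imwrite in place of the float image.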

Autoencoder

Load the MNIST data:

mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Standardize the training and test data:

X_train, X_test = standard_scale(mnist.train.images, mnist.test.images)

The mean and standard deviation are computed from the training data only and then applied to both the training and the test data.

def standard_scale(X_train, X_test):
    preprocessor = prep.StandardScaler().fit(X_train)
    X_train = preprocessor.transform(X_train)
    X_test = preprocessor.transform(X_test)
    return X_train, X_test

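In StandardScaler, mean_ holds the per-feature means and scale_ holds the per-feature standard deviations; both have the same dimensionality as an image. Every value has the corresponding mean subtracted and is then divided by the corresponding standard deviation.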

self.scale_ = _handle_zeros_in_scale(np.sqrt(self.var_))
X -= self.mean_
X /= self.scale_
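The same computation can be written directly in NumPy. This is a sketch of the idea rather than scikit-learn's implementation; standard_scale_np and the toy arrays are hypothetical.

```python
import numpy as np

def standard_scale_np(X_train, X_test):
    """Standardize both sets with statistics taken from X_train only."""
    mean = X_train.mean(axis=0)
    scale = X_train.std(axis=0)        # population standard deviation (ddof=0)
    scale[scale == 0.0] = 1.0          # constant features: avoid division by zero
    return (X_train - mean) / scale, (X_test - mean) / scale

# Made-up data: the second feature is constant in the training set.
X_train = np.array([[1.0, 5.0], [3.0, 5.0]])
X_test = np.array([[2.0, 6.0]])
train_s, test_s = standard_scale_np(X_train, X_test)
print(train_s)
print(test_s)
```

Replacing a zero standard deviation with 1 matters for MNIST, where border pixels are 0 in every training image.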

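Set the training parameters: n_samples is the total number of samples, training_epochs is the number of epochs, batch_size is the number of samples per batch, and display_step is how often (in epochs) the cost is printed.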

n_samples = int(mnist.train.num_examples)
training_epochs = 20
batch_size = 128
display_step = 1

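AdditiveGaussianNoiseAutoencoder (AGN for short) is an autoencoder with additive Gaussian noise. n_input is the number of input nodes, equal to the image dimensionality (784); n_hidden is the number of hidden nodes, which must be smaller than the number of input nodes (200 here); transfer_function is the activation function, tf.nn.softplus; optimizer is AdamOptimizer with a learning rate of 0.001; scale is the noise coefficient, 0.01.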

autoencoder = AdditiveGaussianNoiseAutoencoder(
    n_input=784, n_hidden=200, transfer_function=tf.nn.softplus,
    optimizer=tf.train.AdamOptimizer(learning_rate=0.001), scale=0.01)

The softplus activation function works as follows:

mat = [1., 2., 3.]  # must be floats
# softplus: [ln(e^1 + 1), ln(e^2 + 1), ln(e^3 + 1)]
print(tf.Session().run(tf.nn.softplus(mat)))
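For a quick check without a TensorFlow session, the same formula can be evaluated in NumPy. A minimal sketch; the softplus helper below is written for illustration and uses np.logaddexp so that large inputs do not overflow.

```python
import numpy as np

def softplus(x):
    """softplus(x) = ln(1 + e^x), computed stably as logaddexp(0, x)."""
    return np.logaddexp(0.0, x)

values = softplus(np.array([1.0, 2.0, 3.0]))
print(values)
```

softplus is a smooth approximation of ReLU: for large positive x it approaches x, and for large negative x it approaches 0.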

random_normal generates a tensor of normally distributed random values:

rn = tf.random_normal((100000,))  # 1-D tensor; pass seed=... for reproducible values
mean, variance = tf.nn.moments(rn, 0)  # mean and variance, expected to be about 0 and 1
print(tf.Session().run([mean, variance]))
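The same sanity check in NumPy, as a sketch with an arbitrarily chosen fixed seed so that the result is repeatable:

```python
import numpy as np

rng = np.random.RandomState(0)          # fixed seed, chosen arbitrarily
samples = rng.standard_normal(100000)   # 1-D array of standard normal draws
mean, variance = samples.mean(), samples.var()
print(mean, variance)
```

With 100000 draws the sample mean and variance should land within a few hundredths of 0 and 1.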

The constructor of AdditiveGaussianNoiseAutoencoder:

def __init__(self, n_input, n_hidden, transfer_function=tf.nn.softplus, optimizer=tf.train.AdamOptimizer(),
             scale=0.1):
    self.n_input = n_input  # number of input nodes
    self.n_hidden = n_hidden  # number of hidden nodes, fewer than the input nodes
    self.transfer = transfer_function  # activation function
    self.scale = tf.placeholder(tf.float32)  # noise coefficient placeholder (fed, not trained); fed with training_scale
    self.training_scale = scale  # Gaussian noise coefficient
    network_weights = self._initialize_weights()  # initialize the weights: encoder w1/b1, decoder w2/b2
    self.weights = network_weights  # weights

    # model
    self.x = tf.placeholder(tf.float32, [None, self.n_input])  # input data to feed
    self.hidden = self.transfer(tf.add(tf.matmul(self.x + self.scale * tf.random_normal((n_input,)),
                                                 self.weights['w1']),
                                       self.weights['b1']))
    self.reconstruction = tf.add(tf.matmul(self.hidden, self.weights['w2']), self.weights['b2'])

    # cost: 0.5 * sum((reconstruction - x)^2)
    self.cost = 0.5 * tf.reduce_sum(tf.pow(tf.subtract(self.reconstruction, self.x), 2.0))
    self.optimizer = optimizer.minimize(self.cost)

    init = tf.global_variables_initializer()
    self.sess = tf.Session()
    self.sess.run(init)  # run the variable initializer
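The graph built in the constructor amounts to a noisy encoder, a linear decoder, and a sum-of-squares cost. A minimal NumPy sketch of one forward pass, assuming toy layer sizes and random data (forward and all the values below are made up for illustration, and no training step is included):

```python
import numpy as np

def forward(x, w1, b1, w2, b2, scale, rng):
    """Noisy encode, linear decode, and the 0.5 * sum-of-squares cost."""
    noise = rng.standard_normal(x.shape[1])                     # one noise vector, broadcast over the batch
    hidden = np.logaddexp(0.0, (x + scale * noise) @ w1 + b1)   # softplus activation
    reconstruction = hidden @ w2 + b2
    cost = 0.5 * np.sum((reconstruction - x) ** 2)
    return hidden, reconstruction, cost

rng = np.random.RandomState(0)
n_input, n_hidden = 6, 3                      # toy sizes instead of 784 and 200
x = rng.standard_normal((4, n_input))         # made-up batch of 4 samples
w1 = rng.standard_normal((n_input, n_hidden)) * 0.1
b1 = np.zeros(n_hidden)
w2 = np.zeros((n_hidden, n_input))            # zero-initialized decoder weights
b2 = np.zeros(n_input)

hidden, reconstruction, cost = forward(x, w1, b1, w2, b2, scale=0.01, rng=rng)
print(hidden.shape, reconstruction.shape, cost)
```

With the decoder weights still at zero the reconstruction is all zeros, so the initial cost equals 0.5 times the sum of the squared inputs. Note that the cost compares the reconstruction with the clean input x, not the noisy one.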

tf.random_normal((n_input,)) draws a 1-D tensor of n_input values (not an n_input x 1 matrix) with mean 0 and variance 1, so one noise vector is broadcast across every row of the batch. tf.nn.moments returns the mean and the variance, as in the check shown earlier.

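The weights are initialized in two layers: the first maps the n_input-dimensional data to n_hidden dimensions, and the second maps it back. w1 is initialized with xavier_initializer (Xavier initialization), which gives the weights a mean of 0 and a variance of 2/(n_input+n_hidden); b1, w2 and b2 start at zero.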

def _initialize_weights(self):
    all_weights = dict()
    # initialize w1 with xavier_initializer
    all_weights['w1'] = tf.get_variable("w1", shape=[self.n_input, self.n_hidden],
                                        initializer=tf.contrib.layers.xavier_initializer())
    all_weights['b1'] = tf.Variable(tf.zeros([self.n_hidden], dtype=tf.float32))
    all_weights['w2'] = tf.Variable(tf.zeros([self.n_hidden, self.n_input], dtype=tf.float32))
    all_weights['b2'] = tf.Variable(tf.zeros([self.n_input], dtype=tf.float32))
    return all_weights
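The mean and variance of Xavier-initialized weights can be verified numerically. A sketch in NumPy, assuming the uniform variant of Xavier initialization; xavier_uniform is a hypothetical helper written for this check, not the TensorFlow function.

```python
import numpy as np

def xavier_uniform(n_in, n_out, rng):
    """Uniform Xavier/Glorot init: U(-limit, limit), limit = sqrt(6 / (n_in + n_out))."""
    limit = np.sqrt(6.0 / (n_in + n_out))
    return rng.uniform(-limit, limit, size=(n_in, n_out))

rng = np.random.RandomState(0)          # arbitrary seed
w1 = xavier_uniform(784, 200, rng)
expected_var = 2.0 / (784 + 200)        # variance of U(-limit, limit) is limit^2 / 3
print(w1.mean(), w1.var(), expected_var)
```

A uniform distribution on (-limit, limit) has variance limit^2 / 3, which is where 2/(n_in + n_out) comes from.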

Train the model, printing the average cost (avg_cost) for each epoch:

for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(n_samples / batch_size)
    # Loop over all batches
    for i in range(total_batch):
        batch_xs = get_random_block_from_data(X_train, batch_size)

        # Fit training using batch data
        cost = autoencoder.partial_fit(batch_xs)
        # Compute average loss
        avg_cost += cost / n_samples * batch_size

    # Display logs per epoch step
    if epoch % display_step == 0:
        print("Epoch:", '%04d' % (epoch + 1), "cost=", "{:.9f}".format(avg_cost))

print("Total cost: " + str(autoencoder.calc_total_cost(X_test)))
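The weighting cost / n_samples * batch_size in the loop is close to dividing each batch cost by the number of batches. A small arithmetic sketch, assuming a training set of 55000 samples and an identical made-up cost for every batch:

```python
n_samples, batch_size = 55000, 128        # 55000 is an assumed training-set size
total_batch = int(n_samples / batch_size)
batch_costs = [100.0] * total_batch       # made-up: every batch reports the same cost

avg_cost = 0.0
for cost in batch_costs:
    avg_cost += cost / n_samples * batch_size   # same accumulation as the training loop
print(total_batch, avg_cost)
```

Because n_samples is not a multiple of batch_size, the accumulated value is slightly below the plain mean of the batch costs.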

Pick a random start index and take a contiguous block of batch_size samples:

def get_random_block_from_data(data, batch_size):
    start_index = np.random.randint(0, len(data) - batch_size)  # random start of the block
    return data[start_index:(start_index + batch_size)]  # block of batch_size samples
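A quick way to see what this sampler returns is to run it on data whose rows are their own indices. A sketch with a made-up array and an arbitrary seed:

```python
import numpy as np

def get_random_block_from_data(data, batch_size):
    start_index = np.random.randint(0, len(data) - batch_size)  # upper bound is exclusive
    return data[start_index:(start_index + batch_size)]

np.random.seed(0)                       # arbitrary seed for a repeatable demo
data = np.arange(1000).reshape(-1, 1)   # made-up data: row i holds the value i
block = get_random_block_from_data(data, 128)
print(block[0, 0], block[-1, 0])
```

Each batch is a contiguous slice chosen independently of the previous ones, so within an epoch some samples can be seen several times and others not at all.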

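partial_fit on the autoencoder feeds one batch into the graph and runs a single optimization step; the Gaussian noise coefficient that is fed is training_scale, the value set in the constructor.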

def partial_fit(self, X):
    cost, opt = self.sess.run((self.cost, self.optimizer),
                              feed_dict={self.x: X, self.scale: self.training_scale})
    return cost

Finally, print the cost over the whole test set X_test:

print("Total cost: " + str(autoencoder.calc_total_cost(X_test)))

Original images (100):

[Image: original images]

Images reconstructed by the autoencoder (100):

[Image: autoencoder reconstructions]

OK, that's all! Enjoy it!
