TensorFlow Getting Started: Recognizing the MNIST Handwritten Digit Dataset

I recently started looking into machine learning. Writing an entire model from scratch isn't realistic, so a framework it is. After some searching I picked Google's TensorFlow: it's reasonably beginner-friendly, and there is plenty of reference material online.
Reference tutorial: the official TensorFlow site
For the environment I use Python 3.6, and I strongly recommend installing Anaconda, a Python package manager that is very convenient: switching Python versions is simple, and library management is all done through a GUI.
Download the Anaconda installer from the official site; once installation finishes, open the Navigator, as shown below:
(screenshot: Anaconda Navigator)
Type tensorflow into the search box and install it directly.
For an IDE I use VS Code; just install the Python extension and it's ready to go.
Then on to the beginner tutorial.
First, the dataset has to be downloaded from the MNIST website. Here is the link:
MNIST dataset download page
train-images-idx3-ubyte.gz: training set images (handwritten digit pictures)
train-labels-idx1-ubyte.gz: training set labels (the answers corresponding to the images)
t10k-images-idx3-ubyte.gz: test set images
t10k-labels-idx1-ubyte.gz: test set labels

After downloading and decompressing, what you get are binary files in the IDX format, which can be unpacked with Python's struct module; we need code to parse them into arrays that TensorFlow can consume. As the tutorial shows, the image set has to become a [60000, 784] matrix, where each row of 784 values is one image: a 28*28 picture flattened into a one-dimensional array. That is the training input x, and the initial model is y = Wx + b, where W is the weight matrix, i.e. the thing the training loop keeps optimizing toward a best fit, and b is the bias. This is all introductory material, so I won't elaborate on it here.
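For orientation, the image file starts with four big-endian 32-bit integers (magic number, image count, row count, column count) followed by the raw pixel bytes, and the label file has only two header integers. A minimal sketch that decompresses one archive in memory and prints its header (the file path is just an example; point it at wherever you saved the download):

import gzip
import struct

# decompress one archive in memory and read its IDX header
with gzip.open("train-images-idx3-ubyte.gz", "rb") as f:
    buf = f.read()

magic, num_images, rows, cols = struct.unpack_from(">IIII", buf, 0)
print(magic, num_images, rows, cols)   # expect: 2051 60000 28 28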
Parsing the dataset is a bit fiddly, so here is the code directly:


import struct
import numpy as np


class DP:
    def read_train_image(self, filename):
        # image file header: magic number, image count, rows, columns (big-endian uint32)
        with open(filename, 'rb') as binfile:
            buf = binfile.read()
        index = 0
        magic, self.train_img_num, self.numRows, self.numColums = struct.unpack_from('>IIII', buf, index)
        self.train_img_list = np.zeros((self.train_img_num, 28 * 28))
        index += struct.calcsize('>IIII')
        # print(magic, ' ', self.train_img_num, ' ', self.numRows, ' ', self.numColums)

        for i in range(self.train_img_num):
            # each image is 784 unsigned bytes (28*28 pixels)
            im = struct.unpack_from('>784B', buf, index)
            index += struct.calcsize('>784B')
            im = np.array(im)
            im = im / 255          # scale pixel values to [0, 1]
            im = im.reshape(1, 28 * 28)
            self.train_img_list[i, :] = im
            # plt.imshow(im.reshape(28, 28), cmap='binary')
            # plt.show()

    def read_train_lable(self, filename):
        # label file header: magic number, label count (big-endian uint32)
        with open(filename, 'rb') as binfile:
            buf = binfile.read()
        index = 0
        magic, self.train_label_num = struct.unpack_from('>II', buf, index)
        self.train_label_list = np.zeros((self.train_label_num, 10))
        index += struct.calcsize('>II')
        # print(magic, self.train_label_num)
        for i in range(self.train_label_num):
            # one byte per label; convert it to a one-hot vector of length 10
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B', buf, index)
            index += struct.calcsize('>1B')
            lblTemp[lbl[0]] = 1
            self.train_label_list[i, :] = lblTemp

    def next_batch_image(self, batchCount):
        # pick a random start index so the slice of batchCount consecutive samples
        # never runs past the end of the training set
        rnd = np.random.randint(0, self.train_img_num - batchCount)
        return self.train_img_list[rnd:rnd + batchCount], self.train_label_list[rnd:rnd + batchCount]

    def read_test_image(self, filename):
        with open(filename, 'rb') as binfile:
            buf = binfile.read()
        index = 0
        magic, self.test_img_num, self.numRows, self.numColums = struct.unpack_from('>IIII', buf, index)
        self.test_img_list = np.zeros((self.test_img_num, 28 * 28))
        index += struct.calcsize('>IIII')

        for i in range(self.test_img_num):
            im = struct.unpack_from('>784B', buf, index)
            index += struct.calcsize('>784B')
            im = np.array(im)
            im = im / 255
            im = im.reshape(1, 28 * 28)
            self.test_img_list[i, :] = im

    def read_test_lable(self, filename):
        with open(filename, 'rb') as binfile:
            buf = binfile.read()
        index = 0
        magic, self.test_label_num = struct.unpack_from('>II', buf, index)
        self.test_label_list = np.zeros((self.test_label_num, 10))
        index += struct.calcsize('>II')
        for i in range(self.test_label_num):
            lblTemp = np.zeros(10)
            lbl = struct.unpack_from('>1B', buf, index)
            index += struct.calcsize('>1B')
            lblTemp[lbl[0]] = 1
            self.test_label_list[i, :] = lblTemp

The four read_* methods above each parse one of the four dataset files into flattened vectors. Because the sample count is limited and the model is simple, the tutorial uses stochastic training (stochastic gradient descent), which is why there is a next_batch_image method that returns 100 samples starting at a random offset, i.e. 100 consecutive samples. This could also be improved to draw non-consecutive samples, as sketched below.
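A minimal sketch of that improvement, as a hypothetical drop-in replacement for the method in the DP class above (assuming the same train_img_list / train_label_list attributes): use np.random.choice to draw batchCount distinct random rows instead of one contiguous slice.

    def next_batch_image(self, batchCount):
        # hypothetical variant: batchCount distinct random rows instead of a contiguous slice
        idx = np.random.choice(self.train_img_num, batchCount, replace=False)
        return self.train_img_list[idx], self.train_label_list[idx]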
Next comes the TensorFlow training code:



import tensorflow as tf


def tfOperate():
    # paths to the decompressed IDX files
    filename_t_image = "D:\\PY_Image\\handnum\\train-images.idx3-ubyte"
    filename_t_label = "D:\\PY_Image\\handnum\\train-labels.idx1-ubyte"
    filename_test_image = "D:\\PY_Image\\handnum\\t10k-images.idx3-ubyte"
    filename_test_label = "D:\\PY_Image\\handnum\\t10k-labels.idx1-ubyte"
    t = DP()
    t.read_train_image(filename_t_image)
    t.read_train_lable(filename_t_label)
    t.read_test_image(filename_test_image)
    t.read_test_lable(filename_test_label)

    # training images placeholder, shape [n, 784]
    x = tf.placeholder("float", [None, 784])
    # weights
    W = tf.Variable(tf.zeros([784, 10]))
    # bias
    b = tf.Variable(tf.zeros([10]))

    # model: softmax regression
    y = tf.nn.softmax(tf.matmul(x, W) + b)

    # cross-entropy cost / loss
    y_ = tf.placeholder("float", [None, 10])
    cross_entropy = -tf.reduce_sum(y_ * tf.log(y))

    # gradient descent, minimizing the cross-entropy
    train_step = tf.train.GradientDescentOptimizer(0.01).minimize(cross_entropy)

    # accuracy ops, built once outside the training loop
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    # init = tf.initialize_all_variables()   # deprecated pre-1.0 name
    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)

    for i in range(3000):
        batch_xs, batch_ys = t.next_batch_image(100)
        sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
        if i % 500 == 0 and i > 0:
            # evaluate on the test set every 500 steps
            print(sess.run(accuracy, feed_dict={x: t.test_img_list, y_: t.test_label_list}))

The accuracy ops at the end compare the model's predicted digit (argmax of y) with the true label (argmax of y_); evaluated on the test set every 500 steps, the accuracy comes out at roughly 0.91.
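As a quick follow-up, a minimal sketch of how the trained model could classify a single test image. It is assumed to sit inside tfOperate after the training loop (so sess, x, y, and t are in scope); index 0 is just an arbitrary example, and np is the numpy import from the parsing code.

    # classify the first test image: argmax over the 10 softmax outputs
    prediction = sess.run(tf.argmax(y, 1), feed_dict={x: t.test_img_list[0:1]})
    print("predicted:", prediction[0], "actual:", np.argmax(t.test_label_list[0]))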

One last note: to really understand these formulas you need at least introductory linear algebra, probability theory, and calculus.
For linear algebra, Professor Gilbert Strang's MIT open course is a good choice; NetEase Open Courses has a subtitled version, and the other two subjects have open courses there as well. Study whichever you need. This post is a bit rough; it's mainly just my notes.
The road is still long; I'll keep learning bit by bit.
Reference for the dataset parsing:

http://blog.csdn.net/supercally/article/details/54236658