
Building and Training Your Own CNN with TensorFlow for Simple CAPTCHA Recognition

程序员文章站 2024-02-02 12:33:52

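TensorFlow is one of the most popular deep learning frameworks. We can use it to build our own convolutional neural network (CNN) and train our own classifier. This article shows how to build a CNN with TensorFlow and how to train a classifier for simple CAPTCHA recognition. It assumes that you have already installed TensorFlow and know the basics of CNNs. The code uses the TensorFlow 1.x graph API (tf.placeholder, tf.Session).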

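The sections below describe, step by step, how to obtain the training data, how to build the convolutional neural network with TensorFlow, how to train it, and how to test the trained classifier.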

1. Prepare the training samples

We use the Python library captcha to generate the training samples we need. The code is as follows:

import sys
import os
import shutil
import random
import time
# captcha is the library that generates the CAPTCHA images; install it with: pip install captcha
from captcha.image import ImageCaptcha

# Character set used to generate the CAPTCHAs
char_set = ['0','1','2','3','4','5','6','7','8','9']
# Size of the character set
char_set_len = 10
# Length of a CAPTCHA: each one consists of 4 digits
captcha_len = 4

# Directory where the CAPTCHA images are stored
captcha_image_path = 'e:/tensorflow/captcha/images/'
# Directory for the images used to test the model; they form the test set
test_image_path = 'e:/tensorflow/captcha/test/'
# Number of images taken out of the generated images and put into the test set
test_image_number = 50

# Generate the CAPTCHA images; 4 decimal digits give 10000 possible CAPTCHAs
def generate_captcha_image(charset = char_set, charsetlen=char_set_len, captchaimgpath=captcha_image_path):
  k = 0
  total = 1
  for i in range(captcha_len):
    total *= charsetlen

  # Create the generator once; its default image size is 160x60
  image = ImageCaptcha()
  for i in range(charsetlen):
    for j in range(charsetlen):
      for m in range(charsetlen):
        for n in range(charsetlen):
          captcha_text = charset[i] + charset[j] + charset[m] + charset[n]
          # The file name is the CAPTCHA text, so it doubles as the label
          image.write(captcha_text, captchaimgpath + captcha_text + '.jpg')
          k += 1
          sys.stdout.write("\rcreating %d/%d" % (k, total))
          sys.stdout.flush()

# Move part of the generated images into the test set; these images are not used for training, only for testing the model
def prepare_test_set():
  # os.listdir returns file names only, without the directory part
  filenamelist = os.listdir(captcha_image_path)
  random.seed(time.time())
  random.shuffle(filenamelist)
  for i in range(test_image_number):
    name = filenamelist[i]
    shutil.move(captcha_image_path + name, test_image_path + name)

if __name__ == '__main__':
  # Make sure both output directories exist before writing or moving files
  os.makedirs(captcha_image_path, exist_ok=True)
  os.makedirs(test_image_path, exist_ok=True)
  generate_captcha_image(char_set, char_set_len, captcha_image_path)
  prepare_test_set()
  sys.stdout.write("\nfinished")
  sys.stdout.flush()

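Running the code above generates all 10,000 CAPTCHA images and then moves 50 randomly chosen ones into the test directory.

The generated CAPTCHA images look like this:

[Figure: samples of the generated CAPTCHA images]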
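The four nested loops in the generation script simply enumerate every 4-digit string, and the test set is a random sample of the resulting files. As a minimal sketch (not the author's code), the same enumeration and hold-out split could be written with itertools.product and random.sample. The helper names all_captcha_texts and split_test_set are hypothetical, and no images are written here.

```python
import itertools
import random

CHAR_SET = "0123456789"
CAPTCHA_LEN = 4

def all_captcha_texts(char_set=CHAR_SET, length=CAPTCHA_LEN):
    """Return every possible CAPTCHA text in lexicographic order."""
    return ["".join(chars) for chars in itertools.product(char_set, repeat=length)]

def split_test_set(names, test_count, seed=None):
    """Randomly hold out test_count names; return (remaining, held_out)."""
    rng = random.Random(seed)
    held_out = set(rng.sample(names, test_count))
    remaining = [n for n in names if n not in held_out]
    return remaining, sorted(held_out)

texts = all_captcha_texts()
train_names, test_names = split_test_set(texts, test_count=50, seed=0)
print(len(texts), len(train_names), len(test_names))  # 10000 9950 50
```

Passing a fixed seed makes the split reproducible, whereas the script above seeds with the current time and therefore produces a different test set on every run.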

2. Build the CNN and train the classifier

The code is as follows:

import tensorflow as tf
import numpy as np
from PIL import Image
import os
import random
import time

# Directory where the CAPTCHA images are stored
captcha_image_path = 'e:/tensorflow/captcha/images/'
# Width of a CAPTCHA image
captcha_image_widht = 160
# Height of a CAPTCHA image
captcha_image_height = 60

char_set_len = 10
captcha_len = 4

# 60% of the CAPTCHA images go into the training set
train_image_percent = 0.6
# Training set: file names of the images used for training
training_image_name = []
# Validation set: file names of the images used for validating the model
validation_image_name = []

# Directory where the trained model is saved
model_save_path = 'e:/tensorflow/captcha/models/'

def get_image_file_name(imgpath=captcha_image_path):
  filename = []
  total = 0
  for filepath in os.listdir(imgpath):
    captcha_name = filepath.split('/')[-1]
    filename.append(captcha_name)
    total += 1
  return filename, total

# Convert a CAPTCHA text into the label vector used for training; it has 40 dimensions
# For example, for the CAPTCHA '0296' the label is
# [1 0 0 0 0 0 0 0 0 0
# 0 0 1 0 0 0 0 0 0 0
# 0 0 0 0 0 0 0 0 0 1
# 0 0 0 0 0 0 1 0 0 0]
def name2label(name):
  label = np.zeros(captcha_len * char_set_len)
  for i, c in enumerate(name):
    idx = i*char_set_len + ord(c) - ord('0')
    label[idx] = 1
  return label

# Load the pixel data of a CAPTCHA image together with its label
def get_data_and_label(filename, filepath=captcha_image_path):
  pathname = os.path.join(filepath, filename)
  img = Image.open(pathname)
  # Convert to grayscale
  img = img.convert("L")
  image_array = np.array(img)
  # Flatten and scale the pixel values to [0, 1]
  image_data = image_array.flatten()/255
  # The first 4 characters of the file name are the CAPTCHA text
  image_label = name2label(filename[0:captcha_len])
  return image_data, image_label

# Build one batch; indices wrap around when the end of the list is reached
def get_next_batch(batchsize=32, trainortest='train', step=0):
  batch_data = np.zeros([batchsize, captcha_image_widht*captcha_image_height])
  batch_label = np.zeros([batchsize, captcha_len * char_set_len])
  filenamelist = training_image_name
  if trainortest == 'validate':
    filenamelist = validation_image_name

  totalnumber = len(filenamelist)
  indexstart = step*batchsize
  for i in range(batchsize):
    index = (i + indexstart) % totalnumber
    name = filenamelist[index]
    img_data, img_label = get_data_and_label(name)
    batch_data[i, : ] = img_data
    batch_label[i, : ] = img_label

  return batch_data, batch_label
   
# Build the convolutional neural network and train it
# (uses the TensorFlow 1.x graph API: tf.placeholder, tf.Session)
def train_data_with_cnn():
  # Initialize weights
  def weight_variable(shape, name='weight'):
    init = tf.truncated_normal(shape, stddev=0.1)
    var = tf.Variable(initial_value=init, name=name)
    return var
  # Initialize biases
  def bias_variable(shape, name='bias'):
    init = tf.constant(0.1, shape=shape)
    var = tf.Variable(init, name=name)
    return var
  # Convolution
  def conv2d(x, w, name='conv2d'):
    return tf.nn.conv2d(x, w, strides=[1,1,1,1], padding='SAME', name=name)
  # Pooling
  def max_pool_2x2(x, name='maxpool'):
    return tf.nn.max_pool(x, ksize=[1,2,2,1], strides=[1,2,2,1], padding='SAME', name=name)

  # Input layer
  # Note the name of x: it is needed when testing the model
  x = tf.placeholder(tf.float32, [None, captcha_image_widht * captcha_image_height], name='data-input')
  y = tf.placeholder(tf.float32, [None, captcha_len * char_set_len], name='label-input')
  x_input = tf.reshape(x, [-1, captcha_image_height, captcha_image_widht, 1], name='x-input')
  # Dropout, to reduce overfitting
  # Note the name of keep_prob: it is needed when testing the model
  keep_prob = tf.placeholder(tf.float32, name='keep-prob')
  # First convolutional layer
  w_conv1 = weight_variable([5,5,1,32], 'w_conv1')
  b_conv1 = bias_variable([32], 'b_conv1')
  conv1 = tf.nn.relu(conv2d(x_input, w_conv1, 'conv1') + b_conv1)
  conv1 = max_pool_2x2(conv1, 'conv1-pool')
  conv1 = tf.nn.dropout(conv1, keep_prob)
  # Second convolutional layer
  w_conv2 = weight_variable([5,5,32,64], 'w_conv2')
  b_conv2 = bias_variable([64], 'b_conv2')
  conv2 = tf.nn.relu(conv2d(conv1, w_conv2,'conv2') + b_conv2)
  conv2 = max_pool_2x2(conv2, 'conv2-pool')
  conv2 = tf.nn.dropout(conv2, keep_prob)
  # Third convolutional layer
  w_conv3 = weight_variable([5,5,64,64], 'w_conv3')
  b_conv3 = bias_variable([64], 'b_conv3')
  conv3 = tf.nn.relu(conv2d(conv2, w_conv3, 'conv3') + b_conv3)
  conv3 = max_pool_2x2(conv3, 'conv3-pool')
  conv3 = tf.nn.dropout(conv3, keep_prob)
  # Fully connected layer
  # Each pooling halves the width and height (rounding up with 'SAME' padding),
  # so after three poolings: 160x60 -> 80x30 -> 40x15 -> 20x8
  w_fc1 = weight_variable([20*8*64, 1024], 'w_fc1')
  b_fc1 = bias_variable([1024], 'b_fc1')
  fc1 = tf.reshape(conv3, [-1, 20*8*64])
  fc1 = tf.nn.relu(tf.add(tf.matmul(fc1, w_fc1), b_fc1))
  fc1 = tf.nn.dropout(fc1, keep_prob)
  # Output layer
  w_fc2 = weight_variable([1024, captcha_len * char_set_len], 'w_fc2')
  b_fc2 = bias_variable([captcha_len * char_set_len], 'b_fc2')
  output = tf.add(tf.matmul(fc1, w_fc2), b_fc2, 'output')

  loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=output))
  optimizer = tf.train.AdamOptimizer(0.001).minimize(loss)

  predict = tf.reshape(output, [-1, captcha_len, char_set_len], name='predict')
  labels = tf.reshape(y, [-1, captcha_len, char_set_len], name='labels')
  # Prediction
  # Note the name of predict_max_idx: it is needed when testing the model
  predict_max_idx = tf.argmax(predict, axis=2, name='predict_max_idx')
  labels_max_idx = tf.argmax(labels, axis=2, name='labels_max_idx')
  predict_correct_vec = tf.equal(predict_max_idx, labels_max_idx)
  # Accuracy is averaged over individual digits, not over whole CAPTCHAs
  accuracy = tf.reduce_mean(tf.cast(predict_correct_vec, tf.float32))

  saver = tf.train.Saver()
  with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    steps = 0
    # Each iteration trains on one batch; at most 6000 batches
    for epoch in range(6000):
      train_data, train_label = get_next_batch(64, 'train', steps)
      sess.run(optimizer, feed_dict={x : train_data, y : train_label, keep_prob:0.75})
      if steps % 100 == 0:
        test_data, test_label = get_next_batch(100, 'validate', steps)
        acc = sess.run(accuracy, feed_dict={x : test_data, y : test_label, keep_prob:1.0})
        print("steps=%d, accuracy=%f" % (steps, acc))
        # The model is saved only once validation accuracy exceeds 99%; training then stops
        if acc > 0.99:
          saver.save(sess, model_save_path+"crack_captcha.model", global_step=steps)
          break
      steps += 1
 
if __name__ == '__main__':
  image_filename_list, total = get_image_file_name(captcha_image_path)
  random.seed(time.time())
  # Shuffle the file names
  random.shuffle(image_filename_list)
  trainimagenumber = int(total * train_image_percent)
  # Split into the training set
  training_image_name = image_filename_list[ : trainimagenumber]
  # and the validation set
  validation_image_name = image_filename_list[trainimagenumber : ]
  # Make sure the directory for the saved model exists
  os.makedirs(model_save_path, exist_ok=True)
  train_data_with_cnn()
  print('training finished')
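The label layout described in the comments of name2label can be checked without TensorFlow or NumPy. The sketch below is a plain-Python illustration, assuming the same 4-digit, 10-class layout. The names name_to_label and label_to_name are hypothetical helpers, not part of the training script. It encodes '0296' and decodes it again by taking the arg-max of each group of ten values, which is what tf.argmax(..., axis=2) does in the network.

```python
CHAR_SET_LEN = 10
CAPTCHA_LEN = 4

def name_to_label(name):
    """One-hot encode a digit string into a flat list of length 40."""
    label = [0] * (CAPTCHA_LEN * CHAR_SET_LEN)
    for i, c in enumerate(name):
        label[i * CHAR_SET_LEN + int(c)] = 1
    return label

def label_to_name(label):
    """Invert name_to_label by taking the arg-max of each 10-wide group."""
    digits = []
    for i in range(CAPTCHA_LEN):
        group = label[i * CHAR_SET_LEN:(i + 1) * CHAR_SET_LEN]
        digits.append(str(group.index(max(group))))
    return "".join(digits)

label = name_to_label("0296")
hot_indices = [i for i, v in enumerate(label) if v == 1]
print(hot_indices)           # [0, 12, 29, 36]
print(label_to_name(label))  # 0296
```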

Run the code above to start training. Training takes some time and is slower without a GPU.
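The constant 20*8*64 in the fully connected layer follows from the pooling arithmetic: with 'SAME' padding and stride 2, each pooling step maps a size n to ceil(n / 2). The short sketch below reproduces that calculation for the 160x60 input. The function name size_after_pooling is an illustrative assumption.

```python
import math

def size_after_pooling(size, num_pools, stride=2):
    """Spatial size after num_pools strided poolings with 'SAME' padding."""
    for _ in range(num_pools):
        size = math.ceil(size / stride)
    return size

height = size_after_pooling(60, 3)   # 60 -> 30 -> 15 -> 8
width = size_after_pooling(160, 3)   # 160 -> 80 -> 40 -> 20
flat_size = width * height * 64      # 64 channels after the third conv layer
print(height, width, flat_size)      # 8 20 10240
```

If the image size or the number of pooling layers changes, the first dimension of w_fc1 and the reshape before it would need to change accordingly.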

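When training finishes, the output looks like the following. After 4100 iterations, the trained classifier reaches 99.5% accuracy on the validation set. Note that the accuracy computed in the code is per digit, not per whole CAPTCHA.

[Figure: training output]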

The generated model files are shown below; they are needed when testing the model.

[Figure: generated model files]
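The test script in step 3 loads a hard-coded file name, crack_captcha.model-4100.meta, which only exists if training happened to stop at step 4100. As a hedged sketch, one way to avoid the hard-coded step is to pick the .meta file with the highest step number from the model directory. The helper latest_meta_file is hypothetical and relies only on the `<prefix>-<global_step>.meta` naming seen above. The demonstration uses empty placeholder files in a temporary directory rather than real checkpoints.

```python
import os
import re
import tempfile

def latest_meta_file(model_dir, prefix="crack_captcha.model"):
    """Return the path of the .meta file with the highest global step, or None."""
    pattern = re.compile(re.escape(prefix) + r"-(\d+)\.meta$")
    best_step, best_name = -1, None
    for name in os.listdir(model_dir):
        match = pattern.match(name)
        if match and int(match.group(1)) > best_step:
            best_step, best_name = int(match.group(1)), name
    return None if best_name is None else os.path.join(model_dir, best_name)

# Demonstration with empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    missing = latest_meta_file(tmp)
    for step in (900, 4100, 2300):
        open(os.path.join(tmp, "crack_captcha.model-%d.meta" % step), "w").close()
    found = os.path.basename(latest_meta_file(tmp))
    print(found)  # crack_captcha.model-4100.meta
```

The returned path could then be passed to tf.train.import_meta_graph in place of the literal file name.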

3. Test the model

Write code to test the trained model:

import tensorflow as tf
import numpy as np
from PIL import Image
import os
import matplotlib.pyplot as plt

captcha_len = 4

model_save_path = 'e:/tensorflow/captcha/models/'
test_image_path = 'e:/tensorflow/captcha/test/'

def get_image_data_and_name(filename, filepath=test_image_path):
  pathname = os.path.join(filepath, filename)
  img = Image.open(pathname)
  # Convert to grayscale
  img = img.convert("L")
  image_array = np.array(img)
  # Same preprocessing as in training: flatten and scale to [0, 1]
  image_data = image_array.flatten()/255
  image_name = filename[0:captcha_len]
  return image_data, image_name

# Convert a digit string such as '0296' into an array of integers
def digitalstr2array(digitalstr):
  digitallist = []
  for c in digitalstr:
    digitallist.append(ord(c) - ord('0'))
  return np.array(digitallist)

def model_test():
  namelist = []
  for pathname in os.listdir(test_image_path):
    namelist.append(pathname.split('/')[-1])
  totalnumber = len(namelist)
  # Load the graph; 4100 is the step at which the model was saved in the run described above
  saver = tf.train.import_meta_graph(model_save_path+"crack_captcha.model-4100.meta")
  graph = tf.get_default_graph()
  # Fetch tensors from the graph; their names were defined when the graph was built (see the code in step 2)
  input_holder = graph.get_tensor_by_name("data-input:0")
  keep_prob_holder = graph.get_tensor_by_name("keep-prob:0")
  predict_max_idx = graph.get_tensor_by_name("predict_max_idx:0")
  with tf.Session() as sess:
    saver.restore(sess, tf.train.latest_checkpoint(model_save_path))
    count = 0
    for filename in namelist:
      img_data, img_name = get_image_data_and_name(filename, test_image_path)
      # Dropout is disabled at test time (keep_prob = 1.0)
      predict = sess.run(predict_max_idx, feed_dict={input_holder:[img_data], keep_prob_holder : 1.0})
      filepathname = test_image_path + filename
      print(filepathname)
      img = Image.open(filepathname)
      plt.imshow(img)
      plt.axis('off')
      plt.show()
      predictvalue = np.squeeze(predict)
      rightvalue = digitalstr2array(img_name)
      # A CAPTCHA counts as correct only if all 4 digits match
      if np.array_equal(predictvalue, rightvalue):
        result = 'correct'
        count += 1
      else:
        result = 'wrong'
      print('actual: {}, predicted: {}, result: {}'.format(rightvalue, predictvalue, result))
      print('\n')

    print('accuracy: %.2f%%(%d/%d)' % (count*100/totalnumber, count, totalnumber))

if __name__ == '__main__':
  model_test()

The test results are shown below: recognition accuracy on the test set is 94%.

[Figure: test output]
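The 99.5% validation figure and the 94% test figure are not measured the same way: the training script averages correctness over individual digits, while the test script counts a CAPTCHA as correct only when all four digits match. The sketch below computes both metrics on a few made-up predictions. The sample data and the function names are purely illustrative.

```python
def per_character_accuracy(predicted, actual):
    """Fraction of individual characters that match."""
    pairs = [(p, a) for ps, as_ in zip(predicted, actual) for p, a in zip(ps, as_)]
    return sum(p == a for p, a in pairs) / len(pairs)

def whole_captcha_accuracy(predicted, actual):
    """Fraction of CAPTCHAs in which every character matches."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)

# Illustrative data: one wrong digit in the last CAPTCHA.
actual = ["0296", "1234", "5678", "9012"]
predicted = ["0296", "1234", "5678", "9017"]
char_acc = per_character_accuracy(predicted, actual)   # 15 of 16 digits
whole_acc = whole_captcha_accuracy(predicted, actual)  # 3 of 4 CAPTCHAs
print(char_acc, whole_acc)  # 0.9375 0.75
```

Under the simplifying assumption that digit errors are independent, a per-digit accuracy of 0.995 corresponds to roughly 0.995^4 ≈ 0.98 whole-CAPTCHA accuracy, so some gap between the two numbers is expected. The small test set (50 images) may account for part of the remaining difference.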

Below are two CAPTCHAs that were recognized incorrectly:

[Figure: two misrecognized CAPTCHAs]
