Training Your Own Dataset with TensorFlow.NET in C#
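In this article I walk through the code for training a CNN model with TensorFlow.NET from SciSharp STACK. The model performs image classification. The code can be ported directly to run on either CPU or GPU, and can be used to train and run inference on your own local image dataset. TensorFlow.NET is a full implementation of TensorFlow built on .NET Standard and supports both .NET Framework and .NET Core, which makes it a strong machine learning framework option for .NET developers.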
SciSharp STACK: https://github.com/scisharp
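What is TensorFlow.NET?
TensorFlow.NET is a contribution of the SciSharp STACK open-source community, whose mission is to build a machine learning platform that belongs entirely to .NET developers. For C# developers in particular it aims to be a "zero learning cost" platform: it wraps a large number of APIs and low-level bindings so that TensorFlow's Python coding style and habits carry over to .NET almost unchanged. The figure below compares the syntax of the Python and C# implementations of the same TF task, which gives a sense of how close the two are.
TensorFlow.NET performs well on .NET and, combined with SciSharp modules such as NumSharp, SharpCV, Pandas.NET, Keras.NET, and Matplotlib.Net, can be used without any Python environment. It has been integrated into Microsoft's ML.NET as an underlying algorithm library, and Google recommends it to developers in the tutorials on the official TensorFlow website.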
- SciSharp product structure
- Integrated as an underlying algorithm library in Microsoft ML.NET
- Officially recommended by Google to .NET developers
url:
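Project Description
This article uses TensorFlow.NET to build a simple image classification model that performs single-character OCR on characters printed on parts at an industrial site. The original large images come from an industrial camera. OpenCV is used first for image preprocessing and character segmentation, the small single-character images are then fed to TF for inference, and the inference results are assembled in order into a complete string that is returned to the main program logic for the subsequent production-line steps.
In practice, if you need to train on your own images, you only need to replace the images in the training folders with your own, keeping the prescribed folder order. Both GPU and CPU are supported. The complete code for the project is on GitHub: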
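Model Introduction
The CNN model in this project consists mainly of 2 convolution + pooling layers and 1 fully connected hidden layer followed by the output layer, with the common ReLU activation function. It is a fairly shallow convolutional neural network. One of its hyperparameters, the learning rate, follows a custom schedule that decays during training, which is explained in detail later. For the shape of each layer, see the figure below: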
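Since the graph code later uses "SAME" padding for both convolution and pooling, the spatial size of each feature map depends only on the input size and the stride: the output length is ceil(input / stride). The sketch below works this out for the 64 × 64 input. The convolution stride of 1 and the filter count of 32 are assumptions chosen for illustration (the article does not list the values of stride1, stride2, or num_filters2), and the class and method names are hypothetical.

```csharp
using System;

public static class ShapeMath
{
    // Output length along one spatial axis with TensorFlow "SAME" padding:
    // ceil(input / stride), independent of the kernel size.
    public static int SameOut(int input, int stride) => (input + stride - 1) / stride;

    // Spatial size after two (convolution, max-pool) stages.
    public static int AfterTwoStages(int input, int convStride, int poolStride)
    {
        int size = input;
        for (int stage = 0; stage < 2; stage++)
        {
            size = SameOut(size, convStride); // convolution
            size = SameOut(size, poolStride); // max pooling
        }
        return size;
    }

    public static void Main()
    {
        // Assumed: conv stride 1; pool stride 2 as in the article's max_pool calls.
        int side = AfterTwoStages(64, convStride: 1, poolStride: 2); // 64 -> 32 -> 16
        int numFilters2 = 32; // assumed value, not taken from the article
        Console.WriteLine($"feature map: {side} x {side}, flattened: {side * side * numFilters2}");
    }
}
```

Under these assumptions the flatten layer would feed 16 × 16 × 32 = 8192 features into the fully connected layer; with different strides or filter counts the same function gives the corresponding size.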
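Dataset Description
To keep training fast while testing the model, the image dataset contains only a small subset of OCR characters (X, Y, Z). Its characteristics are:
- Number of classes: 3 classes [X/Y/Z]
- Image size: width 64 × height 64
- Image channels: 1 channel (grayscale)
- Dataset size:
  - train: X - 384 pcs; Y - 384 pcs; Z - 384 pcs
  - validation: X - 96 pcs; Y - 96 pcs; Z - 96 pcs
  - test: X - 96 pcs; Y - 96 pcs; Z - 96 pcs
- Other notes: the dataset has already been augmented with random flipping, translation, scaling, and mirroring
The overall dataset is shown in the figure below: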
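Code Walkthrough
Environment Setup
- .NET framework: .NET Framework 4.7.2 or later, or .NET Core 2.2 or later
- CPU configuration: Any CPU or x64
- GPU configuration: you need to set up CUDA and the environment variables yourself; CUDA v10.1 and cuDNN v7.5 are recommended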
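Library and Namespace References
- Install the required dependencies from NuGet, mainly the SciSharp libraries, as shown in the figure below:
Note: install the latest versions of the libraries where possible. For CV, use SciSharp's SharpCV so that variables can be passed around internally without conversion.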
    <PackageReference Include="Colorful.Console" Version="1.2.9" />
    <PackageReference Include="Newtonsoft.Json" Version="12.0.3" />
    <PackageReference Include="SciSharp.TensorFlow.Redist" Version="1.15.0" />
    <PackageReference Include="SciSharp.TensorFlowHub" Version="0.0.5" />
    <PackageReference Include="SharpCV" Version="0.2.0" />
    <PackageReference Include="SharpZipLib" Version="1.2.0" />
    <PackageReference Include="System.Drawing.Common" Version="4.7.0" />
    <PackageReference Include="TensorFlow.NET" Version="0.14.0" />
- Reference the namespaces, including NumSharp, Tensorflow, and SharpCV:
    using NumSharp;
    using NumSharp.Backends;
    using NumSharp.Backends.Unmanaged;
    using SharpCV;
    using System;
    using System.Collections;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;
    using System.Runtime.CompilerServices;
    using Tensorflow;
    using static Tensorflow.Binding;
    using static SharpCV.Binding;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;
Main Logic Structure
Main logic:
- Prepare the data
- Build the computation graph
- Train
- Predict
    public bool Run()
    {
        PrepareData();
        BuildGraph();

        using (var sess = tf.Session())
        {
            Train(sess);
            Test(sess);
        }

        TestDataOutput();

        // The run counts as successful only if test accuracy exceeds 98%
        return accuracy_test > 0.98;
    }
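Loading the Dataset
Downloading and Extracting the Dataset
- Dataset URL: https://github.com/scisharp/scisharp-stack-examples/blob/master/data/data_cnninyourowndata.zip
- Download and extraction code (for some of the wrapped helper methods, see the complete code on GitHub):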
    string url = "https://github.com/scisharp/scisharp-stack-examples/blob/master/data/data_cnninyourowndata.zip";
    Directory.CreateDirectory(name);
    Utility.Web.Download(url, name, "data_cnninyourowndata.zip");
    Utility.Compress.UnZip(name + "\\data_cnninyourowndata.zip", name);
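Creating the Dictionary
Read the names of the subfolders in the directory and use them as the class dictionary, which is convenient for the one-hot encoding later.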
    private void FillDictionaryLabel(string dirpath)
    {
        string[] str_dir = Directory.GetDirectories(dirpath, "*", SearchOption.TopDirectoryOnly);
        int str_dir_num = str_dir.Length;
        if (str_dir_num > 0)
        {
            dict_label = new Dictionary<Int64, string>();
            for (int i = 0; i < str_dir_num; i++)
            {
                // Strip the parent path so that only the subfolder (class) name remains
                string label = (str_dir[i].Replace(dirpath + "\\", "")).Split('\\').First();
                dict_label.Add(i, label);
                print(i.ToString() + " : " + label);
            }
            n_classes = dict_label.Count;
        }
    }
Reading and Shuffling the File Lists
Read the train, validation, and test file lists from the folders and shuffle them randomly.
- Read the directories
    arrayfilename_train = Directory.GetFiles(name + "\\train", "*.*", SearchOption.AllDirectories);
    arraylabel_train = GetLabelArray(arrayfilename_train);

    arrayfilename_validation = Directory.GetFiles(name + "\\validation", "*.*", SearchOption.AllDirectories);
    arraylabel_validation = GetLabelArray(arrayfilename_validation);

    arrayfilename_test = Directory.GetFiles(name + "\\test", "*.*", SearchOption.AllDirectories);
    arraylabel_test = GetLabelArray(arrayfilename_test);
- Get the labels
    private Int64[] GetLabelArray(string[] filesarray)
    {
        Int64[] arraylabel = new Int64[filesarray.Length];
        for (int i = 0; i < arraylabel.Length; i++)
        {
            string[] labels = filesarray[i].Split('\\');
            // The name of the parent folder is the class label
            string label = labels[labels.Length - 2];
            arraylabel[i] = dict_label.Single(k => k.Value == label).Key;
        }
        return arraylabel;
    }
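The label lookup above splits paths on a backslash and scans the dictionary for every file. As a hedged alternative, not the article's implementation, the sketch below derives the class name with System.IO.Path (which handles the platform's separator) and looks it up in a reverse dictionary keyed by class name. The class and method names are hypothetical, and sorting the class names to get stable indices is an added assumption, since the original relies on the order returned by Directory.GetDirectories.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LabelLookup
{
    // Map class name -> index. Names are sorted (ordinal) so that the
    // indices do not depend on file-system enumeration order.
    public static Dictionary<string, long> BuildIndex(IEnumerable<string> classNames)
    {
        var index = new Dictionary<string, long>();
        foreach (string className in classNames.OrderBy(n => n, StringComparer.Ordinal))
            index.Add(className, index.Count);
        return index;
    }

    // The class of an image is the name of the folder that contains it.
    public static long LabelOf(string filePath, Dictionary<string, long> index)
    {
        string className = Path.GetFileName(Path.GetDirectoryName(filePath));
        return index[className];
    }

    public static void Main()
    {
        var index = BuildIndex(new[] { "Z", "X", "Y" });
        // Hypothetical file path following the train/<class>/<image> layout.
        string file = Path.Combine("data", "train", "Y", "0001.jpg");
        Console.WriteLine(LabelOf(file, index)); // prints 1
    }
}
```

The reverse dictionary makes each lookup constant time, which matters little for about a thousand files but keeps the code simple for larger datasets.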
- Shuffle randomly
    public (string[], Int64[]) ShuffleArray(int count, string[] images, Int64[] labels)
    {
        ArrayList mylist = new ArrayList();
        string[] new_images = new string[count];
        Int64[] new_labels = new Int64[count];
        Random r = new Random();

        // Pool of the indices that have not been picked yet
        for (int i = 0; i < count; i++)
        {
            mylist.Add(i);
        }

        // Draw a random remaining index each time, so images and labels stay paired
        for (int i = 0; i < count; i++)
        {
            int rand = r.Next(mylist.Count);
            new_images[i] = images[(int)(mylist[rand])];
            new_labels[i] = labels[(int)(mylist[rand])];
            mylist.RemoveAt(rand);
        }
        print("shuffle array list: " + count.ToString());
        return (new_images, new_labels);
    }
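The shuffle above draws from a shrinking index list, which costs a list removal per element. A common alternative is the in-place Fisher-Yates shuffle, sketched below. This is a swapped-in technique rather than the article's code; the class name is hypothetical, and passing in the Random instance (so a seed can make runs repeatable) is an added design choice.

```csharp
using System;

public static class PairShuffle
{
    // In-place Fisher-Yates shuffle that applies the same swaps to both
    // arrays, so images[i] and labels[i] still belong together afterwards.
    public static void Shuffle(string[] images, long[] labels, Random rng)
    {
        if (images.Length != labels.Length)
            throw new ArgumentException("images and labels must have the same length");

        for (int i = images.Length - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1); // uniform in [0, i]
            (images[i], images[j]) = (images[j], images[i]);
            (labels[i], labels[j]) = (labels[j], labels[i]);
        }
    }

    public static void Main()
    {
        string[] images = { "x_0.jpg", "y_0.jpg", "z_0.jpg", "x_1.jpg" }; // hypothetical names
        long[] labels = { 0, 1, 2, 0 };
        Shuffle(images, labels, new Random(42));
        for (int i = 0; i < images.Length; i++)
            Console.WriteLine($"{images[i]} -> {labels[i]}");
    }
}
```

Each element is swapped at most once, so the shuffle runs in linear time and needs no extra arrays.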
Preloading Part of the Dataset
The validation and test images and labels are loaded up front, in a single pass, as NDArrays.
    private void LoadImagesToNDArray()
    {
        // Load labels (one-hot encoded)
        y_valid = np.eye(dict_label.Count)[new NDArray(arraylabel_validation)];
        y_test = np.eye(dict_label.Count)[new NDArray(arraylabel_test)];
        print("load labels to ndarray : ok!");

        // Load images
        x_valid = np.zeros(arrayfilename_validation.Length, img_h, img_w, n_channels);
        x_test = np.zeros(arrayfilename_test.Length, img_h, img_w, n_channels);
        LoadImage(arrayfilename_validation, x_valid, "validation");
        LoadImage(arrayfilename_test, x_test, "test");
        print("load images to ndarray : ok!");
    }

    private void LoadImage(string[] a, NDArray b, string c)
    {
        for (int i = 0; i < a.Length; i++)
        {
            b[i] = ReadTensorFromImageFile(a[i]);
            Console.Write(".");
        }
        Console.WriteLine();
        Console.WriteLine("load images to ndarray: " + c);
    }

    private NDArray ReadTensorFromImageFile(string file_name)
    {
        using (var graph = tf.Graph().as_default())
        {
            var file_reader = tf.read_file(file_name, "file_reader");
            var decodejpeg = tf.image.decode_jpeg(file_reader, channels: n_channels, name: "decodejpeg");
            var cast = tf.cast(decodejpeg, tf.float32);
            var dims_expander = tf.expand_dims(cast, 0);
            var resize = tf.constant(new int[] { img_h, img_w });
            var bilinear = tf.image.resize_bilinear(dims_expander, resize);
            // Normalize: (pixel - mean) / std
            var sub = tf.subtract(bilinear, new float[] { img_mean });
            var normalized = tf.divide(sub, new float[] { img_std });

            using (var sess = tf.Session(graph))
            {
                return sess.run(normalized);
            }
        }
    }
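The expression np.eye(n)[labels] in the code above builds one-hot labels by picking, for each label, the matching row of the n × n identity matrix. The sketch below shows the same encoding with plain arrays so the result is easy to inspect. It is an illustration only, with hypothetical class and method names, and does not use NumSharp.

```csharp
using System;

public static class OneHotDemo
{
    // Row i is all zeros except for a 1 in column labels[i],
    // i.e. it equals row labels[i] of the identity matrix.
    public static float[,] OneHot(long[] labels, int numClasses)
    {
        var encoded = new float[labels.Length, numClasses]; // zero-initialized
        for (int i = 0; i < labels.Length; i++)
        {
            if (labels[i] < 0 || labels[i] >= numClasses)
                throw new ArgumentOutOfRangeException(nameof(labels),
                    $"label {labels[i]} is outside [0, {numClasses})");
            encoded[i, (int)labels[i]] = 1f;
        }
        return encoded;
    }

    public static void Main()
    {
        // Three samples with classes 2, 0, 1 (for example Z, X, Y).
        float[,] y = OneHot(new long[] { 2, 0, 1 }, 3);
        for (int i = 0; i < y.GetLength(0); i++)
            Console.WriteLine($"{y[i, 0]} {y[i, 1]} {y[i, 2]}");
    }
}
```

For the labels 2, 0, 1 the rows are (0, 0, 1), (1, 0, 0), and (0, 1, 0), which is the shape the y placeholder of size (-1, n_classes) expects.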
Building the Computation Graph
Build the static CNN computation graph. The learning rate is decayed once every n epochs.
    #region BuildGraph
    public Graph BuildGraph()
    {
        var graph = new Graph().as_default();

        tf_with(tf.name_scope("input"), delegate
        {
            x = tf.placeholder(tf.float32, shape: (-1, img_h, img_w, n_channels), name: "x");
            y = tf.placeholder(tf.float32, shape: (-1, n_classes), name: "y");
        });

        var conv1 = conv_layer(x, filter_size1, num_filters1, stride1, name: "conv1");
        var pool1 = max_pool(conv1, ksize: 2, stride: 2, name: "pool1");
        var conv2 = conv_layer(pool1, filter_size2, num_filters2, stride2, name: "conv2");
        var pool2 = max_pool(conv2, ksize: 2, stride: 2, name: "pool2");
        var layer_flat = flatten_layer(pool2);
        var fc1 = fc_layer(layer_flat, h1, "fc1", use_relu: true);
        var output_logits = fc_layer(fc1, n_classes, "out", use_relu: false);

        // Some important parameters are saved with the graph so they are easy to load later
        var img_h_t = tf.constant(img_h, name: "img_h");
        var img_w_t = tf.constant(img_w, name: "img_w");
        var img_mean_t = tf.constant(img_mean, name: "img_mean");
        var img_std_t = tf.constant(img_std, name: "img_std");
        var channels_t = tf.constant(n_channels, name: "img_channels");

        // Learning rate decay
        gloabl_steps = tf.Variable(0, trainable: false);
        learning_rate = tf.Variable(learning_rate_base);

        // Sub-graph that normalizes the training images
        tf_with(tf.variable_scope("loadimage"), delegate
        {
            decodejpeg = tf.placeholder(tf.@byte, name: "decodejpeg");
            var cast = tf.cast(decodejpeg, tf.float32);
            var dims_expander = tf.expand_dims(cast, 0);
            var resize = tf.constant(new int[] { img_h, img_w });
            var bilinear = tf.image.resize_bilinear(dims_expander, resize);
            var sub = tf.subtract(bilinear, new float[] { img_mean });
            normalized = tf.divide(sub, new float[] { img_std }, name: "normalized");
        });

        tf_with(tf.variable_scope("train"), delegate
        {
            tf_with(tf.variable_scope("loss"), delegate
            {
                loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels: y, logits: output_logits), name: "loss");
            });

            tf_with(tf.variable_scope("optimizer"), delegate
            {
                optimizer = tf.train.AdamOptimizer(learning_rate: learning_rate, name: "adam-op").minimize(loss, global_step: gloabl_steps);
            });

            tf_with(tf.variable_scope("accuracy"), delegate
            {
                var correct_prediction = tf.equal(tf.argmax(output_logits, 1), tf.argmax(y, 1), name: "correct_pred");
                accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32), name: "accuracy");
            });

            tf_with(tf.variable_scope("prediction"), delegate
            {
                cls_prediction = tf.argmax(output_logits, axis: 1, name: "predictions");
                prob = tf.nn.softmax(output_logits, axis: 1, name: "prob");
            });
        });

        return graph;
    }

    /// <summary>
    /// Create a 2D convolution layer
    /// </summary>
    /// <param name="x">input from previous layer</param>
    /// <param name="filter_size">size of each filter</param>
    /// <param name="num_filters">number of filters (or output feature maps)</param>
    /// <param name="stride">filter stride</param>
    /// <param name="name">layer name</param>
    /// <returns>The output array</returns>
    private Tensor conv_layer(Tensor x, int filter_size, int num_filters, int stride, string name)
    {
        return tf_with(tf.variable_scope(name), delegate
        {
            var num_in_channel = x.shape[x.NDims - 1];
            var shape = new[] { filter_size, filter_size, num_in_channel, num_filters };
            var w = weight_variable("w", shape);
            // tf.summary.histogram("weight", w);
            var b = bias_variable("b", new[] { num_filters });
            // tf.summary.histogram("bias", b);
            var layer = tf.nn.conv2d(x, w,
                                     strides: new[] { 1, stride, stride, 1 },
                                     padding: "SAME");
            layer += b;
            return tf.nn.relu(layer);
        });
    }

    /// <summary>
    /// Create a max pooling layer
    /// </summary>
    /// <param name="x">input to max-pooling layer</param>
    /// <param name="ksize">size of the max-pooling filter</param>
    /// <param name="stride">stride of the max-pooling filter</param>
    /// <param name="name">layer name</param>
    /// <returns>The output array</returns>
    private Tensor max_pool(Tensor x, int ksize, int stride, string name)
    {
        return tf.nn.max_pool(x,
                              ksize: new[] { 1, ksize, ksize, 1 },
                              strides: new[] { 1, stride, stride, 1 },
                              padding: "SAME",
                              name: name);
    }

    /// <summary>
    /// Flattens the output of the convolutional layer to be fed into the fully-connected layer
    /// </summary>
    /// <param name="layer">input array</param>
    /// <returns>flattened array</returns>
    private Tensor flatten_layer(Tensor layer)
    {
        return tf_with(tf.variable_scope("flatten_layer"), delegate
        {
            var layer_shape = layer.TensorShape;
            // Product of height, width and channels (dimensions 1..3)
            var num_features = layer_shape[new Slice(1, 4)].size;
            var layer_flat = tf.reshape(layer, new[] { -1, num_features });
            return layer_flat;
        });
    }

    /// <summary>
    /// Create a weight variable with appropriate initialization
    /// </summary>
    /// <param name="name"></param>
    /// <param name="shape"></param>
    /// <returns></returns>
    private RefVariable weight_variable(string name, int[] shape)
    {
        var initer = tf.truncated_normal_initializer(stddev: 0.01f);
        return tf.get_variable(name,
                               dtype: tf.float32,
                               shape: shape,
                               initializer: initer);
    }

    /// <summary>
    /// Create a bias variable with appropriate initialization
    /// </summary>
    /// <param name="name"></param>
    /// <param name="shape"></param>
    /// <returns></returns>
    private RefVariable bias_variable(string name, int[] shape)
    {
        var initial = tf.constant(0f, shape: shape, dtype: tf.float32);
        return tf.get_variable(name,
                               dtype: tf.float32,
                               initializer: initial);
    }

    /// <summary>
    /// Create a fully-connected layer
    /// </summary>
    /// <param name="x">input from previous layer</param>
    /// <param name="num_units">number of hidden units in the fully-connected layer</param>
    /// <param name="name">layer name</param>
    /// <param name="use_relu">boolean to add ReLU non-linearity (or not)</param>
    /// <returns>The output array</returns>
    private Tensor fc_layer(Tensor x, int num_units, string name, bool use_relu = true)
    {
        return tf_with(tf.variable_scope(name), delegate
        {
            var in_dim = x.shape[1];
            var w = weight_variable("w_" + name, shape: new[] { in_dim, num_units });
            var b = bias_variable("b_" + name, new[] { num_units });
            var layer = tf.matmul(x, w) + b;
            if (use_relu)
                layer = tf.nn.relu(layer);
            return layer;
        });
    }
    #endregion
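The learning_rate variable created in the graph is updated from the training loop: every learning_rate_step epochs the rate is multiplied by learning_rate_decay, and it is never allowed to fall below learning_rate_min. The sketch below isolates that rule in plain C#, without TensorFlow.NET, so the resulting schedule can be printed and checked. The class name and the numeric values in Main are hypothetical, since the article does not list the hyperparameter values.

```csharp
using System;

public static class LearningRateSchedule
{
    // Returns the learning rate used in each epoch under step decay with a floor.
    // A step of 0 disables the decay.
    public static float[] Schedule(float baseRate, float decay, int step, float minRate, int epochs)
    {
        var rates = new float[epochs];
        float rate = baseRate;
        for (int epoch = 0; epoch < epochs; epoch++)
        {
            // Decay at epochs step, 2*step, ... (never at epoch 0)
            if (step != 0 && epoch != 0 && epoch % step == 0)
                rate = Math.Max(rate * decay, minRate);
            rates[epoch] = rate;
        }
        return rates;
    }

    public static void Main()
    {
        // Assumed values for illustration only.
        float[] rates = Schedule(baseRate: 0.001f, decay: 0.5f, step: 2, minRate: 0.0003f, epochs: 6);
        for (int epoch = 0; epoch < rates.Length; epoch++)
            Console.WriteLine($"epoch {epoch + 1}: learning rate {rates[epoch]}");
    }
}
```

With these assumed values the rate stays at 0.001 for two epochs, halves to 0.0005 for the next two, and is then clamped to the floor of 0.0003 instead of dropping to 0.00025.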
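Model Training and Saving
- Batches are read with SharpCV's cv2.imread, which loads local image files directly into an NDArray, so the CV and NumPy-style code work together without conversion;
- The .NET thread-safe blocking queue BlockingCollection<T> takes the place of TensorFlow's native queue manager FIFOQueue;
  - During training, samples must be read from disk into memory before they can be used. Multiple threads run within the session and exchange data through a queue with a limited capacity: one thread performs the local file I/O and enqueues batches, while the main thread dequeues them and trains. Reading data and training the model therefore run asynchronously, which reduces training time.
- The model can be saved after every epoch, or only when it is the best model so far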
    #region Train
    public void Train(Session sess)
    {
        // Number of training iterations in each epoch
        var num_tr_iter = (arraylabel_train.Length) / batch_size;

        var init = tf.global_variables_initializer();
        sess.run(init);

        var saver = tf.train.Saver(tf.global_variables(), max_to_keep: 10);

        path_model = name + "\\model";
        Directory.CreateDirectory(path_model);

        float loss_val = 100.0f;
        float accuracy_val = 0f;

        var sw = new Stopwatch();
        sw.Start();
        foreach (var epoch in range(epochs))
        {
            print($"training epoch: {epoch + 1}");
            // Randomly shuffle the training data at the beginning of each epoch
            (arrayfilename_train, arraylabel_train) = ShuffleArray(arraylabel_train.Length, arrayfilename_train, arraylabel_train);
            y_train = np.eye(dict_label.Count)[new NDArray(arraylabel_train)];

            // Decay the learning rate
            if (learning_rate_step != 0)
            {
                if ((epoch != 0) && (epoch % learning_rate_step == 0))
                {
                    learning_rate_base = learning_rate_base * learning_rate_decay;
                    if (learning_rate_base <= learning_rate_min) { learning_rate_base = learning_rate_min; }
                    sess.run(tf.assign(learning_rate, learning_rate_base));
                }
            }

            // Load local images asynchronously through a bounded queue to improve training efficiency
            BlockingCollection<(NDArray c_x, NDArray c_y, int iter)> blockc = new BlockingCollection<(NDArray c1, NDArray c2, int iter)>(trainqueuecapa);
            Task.Run(() =>
            {
                foreach (var iteration in range(num_tr_iter))
                {
                    var start = iteration * batch_size;
                    var end = (iteration + 1) * batch_size;
                    (NDArray x_batch, NDArray y_batch) = GetNextBatch(sess, arrayfilename_train, y_train, start, end);
                    blockc.Add((x_batch, y_batch, iteration));
                }
                blockc.CompleteAdding();
            });

            foreach (var item in blockc.GetConsumingEnumerable())
            {
                sess.run(optimizer, (x, item.c_x), (y, item.c_y));

                if (item.iter % display_freq == 0)
                {
                    // Calculate and display the batch loss and accuracy
                    var result = sess.run(new[] { loss, accuracy }, new FeedItem(x, item.c_x), new FeedItem(y, item.c_y));
                    loss_val = result[0];
                    accuracy_val = result[1];
                    print("cnn:" + ($"iter {item.iter.ToString("000")}: loss={loss_val.ToString("0.0000")}, training accuracy={accuracy_val.ToString("p")} {sw.ElapsedMilliseconds}ms"));
                    sw.Restart();
                }
            }

            // Run validation after every epoch
            (loss_val, accuracy_val) = sess.run((loss, accuracy), (x, x_valid), (y, y_valid));
            print("cnn:" + "---------------------------------------------------------");
            print("cnn:" + $"gloabl steps: {sess.run(gloabl_steps) },learning rate: {sess.run(learning_rate)}, validation loss: {loss_val.ToString("0.0000")}, validation accuracy: {accuracy_val.ToString("p")}");
            print("cnn:" + "---------------------------------------------------------");

            if (saverbest)
            {
                // Save only when validation accuracy improves
                if (accuracy_val > max_accuracy)
                {
                    max_accuracy = accuracy_val;
                    saver.save(sess, path_model + "\\cnn_best");
                    print("ckpt model is save.");
                }
            }
            else
            {
                // Save a checkpoint after every epoch
                saver.save(sess, path_model + string.Format("\\cnn_epoch_{0}_loss_{1}_acc_{2}", epoch, loss_val, accuracy_val));
                print("ckpt model is save.");
            }
        }
        write_dictionary(path_model + "\\dic.txt", dict_label);
    }

    private void write_dictionary(string path, Dictionary<Int64, string> mydic)
    {
        FileStream fs = new FileStream(path, FileMode.Create);
        StreamWriter sw = new StreamWriter(fs);
        foreach (var d in mydic) { sw.Write(d.Key + "," + d.Value + "\r\n"); }
        sw.Flush();
        sw.Close();
        fs.Close();
        print("write_dictionary");
    }

    private (NDArray, NDArray) Randomize(NDArray x, NDArray y)
    {
        var perm = np.random.permutation(y.shape[0]);
        np.random.shuffle(perm);
        return (x[perm], y[perm]);
    }

    private (NDArray, NDArray) GetNextBatch(NDArray x, NDArray y, int start, int end)
    {
        var slice = new Slice(start, end);
        var x_batch = x[slice];
        var y_batch = y[slice];
        return (x_batch, y_batch);
    }

    private unsafe (NDArray, NDArray) GetNextBatch(Session sess, string[] x, NDArray y, int start, int end)
    {
        NDArray x_batch = np.zeros(end - start, img_h, img_w, n_channels);
        int n = 0;
        for (int i = start; i < end; i++)
        {
            // Read the image as grayscale, then normalize it through the "loadimage" sub-graph
            NDArray img4 = cv2.imread(x[i], IMREAD_COLOR.IMREAD_GRAYSCALE);
            x_batch[n] = sess.run(normalized, (decodejpeg, img4));
            n++;
        }
        var slice = new Slice(start, end);
        var y_batch = y[slice];
        return (x_batch, y_batch);
    }
    #endregion
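The bounded producer/consumer pattern used in the training loop can be studied on its own. The sketch below is a minimal version in plain .NET: a background task "loads" batches into a BlockingCollection with a fixed capacity, and the calling thread consumes them in order. The class name, the delegate parameters, and the fake integer batches are hypothetical stand-ins for image loading and sess.run. Calling CompleteAdding in a finally block and waiting on the producer task are additions that the article's code does not have.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class BatchPipeline
{
    // Runs loadBatch on a background task and trainStep on the calling thread.
    // The producer blocks in Add() whenever `capacity` batches are already waiting.
    public static List<int> Run(int numBatches, int capacity,
                                Func<int, int[]> loadBatch, Action<int[]> trainStep)
    {
        var consumedOrder = new List<int>();
        using (var queue = new BlockingCollection<(int[] batch, int iter)>(capacity))
        {
            Task producer = Task.Run(() =>
            {
                try
                {
                    for (int i = 0; i < numBatches; i++)
                        queue.Add((loadBatch(i), i));
                }
                finally
                {
                    // Always signal completion so the consumer loop can end,
                    // even if loading a batch throws.
                    queue.CompleteAdding();
                }
            });

            // Blocks while the queue is empty; ends after CompleteAdding()
            // once the queue has been drained.
            foreach (var item in queue.GetConsumingEnumerable())
            {
                trainStep(item.batch);
                consumedOrder.Add(item.iter);
            }

            producer.Wait(); // rethrows any exception raised by the loader
        }
        return consumedOrder;
    }

    public static void Main()
    {
        int processed = 0;
        List<int> order = Run(
            numBatches: 5,
            capacity: 2,
            loadBatch: i => new[] { i, i, i },          // stand-in for reading images
            trainStep: batch => processed += batch.Length); // stand-in for a training step
        Console.WriteLine($"batches consumed in order: {string.Join(",", order)}; samples: {processed}");
    }
}
```

With a single producer and the default queue-backed collection, batches are consumed in the order they were added, and the capacity limit keeps at most a few preloaded batches in memory at a time.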
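Predicting on the Test Set
- The trained model predicts on the test dataset and reports the accuracy
- A node was added to the graph that outputs the probability of the top-1 prediction, so that the detailed prediction data can be printed when predicting on the test set, which is useful for debugging and tuning in real projects.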
    public void Test(Session sess)
    {
        (loss_test, accuracy_test) = sess.run((loss, accuracy), (x, x_test), (y, y_test));
        print("cnn:" + "---------------------------------------------------------");
        print("cnn:" + $"test loss: {loss_test.ToString("0.0000")}, test accuracy: {accuracy_test.ToString("p")}");
        print("cnn:" + "---------------------------------------------------------");

        // Predicted class index and class probabilities for every test image
        (test_cls, test_data) = sess.run((cls_prediction, prob), (x, x_test));
    }

    private void TestDataOutput()
    {
        for (int i = 0; i < arraylabel_test.Length; i++)
        {
            Int64 real = arraylabel_test[i];
            int predict = (int)(test_cls[i]);
            // Probability assigned to the predicted (top-1) class
            var probability = test_data[i, predict];
            string result = (real == predict) ? "ok" : "ng";
            string filename = arrayfilename_test[i];
            string real_str = dict_label[real];
            string predict_str = dict_label[predict];
            print((i + 1).ToString() + "|" + "result:" + result + "|" + "real_str:" + real_str + "|"
                  + "predict_str:" + predict_str + "|" + "probability:" + probability.GetSingle().ToString() + "|"
                  + "filename:" + filename);
        }
    }
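Beyond the single accuracy figure, the per-image output above can be summarized per class. The sketch below builds a confusion matrix from the real and predicted class indices and derives the accuracy from its diagonal. It is an illustrative addition in plain C#, with hypothetical names and made-up sample labels, and is not part of the article's code.

```csharp
using System;

public static class EvalMetrics
{
    // counts[r, p] = number of samples whose real class is r and predicted class is p.
    public static int[,] ConfusionMatrix(long[] real, long[] predicted, int numClasses)
    {
        if (real.Length != predicted.Length)
            throw new ArgumentException("real and predicted must have the same length");

        var counts = new int[numClasses, numClasses];
        for (int i = 0; i < real.Length; i++)
            counts[(int)real[i], (int)predicted[i]]++;
        return counts;
    }

    // Accuracy = correctly classified samples (diagonal) / all samples.
    public static double Accuracy(int[,] counts)
    {
        int correct = 0, total = 0;
        for (int r = 0; r < counts.GetLength(0); r++)
        {
            for (int p = 0; p < counts.GetLength(1); p++)
            {
                total += counts[r, p];
                if (r == p) correct += counts[r, p];
            }
        }
        return total == 0 ? 0.0 : (double)correct / total;
    }

    public static void Main()
    {
        // Made-up labels: 0 = X, 1 = Y, 2 = Z.
        long[] real = { 0, 0, 1, 1, 2, 2 };
        long[] predicted = { 0, 1, 1, 1, 2, 0 };
        int[,] m = ConfusionMatrix(real, predicted, 3);
        for (int r = 0; r < 3; r++)
            Console.WriteLine($"{m[r, 0]} {m[r, 1]} {m[r, 2]}");
        Console.WriteLine($"accuracy = {Accuracy(m)}");
    }
}
```

Off-diagonal entries show which characters are confused with each other, which is often more useful on a production line than the overall accuracy alone.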
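Summary
This article covered the use of TensorFlow on .NET in a real industrial machine-vision inspection project. Using SciSharp's TensorFlow.NET, we built a simple CNN image classification model consisting of an input layer, convolution and pooling layers, a flatten layer, a fully connected layer, and an output layer, all of which are essential layers in a CNN classifier. The model classified real images from the industrial site with high accuracy.
The complete code can be used directly to train on your own dataset. It has been tested extensively on site and runs on both GPU and CPU. Switching training environments only requires swapping the tensorflow.dll file.
The trained model files can be deployed on site in two ways: as "ckpt + meta", or frozen into a "pb" file. Both deployment and on-site inference can be done entirely on .NET and plug directly into the existing industrial application. This removes the earlier need to set up a Flask server under Python for data exchange, and on-site deployment does not require a Python or TensorFlow environment (no need to upgrade the existing PCs on site or install a pile of dependencies). The whole process uses conventional .NET DLL references.
Fellow .NET developers are welcome to join the TensorFlow.NET community. SciSharp STACK QQ group: 461855582. If you have any questions, you can also contact me directly on QQ: 50705111.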