ML.NET at a Glance

Programmer Article Station (程序员文章站), 2022-05-11 14:28:14
What is ML.NET?

ML.NET is an open-source machine learning framework created by Microsoft for .NET developers. It is cross-platform and runs on macOS, Linux and Windows.

Machine learning pipeline

ML.NET composes the machine learning process as a pipeline. The pipeline has four parts:

  • Load data
  • Transform data
  • Choose algorithm
  • Train model

Example

Create a console project.

dotnet new console -o myapp
cd myapp

Add the ML.NET library package.

dotnet add package Microsoft.ML

In the project folder, create a text file named iris-data.txt with the following content:

5.1,3.5,1.4,0.2,iris-setosa
4.9,3.0,1.4,0.2,iris-setosa
4.7,3.2,1.3,0.2,iris-setosa
4.6,3.1,1.5,0.2,iris-setosa
5.0,3.6,1.4,0.2,iris-setosa
5.4,3.9,1.7,0.4,iris-setosa
4.6,3.4,1.4,0.3,iris-setosa
5.0,3.4,1.5,0.2,iris-setosa
4.4,2.9,1.4,0.2,iris-setosa
4.9,3.1,1.5,0.1,iris-setosa
5.4,3.7,1.5,0.2,iris-setosa
4.8,3.4,1.6,0.2,iris-setosa
4.8,3.0,1.4,0.1,iris-setosa
4.3,3.0,1.1,0.1,iris-setosa
5.8,4.0,1.2,0.2,iris-setosa
5.7,4.4,1.5,0.4,iris-setosa
5.4,3.9,1.3,0.4,iris-setosa
5.1,3.5,1.4,0.3,iris-setosa
5.7,3.8,1.7,0.3,iris-setosa
5.1,3.8,1.5,0.3,iris-setosa
5.4,3.4,1.7,0.2,iris-setosa
5.1,3.7,1.5,0.4,iris-setosa
4.6,3.6,1.0,0.2,iris-setosa
5.1,3.3,1.7,0.5,iris-setosa
4.8,3.4,1.9,0.2,iris-setosa
5.0,3.0,1.6,0.2,iris-setosa
5.0,3.4,1.6,0.4,iris-setosa
5.2,3.5,1.5,0.2,iris-setosa
5.2,3.4,1.4,0.2,iris-setosa
4.7,3.2,1.6,0.2,iris-setosa
4.8,3.1,1.6,0.2,iris-setosa
5.4,3.4,1.5,0.4,iris-setosa
5.2,4.1,1.5,0.1,iris-setosa
5.5,4.2,1.4,0.2,iris-setosa
4.9,3.1,1.5,0.1,iris-setosa
5.0,3.2,1.2,0.2,iris-setosa
5.5,3.5,1.3,0.2,iris-setosa
4.9,3.1,1.5,0.1,iris-setosa
4.4,3.0,1.3,0.2,iris-setosa
5.1,3.4,1.5,0.2,iris-setosa
5.0,3.5,1.3,0.3,iris-setosa
4.5,2.3,1.3,0.3,iris-setosa
4.4,3.2,1.3,0.2,iris-setosa
5.0,3.5,1.6,0.6,iris-setosa
5.1,3.8,1.9,0.4,iris-setosa
4.8,3.0,1.4,0.3,iris-setosa
5.1,3.8,1.6,0.2,iris-setosa
4.6,3.2,1.4,0.2,iris-setosa
5.3,3.7,1.5,0.2,iris-setosa
5.0,3.3,1.4,0.2,iris-setosa
7.0,3.2,4.7,1.4,iris-versicolor
6.4,3.2,4.5,1.5,iris-versicolor
6.9,3.1,4.9,1.5,iris-versicolor
5.5,2.3,4.0,1.3,iris-versicolor
6.5,2.8,4.6,1.5,iris-versicolor
5.7,2.8,4.5,1.3,iris-versicolor
6.3,3.3,4.7,1.6,iris-versicolor
4.9,2.4,3.3,1.0,iris-versicolor
6.6,2.9,4.6,1.3,iris-versicolor
5.2,2.7,3.9,1.4,iris-versicolor
5.0,2.0,3.5,1.0,iris-versicolor
5.9,3.0,4.2,1.5,iris-versicolor
6.0,2.2,4.0,1.0,iris-versicolor
6.1,2.9,4.7,1.4,iris-versicolor
5.6,2.9,3.6,1.3,iris-versicolor
6.7,3.1,4.4,1.4,iris-versicolor
5.6,3.0,4.5,1.5,iris-versicolor
5.8,2.7,4.1,1.0,iris-versicolor
6.2,2.2,4.5,1.5,iris-versicolor
5.6,2.5,3.9,1.1,iris-versicolor
5.9,3.2,4.8,1.8,iris-versicolor
6.1,2.8,4.0,1.3,iris-versicolor
6.3,2.5,4.9,1.5,iris-versicolor
6.1,2.8,4.7,1.2,iris-versicolor
6.4,2.9,4.3,1.3,iris-versicolor
6.6,3.0,4.4,1.4,iris-versicolor
6.8,2.8,4.8,1.4,iris-versicolor
6.7,3.0,5.0,1.7,iris-versicolor
6.0,2.9,4.5,1.5,iris-versicolor
5.7,2.6,3.5,1.0,iris-versicolor
5.5,2.4,3.8,1.1,iris-versicolor
5.5,2.4,3.7,1.0,iris-versicolor
5.8,2.7,3.9,1.2,iris-versicolor
6.0,2.7,5.1,1.6,iris-versicolor
5.4,3.0,4.5,1.5,iris-versicolor
6.0,3.4,4.5,1.6,iris-versicolor
6.7,3.1,4.7,1.5,iris-versicolor
6.3,2.3,4.4,1.3,iris-versicolor
5.6,3.0,4.1,1.3,iris-versicolor
5.5,2.5,4.0,1.3,iris-versicolor
5.5,2.6,4.4,1.2,iris-versicolor
6.1,3.0,4.6,1.4,iris-versicolor
5.8,2.6,4.0,1.2,iris-versicolor
5.0,2.3,3.3,1.0,iris-versicolor
5.6,2.7,4.2,1.3,iris-versicolor
5.7,3.0,4.2,1.2,iris-versicolor
5.7,2.9,4.2,1.3,iris-versicolor
6.2,2.9,4.3,1.3,iris-versicolor
5.1,2.5,3.0,1.1,iris-versicolor
5.7,2.8,4.1,1.3,iris-versicolor
6.3,3.3,6.0,2.5,iris-virginica
5.8,2.7,5.1,1.9,iris-virginica
7.1,3.0,5.9,2.1,iris-virginica
6.3,2.9,5.6,1.8,iris-virginica
6.5,3.0,5.8,2.2,iris-virginica
7.6,3.0,6.6,2.1,iris-virginica
4.9,2.5,4.5,1.7,iris-virginica
7.3,2.9,6.3,1.8,iris-virginica
6.7,2.5,5.8,1.8,iris-virginica
7.2,3.6,6.1,2.5,iris-virginica
6.5,3.2,5.1,2.0,iris-virginica
6.4,2.7,5.3,1.9,iris-virginica
6.8,3.0,5.5,2.1,iris-virginica
5.7,2.5,5.0,2.0,iris-virginica
5.8,2.8,5.1,2.4,iris-virginica
6.4,3.2,5.3,2.3,iris-virginica
6.5,3.0,5.5,1.8,iris-virginica
7.7,3.8,6.7,2.2,iris-virginica
7.7,2.6,6.9,2.3,iris-virginica
6.0,2.2,5.0,1.5,iris-virginica
6.9,3.2,5.7,2.3,iris-virginica
5.6,2.8,4.9,2.0,iris-virginica
7.7,2.8,6.7,2.0,iris-virginica
6.3,2.7,4.9,1.8,iris-virginica
6.7,3.3,5.7,2.1,iris-virginica
7.2,3.2,6.0,1.8,iris-virginica
6.2,2.8,4.8,1.8,iris-virginica
6.1,3.0,4.9,1.8,iris-virginica
6.4,2.8,5.6,2.1,iris-virginica
7.2,3.0,5.8,1.6,iris-virginica
7.4,2.8,6.1,1.9,iris-virginica
7.9,3.8,6.4,2.0,iris-virginica
6.4,2.8,5.6,2.2,iris-virginica
6.3,2.8,5.1,1.5,iris-virginica
6.1,2.6,5.6,1.4,iris-virginica
7.7,3.0,6.1,2.3,iris-virginica
6.3,3.4,5.6,2.4,iris-virginica
6.4,3.1,5.5,1.8,iris-virginica
6.0,3.0,4.8,1.8,iris-virginica
6.9,3.1,5.4,2.1,iris-virginica
6.7,3.1,5.6,2.4,iris-virginica
6.9,3.1,5.1,2.3,iris-virginica
5.8,2.7,5.1,1.9,iris-virginica
6.8,3.2,5.9,2.3,iris-virginica
6.7,3.3,5.7,2.5,iris-virginica
6.7,3.0,5.2,2.3,iris-virginica
6.3,2.5,5.0,1.9,iris-virginica
6.5,3.0,5.2,2.0,iris-virginica
6.2,3.4,5.4,2.3,iris-virginica
5.9,3.0,5.1,1.8,iris-virginica

Paste the following code into the Program.cs file.

using System;
using Microsoft.ML;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Runtime.Data;

namespace myapp
{
    class Program
    {
        // One row of the data set: four measurements plus the species label.
        public class IrisData
        {
            public float SepalLength;
            public float SepalWidth;
            public float PetalLength;
            public float PetalWidth;
            public string Label;
        }

        // The prediction result; the field is bound to the "PredictedLabel" column.
        public class IrisPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }

        static void Main(string[] args)
        {
            var mlContext = new MLContext();

            // Step 1: describe the file layout and load the data.
            string dataPath = "iris-data.txt";
            var reader = mlContext.Data.TextReader(new TextLoader.Arguments()
            {
                Separator = ",",
                // The data file above has no header row, so no line should be skipped.
                HasHeader = false,
                Column = new[]
                {
                    new TextLoader.Column("SepalLength", DataKind.R4, 0),
                    new TextLoader.Column("SepalWidth", DataKind.R4, 1),
                    new TextLoader.Column("PetalLength", DataKind.R4, 2),
                    new TextLoader.Column("PetalWidth", DataKind.R4, 3),
                    new TextLoader.Column("Label", DataKind.Text, 4)
                }
            });

            IDataView trainingDataView = reader.Read(new MultiFileSource(dataPath));

            // Steps 2 and 3: transform the data, then append the trainer and
            // the conversion of the predicted key back to its text value.
            var pipeline = mlContext.Transforms.Categorical.MapValueToKey("Label")
                .Append(mlContext.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
                .Append(mlContext.MulticlassClassification.Trainers.StochasticDualCoordinateAscent(label: "Label", features: "Features"))
                .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

            // Step 4: train the model.
            var model = pipeline.Fit(trainingDataView);

            // Use the trained model to predict the species of a single flower.
            var prediction = model.MakePredictionFunction<IrisData, IrisPrediction>(mlContext).Predict(
                new IrisData()
                {
                    SepalLength = 3.3f,
                    SepalWidth = 1.6f,
                    PetalLength = 0.2f,
                    PetalWidth = 5.1f,
                });

            Console.WriteLine($"predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}

Running the program with the dotnet run command prints the prediction.

predicted flower type is: iris-virginica
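
The listing above is written against an early preview API of ML.NET (for example MakePredictionFunction and the Microsoft.ML.Runtime.* namespaces), which later releases replaced. If the installed package is a 1.x or newer release, the same example might look roughly like the sketch below. It is a sketch under assumptions rather than the article's own code: the LoadColumn attributes and the choice of SdcaMaximumEntropy as the closest counterpart to the trainer used above are this sketch's choices, and the result has not been checked against the output shown above.

```csharp
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

namespace myapp
{
    class Program
    {
        public class IrisData
        {
            // LoadColumn gives the zero-based column index in the text file.
            [LoadColumn(0)] public float SepalLength;
            [LoadColumn(1)] public float SepalWidth;
            [LoadColumn(2)] public float PetalLength;
            [LoadColumn(3)] public float PetalWidth;
            [LoadColumn(4)] public string Label;
        }

        public class IrisPrediction
        {
            [ColumnName("PredictedLabel")]
            public string PredictedLabels;
        }

        static void Main(string[] args)
        {
            var mlContext = new MLContext();

            // Load data: the file is comma-separated and has no header row.
            IDataView trainingDataView = mlContext.Data.LoadFromTextFile<IrisData>(
                "iris-data.txt", separatorChar: ',', hasHeader: false);

            // Transform data, choose the algorithm, convert the key back to text.
            var pipeline = mlContext.Transforms.Conversion.MapValueToKey("Label")
                .Append(mlContext.Transforms.Concatenate("Features", "SepalLength", "SepalWidth", "PetalLength", "PetalWidth"))
                .Append(mlContext.MulticlassClassification.Trainers.SdcaMaximumEntropy("Label", "Features"))
                .Append(mlContext.Transforms.Conversion.MapKeyToValue("PredictedLabel"));

            // Train the model.
            var model = pipeline.Fit(trainingDataView);

            // Predict a single example with a prediction engine.
            var engine = mlContext.Model.CreatePredictionEngine<IrisData, IrisPrediction>(model);
            var prediction = engine.Predict(new IrisData
            {
                SepalLength = 3.3f,
                SepalWidth = 1.6f,
                PetalLength = 0.2f,
                PetalWidth = 5.1f,
            });

            Console.WriteLine($"predicted flower type is: {prediction.PredictedLabels}");
        }
    }
}
```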

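Walkthrough

The example defines two classes, IrisData and IrisPrediction. IrisData is the data structure used for training, while IrisPrediction holds the result of a prediction.

The MLContext class defines the ML.NET context; it can be thought of as the runtime environment.

Next, a TextReader is created to read the data set file; its arguments specify the format to read (the separator and the name, type and position of each column). This is the first step of the machine learning pipeline.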
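
To make the column layout concrete, here is a minimal plain C# sketch of what reading one line of iris-data.txt amounts to. It does not use ML.NET and is not how the TextReader is implemented; the IrisRow and IrisParser names are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical row type mirroring the five columns of the data file.
public class IrisRow
{
    public float SepalLength;
    public float SepalWidth;
    public float PetalLength;
    public float PetalWidth;
    public string Label;
}

public static class IrisParser
{
    // Splits one comma-separated line into four floats and a text label.
    public static IrisRow ParseLine(string line)
    {
        string[] parts = line.Split(',');
        if (parts.Length != 5)
            throw new FormatException($"Expected 5 columns, got {parts.Length}: {line}");

        return new IrisRow
        {
            // InvariantCulture makes "5.1" parse the same way on every machine.
            SepalLength = float.Parse(parts[0], CultureInfo.InvariantCulture),
            SepalWidth = float.Parse(parts[1], CultureInfo.InvariantCulture),
            PetalLength = float.Parse(parts[2], CultureInfo.InvariantCulture),
            PetalWidth = float.Parse(parts[3], CultureInfo.InvariantCulture),
            Label = parts[4].Trim()
        };
    }
}

public static class Program
{
    public static void Main()
    {
        // Two lines copied from the data file above.
        string[] lines =
        {
            "5.1,3.5,1.4,0.2,iris-setosa",
            "7.0,3.2,4.7,1.4,iris-versicolor"
        };

        foreach (IrisRow row in lines.Select(IrisParser.ParseLine))
            Console.WriteLine(row.Label);
    }
}
```

The column indices 0 to 4 used here correspond to the last argument of each TextLoader.Column entry in the example.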

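Step two converts the Label field of IrisData into a numeric type, because only numeric data can be used in model training. It then combines SepalLength, SepalWidth, PetalLength and PetalWidth into a single column that serves as the Features of the data set.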
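
The two transforms in this step can be pictured with a small plain C# sketch. It is a conceptual illustration only, assuming zero-based indices assigned in order of first appearance; ML.NET's own key encoding differs in detail, and the TransformSketch class is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

public static class TransformSketch
{
    // Assigns each distinct label a zero-based index in order of first appearance.
    public static Dictionary<string, int> BuildKeyMap(IEnumerable<string> labels)
    {
        var map = new Dictionary<string, int>();
        foreach (string label in labels)
        {
            if (!map.ContainsKey(label))
                map[label] = map.Count;
        }
        return map;
    }

    // Packs the four measurements into one feature vector.
    public static float[] Concatenate(float sepalLength, float sepalWidth,
                                      float petalLength, float petalWidth)
    {
        return new[] { sepalLength, sepalWidth, petalLength, petalWidth };
    }
}

public static class Program
{
    public static void Main()
    {
        string[] labels = { "iris-setosa", "iris-setosa", "iris-versicolor", "iris-virginica" };
        Dictionary<string, int> keys = TransformSketch.BuildKeyMap(labels);
        Console.WriteLine(keys["iris-virginica"]); // prints 2

        float[] features = TransformSketch.Concatenate(5.1f, 3.5f, 1.4f, 0.2f);
        Console.WriteLine(features.Length); // prints 4
    }
}
```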

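Step three chooses a suitable algorithm for training and passes in the label (Label) and the features (Features).

Step four trains the model.

Once the model is built, it can be used for prediction. Because the final prediction is expected as a string, a conversion has to be appended after step three to turn the result from its numeric type back into a string.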
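
The conversion back to a string can be sketched in plain C# as a lookup from a numeric key to its original text value. This is an illustration under assumptions, not ML.NET's implementation: the scores are made-up values, the keys are zero-based, and the KeyToValueSketch class is a hypothetical name.

```csharp
using System;

public static class KeyToValueSketch
{
    // Returns the index of the largest score (the first one wins on ties).
    public static int ArgMax(float[] scores)
    {
        if (scores == null || scores.Length == 0)
            throw new ArgumentException("scores must not be empty", nameof(scores));

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > scores[best])
                best = i;
        }
        return best;
    }

    // Looks up the text value that a numeric key stands for.
    public static string MapKeyToValue(string[] keyValues, int key)
    {
        return keyValues[key];
    }
}

public static class Program
{
    public static void Main()
    {
        // Key values in the order the labels first appear in the data file.
        string[] keyValues = { "iris-setosa", "iris-versicolor", "iris-virginica" };

        // Hypothetical per-class scores for one flower.
        float[] scores = { 0.05f, 0.15f, 0.80f };

        int predictedKey = KeyToValueSketch.ArgMax(scores);
        Console.WriteLine(KeyToValueSketch.MapKeyToValue(keyValues, predictedKey)); // prints iris-virginica
    }
}
```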