
Deep Learning: Recurrent Neural Networks (RNN)


Author: 明天依旧可好
QQ group: 807041986
Last updated: 2020-12-11

Note: if you have a question about deep learning that this article does not cover, leave a comment below and I will add it to the article.

Original link: https://mtyjkh.blog.csdn.net/article/details/111088248
Deep learning series: A simple introduction to deep learning with TensorFlow 2
Code | data: reply "深度学习" (deep learning) on the WeChat official account 明天依旧可好


Importing the data

import pandas as pd
import tensorflow as tf

# Load only the sentiment label and tweet text columns of the airline tweets dataset
df = pd.read_csv("Tweets.csv", usecols=["airline_sentiment", "text"])
df

(Output: a preview of the DataFrame with the airline_sentiment and text columns.)

# pd.Categorical attaches category information to the column: .categories holds the
# distinct class labels and .codes holds the corresponding integer code for each row.
# Here the sentiment labels are replaced by their integer codes.
df.airline_sentiment = pd.Categorical(df.airline_sentiment).codes
df

(Output: the DataFrame with airline_sentiment now encoded as integers.)
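
To check which integer each label received, the mapping can be inspected from the raw column. A minimal sketch (by default pandas orders the categories lexically, so the Tweets.csv labels negative / neutral / positive should map to 0 / 1 / 2):

# Hypothetical check: rebuild the categorical from the raw column to see the code -> label mapping
raw = pd.read_csv("Tweets.csv", usecols=["airline_sentiment"])["airline_sentiment"]
cats = pd.Categorical(raw)
print(dict(enumerate(cats.categories)))   # e.g. {0: 'negative', 1: 'neutral', 2: 'positive'}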

Building the vocabulary

import tensorflow_datasets as tfds

# Note: in newer versions of tensorflow_datasets this API has moved to tfds.deprecated.text
tokenizer = tfds.features.text.Tokenizer()

# Collect the set of unique tokens across all tweets
vocabulary_set = set()
for text in df["text"]:
    some_tokens = tokenizer.tokenize(text)
    vocabulary_set.update(some_tokens)

vocab_size = len(vocabulary_set)
vocab_size
'''
Output:
18027
'''
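
To see how the vocabulary was produced, the tokenizer can be called on a single sentence. A small sketch with a made-up tweet (the tfds Tokenizer splits on non-alphanumeric characters and drops them):

# Hypothetical example sentence; punctuation and the @ sign are discarded by the tokenizer
print(tokenizer.tokenize("@AmericanAir thanks for the quick reply!"))
'''
Expected output (roughly):
['AmericanAir', 'thanks', 'for', 'the', 'quick', 'reply']
'''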

Encoding a sample (test)

encoder = tfds.features.text.TokenTextEncoder(vocabulary_set)
# `text` here is simply the last tweet left over from the vocabulary loop above
encoded_example = encoder.encode(text)
print(encoded_example)
'''
text is:
'@AmericanAir we have 8 ppl so we need 2 know how many seats are on the next flight. Plz put us on standby for 4 people on the next flight?'
Output:
[12939, 13052, 13579, 11267, 14825, 8674, 13052, 12213, 12082, 12156, 5329, 5401, 10099, 3100, 7974, 7804, 5671, 2947, 9873, 7864, 9704, 7974, 3564, 11759, 15266, 11250, 7974, 7804, 5671, 2947]
'''
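
For a quick sanity check of the round trip, the encoder can also decode ids back into tokens. A small sketch (punctuation lost during tokenization is not restored, so the decoded string will not match the original tweet character for character):

# Decode the ids back into space-separated tokens
decoded_example = encoder.decode(encoded_example)
print(decoded_example)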

Encoding the text as integer sequences

df["encoded_text"] = [encoder.encode(text) for text in df["text"]]
df

(Output: the DataFrame with a new encoded_text column containing the token id lists.)

# Use the first 10,000 samples for training and the remaining ones for testing
train_x = df["encoded_text"][:10000]
train_y = df["airline_sentiment"][:10000]
test_x = df["encoded_text"][10000:]
test_y = df["airline_sentiment"][10000:]

from tensorflow import keras

# Pad (or truncate) every sequence to a fixed length of 50 tokens
train_x = keras.preprocessing.sequence.pad_sequences(train_x, maxlen=50)
test_x = keras.preprocessing.sequence.pad_sequences(test_x, maxlen=50)

train_x.shape,train_y.shape,test_x.shape,test_y.shape
'''
Output:
((10000, 50), (10000,), (4640, 50), (4640,))
'''
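
As a quick illustration of what pad_sequences does, a toy sketch with made-up ids: by default, sequences shorter than maxlen are zero-padded on the left and longer ones are truncated from the left.

# Hypothetical toy sequences showing the default pre-padding / pre-truncation behaviour
toy = [[5, 8], [1, 2, 3, 4, 5, 6]]
print(keras.preprocessing.sequence.pad_sequences(toy, maxlen=4))
'''
Expected output:
[[0 0 5 8]
 [3 4 5 6]]
'''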

Building the model

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size+1, 64),               # +1 so that id 0 can serve as padding
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),   # bidirectional LSTM over the token sequence
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(1)                                    # single logit output
])

model.summary()
'''
Output:
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
embedding_2 (Embedding)      (None, None, 64)          1153792   
_________________________________________________________________
bidirectional_2 (Bidirection (None, 128)               66048     
_________________________________________________________________
dense_4 (Dense)              (None, 64)                8256      
_________________________________________________________________
dense_5 (Dense)              (None, 1)                 65        
=================================================================
Total params: 1,228,161
Trainable params: 1,228,161
Non-trainable params: 0
_________________________________________________________________
'''
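
The parameter counts in the summary can be checked by hand; each LSTM gate has (input_dim + units + 1) * units weights, and the Bidirectional wrapper doubles the LSTM total:

# Embedding:     (18027 + 1) * 64               = 1,153,792
# Bidirectional: 2 * 4 * (64 + 64 + 1) * 64     =    66,048
# Dense(64):     128 * 64 + 64                  =     8,256
# Dense(1):      64 * 1 + 1                     =        65
# Total                                         = 1,228,161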

Compiling the model

# from_logits=True because the final Dense layer has no activation
model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
              optimizer=tf.keras.optimizers.Adam(1e-4),
              metrics=['accuracy'])

Training the model

history = model.fit(train_x,
                    train_y,
                    epochs=20,
                    batch_size=200,
                    validation_data=(test_x, test_y),
                    verbose=1)
'''
Output:
Epoch 1/20
50/50 [==============================] - 6s 117ms/step - loss: -4.8196 - accuracy: 0.6652 - val_loss: -0.7605 - val_accuracy: 0.7071
......
Epoch 19/20
50/50 [==============================] - 6s 123ms/step - loss: -37.5176 - accuracy: 0.7586 - val_loss: -9.0619 - val_accuracy: 0.7272
Epoch 20/20
50/50 [==============================] - 6s 120ms/step - loss: -40.0017 - accuracy: 0.7611 - val_loss: -7.7479 - val_accuracy: 0.7248

'''
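
Note that the training loss above is negative, which signals a mismatch between labels and loss: airline_sentiment has three classes (codes 0, 1, 2), while BinaryCrossentropy expects targets in [0, 1]. A sketch of a head that fits the three-class labels (not the setup used above, just an alternative under the same data pipeline):

# Alternative 3-class head: one logit per sentiment class, sparse categorical crossentropy
model3 = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size+1, 64),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(3)
])
model3.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
               optimizer=tf.keras.optimizers.Adam(1e-4),
               metrics=['accuracy'])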
