[Graph Neural Networks] ChebyNet: Approximating the Graph Convolution Kernel with Chebyshev Polynomials
This post is part of my graph neural network study notes and covers ChebyNet, which approximates the graph convolution kernel with Chebyshev polynomials. Feel free to discuss with me in the comments.
ChebyNet Overview
See [Graph Convolutional Networks].
ChebyNet Implementation
First, build the graph Laplacian from the adjacency matrix and normalize it. The common variants are:
$$
\left\{
\begin{array}{l}
L = D - A \\
L^{sym} = D^{-1/2} L D^{-1/2} \\
L^{rw} = D^{-1} L
\end{array}
\right.
$$
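For intuition, here is a minimal NumPy sketch (separate from the tf_geometric implementation) that builds the three Laplacians above for a toy 3-node path graph; it assumes every node has at least one edge so that $D$ is invertible:

import numpy as np

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])              # adjacency matrix of a 3-node path graph
D = np.diag(A.sum(axis=1))                # degree matrix

L = D - A                                 # combinatorial Laplacian L = D - A
D_inv_sqrt = np.diag(1.0 / np.sqrt(np.diag(D)))
L_sym = D_inv_sqrt @ L @ D_inv_sqrt       # symmetric normalization D^{-1/2} L D^{-1/2}
L_rw = np.diag(1.0 / np.diag(D)) @ L      # random-walk normalization D^{-1} L

print(L_sym)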
From the chosen normalized Laplacian, compute
$$\hat{L} = \frac{2}{\lambda_{max}} L - I_N$$
This re-scales the eigenvalue spectrum of the Laplacian into the interval $[-1, 1]$, the domain on which the Chebyshev polynomials are defined:
num_nodes = x.shape[0]
norm_edge_index, norm_edge_weight = chebnet_norm_edge(edge_index, num_nodes, edge_weight, lambda_max, normalization_type=normalization_type)
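As a dense-matrix illustration (chebnet_norm_edge itself works on the sparse edge_index/edge_weight representation), the re-scaling can be sketched in NumPy like this, reusing the 3-node toy graph from the earlier sketch:

import numpy as np

A = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])
D_inv_sqrt = np.diag(1.0 / np.sqrt(A.sum(axis=1)))
L_sym = np.eye(3) - D_inv_sqrt @ A @ D_inv_sqrt   # normalized Laplacian of the toy graph

lambda_max = np.linalg.eigvalsh(L_sym).max()      # largest eigenvalue
L_hat = (2.0 / lambda_max) * L_sym - np.eye(3)    # hat(L) = 2/lambda_max * L - I_N

print(np.linalg.eigvalsh(L_hat))                  # the eigenvalues now lie in [-1, 1]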
Higher-order terms are computed with the recursive definition of the Chebyshev polynomials (which saves a large amount of computation), and the model output is the polynomial sum $y=\sigma\left(\sum_{k=0}^{K}\theta_k T_k(\hat{L})\,x\right)$, which is then used to compute the loss or evaluate the model:
T0_x = x
T1_x = x
out = tf.matmul(T0_x, kernel[0])  # T_0(L^) x = x, multiplied by the first kernel
if K > 1:
    T1_x = aggregate_neighbors(x, norm_edge_index, norm_edge_weight,
                               gcn_mapper, sum_reducer, identity_updater)  # T_1(L^) x = L^ x
    out += tf.matmul(T1_x, kernel[1])
# Chebyshev recurrence: T_k(L^) x = 2 L^ T_{k-1}(L^) x - T_{k-2}(L^) x
for i in range(2, K):
    T2_x = aggregate_neighbors(T1_x, norm_edge_index, norm_edge_weight,
                               gcn_mapper, sum_reducer, identity_updater)  # L^ T_{k-1}(L^) x
    T2_x = 2.0 * T2_x - T0_x
    out += tf.matmul(T2_x, kernel[i])
    T0_x, T1_x = T1_x, T2_x
if bias is not None:
    out += bias
if activation is not None:
    out = activation(out)
return out
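Since the loop relies on the recurrence $T_k(\hat{L})x = 2\hat{L}\,T_{k-1}(\hat{L})x - T_{k-2}(\hat{L})x$, a quick NumPy check (independent of tf_geometric, using a random symmetric matrix as a stand-in for $\hat{L}$) confirms that it matches the explicit polynomial $T_2(\hat{L}) = 2\hat{L}^2 - I$:

import numpy as np

rng = np.random.default_rng(0)
L_hat = rng.standard_normal((4, 4))
L_hat = (L_hat + L_hat.T) / 2            # any symmetric "Laplacian-like" matrix works for the check
x = rng.standard_normal((4, 2))          # toy node features

T0, T1 = x, L_hat @ x                    # T_0(L^) x = x,  T_1(L^) x = L^ x
T2 = 2.0 * (L_hat @ T1) - T0             # recurrence: T_2(L^) x = 2 L^ T_1(L^) x - T_0(L^) x

T2_explicit = (2.0 * L_hat @ L_hat - np.eye(4)) @ x
print(np.allclose(T2, T2_explicit))      # True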
Model Construction
The core library used in this tutorial is tf_geometric, which handles graph data loading, preprocessing, and building the graph neural network. The ChebNet layer itself was walked through in detail above; LaplacianMaxEigenvalue computes the largest eigenvalue of the graph Laplacian, and keras.metrics.Accuracy is used later to evaluate model performance:
import os
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tf_geometric.layers.conv.chebnet import chebNet
from tf_geometric.datasets.cora import CoraDataset
from tf_geometric.utils.graph_utils import LaplacianMaxEigenvalue
from tqdm import tqdm
Load the Cora dataset through the graph data interface that ships with tf_geometric:
# Load the Cora dataset
graph, (train_index, valid_index, test_index) = CoraDataset().load_data()
Get the largest eigenvalue of the graph Laplacian:
# Get lambda_max
graph_lambda_max = LaplacianMaxEigenvalue(graph.x, graph.edge_index, graph.edge_weight)
Define the model, and add a Dropout layer from keras.layers to randomly disable neurons and reduce overfitting. Because the Dropout layer behaves differently during training and inference, the training argument controls whether Dropout is active:
model = chebNet(64, K=3, lambda_max=graph_lambda_max())
fc = tf.keras.Sequential([
    keras.layers.Dropout(0.5),  # Dropout randomly disables neurons to reduce overfitting
    keras.layers.Dense(num_classes)
])
def forward(graph, training=False):
    h = model([graph.x, graph.edge_index, graph.edge_weight])
    h = fc(h, training=training)  # the training flag controls whether Dropout is active
    return h
ChebyNet Training
Training this model is essentially the same as training any other TensorFlow-based model: define an optimizer, compute the loss and gradients, back-propagate, and then compute the accuracy on the validation and test sets:
# Define the optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)

best_test_acc = tmp_valid_acc = 0
for step in tqdm(range(1, 101)):
    with tf.GradientTape() as tape:
        # Forward pass
        logits = forward(graph, training=True)
        # Compute the loss
        loss = compute_loss(logits, train_index, tape.watched_variables())

    vars = tape.watched_variables()
    grads = tape.gradient(loss, vars)            # compute gradients
    optimizer.apply_gradients(zip(grads, vars))  # apply the gradient update

    valid_acc = evaluate(valid_index)            # accuracy on the validation set
    test_acc = evaluate(test_index)              # accuracy on the test set
    if test_acc > best_test_acc:
        best_test_acc = test_acc
        tmp_valid_acc = valid_acc

    print("step = {}\tloss = {}\tvalid_acc = {}\tbest_test_acc = {}".format(step, loss, tmp_valid_acc, best_test_acc))
The loss is computed with a cross-entropy loss function. Note that loading the Cora dataset returns the entire graph together with the corresponding train_index, valid_index, and test_index. ChebyNet takes the whole graph as input during training, and the training loss is computed over the nodes selected by train_index, so the mask_index passed in here is train_index. Since this is a multi-class task, the node labels are converted to one-hot vectors so that their dimensions match the model output. Because graph neural models overfit easily on small datasets, $L_2$ regularization is used to mitigate overfitting:
def compute_loss(logits, mask_index, vars):
    masked_logits = tf.gather(logits, mask_index)   # forward-pass predictions for the training nodes
    masked_labels = tf.gather(graph.y, mask_index)  # ground-truth labels for the training nodes
    losses = tf.nn.softmax_cross_entropy_with_logits(
        logits=masked_logits,                                 # predictions
        labels=tf.one_hot(masked_labels, depth=num_classes)   # ground truth as one-hot vectors
    )
    # L_2 regularization to mitigate overfitting
    kernel_vals = [var for var in vars if "kernel" in var.name]
    l2_losses = [tf.nn.l2_loss(kernel_var) for kernel_var in kernel_vals]
    # reduce_mean averages the losses; tf.add_n sums the list element-wise
    return tf.reduce_mean(losses) + tf.add_n(l2_losses) * 5e-4
ChebyNet Evaluation
To evaluate the model, simply pass in valid_index or test_index. tf.gather extracts the model's predictions and the ground-truth labels for the validation or test set, and the accuracy is computed with Keras's built-in keras.metrics.Accuracy:
def evaluate(mask):
    logits = forward(graph)                       # forward pass over the whole graph
    logits = tf.nn.log_softmax(logits, axis=-1)   # log-softmax over the logits
    masked_logits = tf.gather(logits, mask)       # predictions for the masked nodes
    masked_labels = tf.gather(graph.y, mask)      # ground-truth labels
    # index of the largest value along the class dimension = predicted class
    y_pred = tf.argmax(masked_logits, axis=-1, output_type=tf.int32)
    accuracy_m = keras.metrics.Accuracy()
    accuracy_m.update_state(masked_labels, y_pred)
    return accuracy_m.result().numpy()            # return the accuracy as a NumPy value
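As a quick sanity check on the metric (plain Keras API, unrelated to the graph code), keras.metrics.Accuracy simply counts the fraction of matching label/prediction pairs:

from tensorflow import keras

m = keras.metrics.Accuracy()
m.update_state([0, 1, 2, 3], [0, 1, 2, 2])  # three of four predictions match
print(m.result().numpy())                   # 0.75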
Results
0%| | 0/100 [00:00<?, ?it/s]step = 1 loss = 1.9817407131195068 valid_acc = 0.7139999866485596 best_test_acc = 0.7089999914169312
2%|▏ | 2/100 [00:01<00:55, 1.76it/s]step = 2 loss = 1.6069653034210205 valid_acc = 0.75 best_test_acc = 0.7409999966621399
step = 3 loss = 1.2625869512557983 valid_acc = 0.7720000147819519 best_test_acc = 0.7699999809265137
4%|▍ | 4/100 [00:01<00:48, 1.98it/s]step = 4 loss = 0.9443040490150452 valid_acc = 0.7760000228881836 best_test_acc = 0.7749999761581421
5%|▌ | 5/100 [00:02<00:46, 2.06it/s]step = 5 loss = 0.7023431062698364 valid_acc = 0.7760000228881836 best_test_acc = 0.7770000100135803
...
96 loss = 0.0799005851149559 valid_acc = 0.7940000295639038 best_test_acc = 0.8080000281333923
96%|█████████▌| 96/100 [00:43<00:01, 2.31it/s]step = 97 loss = 0.0768655389547348 valid_acc = 0.7940000295639038 best_test_acc = 0.8080000281333923
97%|█████████▋| 97/100 [00:43<00:01, 2.33it/s]step = 98 loss = 0.0834992527961731 valid_acc = 0.7940000295639038 best_test_acc = 0.8080000281333923
99%|█████████▉| 99/100 [00:44<00:00, 2.34it/s]step = 99 loss = 0.07315651327371597 valid_acc = 0.7940000295639038 best_test_acc = 0.8080000281333923
100%|██████████| 100/100 [00:44<00:00, 2.23it/s]
step = 100 loss = 0.07698118686676025 valid_acc = 0.7940000295639038 best_test_acc = 0.8080000281333923
The full code is available in [demo_chebynet.py].