Tensor Clipping

程序员文章站 2022-03-07 09:58:42
Contents

Outline

  • clip_by_value

  • relu

  • clip_by_norm

  • gradient clipping

clip_by_value

import tensorflow as tf
a = tf.range(10)
a
<tf.Tensor: id=3, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>
# Lower bound: elements of a smaller than 2 become 2
tf.maximum(a, 2)
<tf.Tensor: id=6, shape=(10,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>
# Upper bound: elements of a larger than 8 become 8
tf.minimum(a, 8)
<tf.Tensor: id=9, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 8], dtype=int32)>
# Both bounds at once: restrict the elements of a to the interval [2, 8]
tf.clip_by_value(a, 2, 8)
<tf.Tensor: id=14, shape=(10,), dtype=int32, numpy=array([2, 2, 2, 3, 4, 5, 6, 7, 8, 8], dtype=int32)>
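
The three calls above are related: clipping to [2, 8] gives the same result as applying the lower bound with a maximum and then the upper bound with a minimum. A minimal pure-Python sketch of that relationship follows. The helper name clip_by_value_sketch is made up for illustration and is not part of TensorFlow; it works on plain lists rather than tensors.

```python
def clip_by_value_sketch(values, low, high):
    """Clamp every element to [low, high] (hypothetical helper, not TensorFlow API)."""
    # max(v, low) applies the lower bound, min(..., high) applies the upper bound.
    return [min(max(v, low), high) for v in values]


a = list(range(10))
clipped = clip_by_value_sketch(a, 2, 8)
print(clipped)  # [2, 2, 2, 3, 4, 5, 6, 7, 8, 8]
```

The printed list matches the values returned by tf.clip_by_value(a, 2, 8) in the transcript above.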

relu

a = a - 5
a
<tf.Tensor: id=17, shape=(10,), dtype=int32, numpy=array([-5, -4, -3, -2, -1,  0,  1,  2,  3,  4], dtype=int32)>
tf.nn.relu(a)
<tf.Tensor: id=19, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4], dtype=int32)>
tf.maximum(a, 0)
<tf.Tensor: id=22, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4], dtype=int32)>

clip_by_norm

  • Scaling by the norm does not change the gradient direction
a = tf.random.normal([2, 2], mean=10)
a
<tf.Tensor: id=35, shape=(2, 2), dtype=float32, numpy=
array([[ 8.630464, 10.737844],
       [ 9.764073, 10.382202]], dtype=float32)>
tf.norm(a)
<tf.Tensor: id=41, shape=(), dtype=float32, numpy=19.822044>
# Scale a proportionally so that its norm becomes 15
aa = tf.clip_by_norm(a, 15)
aa
<tf.Tensor: id=58, shape=(2, 2), dtype=float32, numpy=
array([[6.5309587, 8.125684 ],
       [7.388799 , 7.8565574]], dtype=float32)>
tf.norm(aa)
<tf.Tensor: id=64, shape=(), dtype=float32, numpy=15.0>
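
To see why the direction is preserved, it may help to write the computation out by hand: when the L2 norm exceeds the limit, every element is multiplied by the same factor clip_norm / norm. The sketch below is a pure-Python approximation of that idea, not TensorFlow's implementation. The helper names l2_norm and clip_by_norm_sketch are hypothetical, and the input values are copied from the transcript above.

```python
import math


def l2_norm(matrix):
    """L2 norm over all elements of a nested list (hypothetical helper)."""
    return math.sqrt(sum(x * x for row in matrix for x in row))


def clip_by_norm_sketch(matrix, clip_norm):
    """Rescale matrix so its L2 norm is at most clip_norm (illustrative only)."""
    norm = l2_norm(matrix)
    if norm <= clip_norm:
        # Already within the limit: return an unchanged copy.
        return [row[:] for row in matrix]
    # One shared factor for every element, so the direction is unchanged.
    scale = clip_norm / norm
    return [[x * scale for x in row] for row in matrix]


a = [[8.630464, 10.737844], [9.764073, 10.382202]]
aa = clip_by_norm_sketch(a, 15.0)
print(l2_norm(a), l2_norm(aa))
```

Because all elements share one scale factor, the ratio between any two elements of aa is the same as in a; only the length of the vector changes.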

gradient clipping

  • Gradient exploding or vanishing

  • Set lr=1

  • new_grads, total_norm = tf.clip_by_global_norm(grads, 25)

  • All gradient vectors are clipped together, but the direction of every gradient stays unchanged
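
The global-norm variant can be sketched the same way: a single norm is computed across all gradients in the list, and every gradient is multiplied by clip_norm / max(global_norm, clip_norm). The pure-Python code below is an illustration under simplifying assumptions, not TensorFlow's implementation: each gradient is a flat list, the helper name clip_by_global_norm_sketch is hypothetical, and the gradient values are made up.

```python
import math


def clip_by_global_norm_sketch(grads, clip_norm):
    """Clip a list of flat gradient lists by their joint L2 norm (illustrative only)."""
    # One norm over every element of every gradient.
    global_norm = math.sqrt(sum(x * x for g in grads for x in g))
    # The factor is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[x * scale for x in g] for g in grads]
    return clipped, global_norm


# Made-up gradients for two variables; their global norm is 130.
grads = [[30.0, 40.0], [0.0, 120.0]]
new_grads, total_norm = clip_by_global_norm_sketch(grads, 25.0)
print(total_norm)  # 130.0
print(new_grads)
```

As with tf.clip_by_global_norm, the second return value is the norm before clipping. Since every gradient is scaled by the same factor, the relative sizes of the per-variable updates are kept, which is the point made in the last bullet.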