Eager execution in TensorFlow

程序员文章站 2024-01-19 12:45:04

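This article draws mainly on Stanford's CS20SI course; interested readers can find the lecture notes via the CS20 link.

The TensorFlow we usually use today is declarative. This means that everything in a graph must be declared up front, and only then can the graph be run.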
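The "declare first, run later" pattern described above can be sketched in plain Python. This is only a toy model: `Node`, `placeholder`, `mul`, and `run` are hypothetical names chosen for illustration and are not TensorFlow's API.

```python
# Toy illustration only: these names are hypothetical, not TensorFlow's API.

class Node:
    """A deferred computation: stores what to compute, not the result."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "placeholder" or "mul"
        self.inputs = inputs  # upstream Node objects
        self.name = name      # only used by placeholders

    def __repr__(self):
        return "Node(op=%r)" % self.op


def placeholder(name):
    return Node("placeholder", name=name)


def mul(a, b):
    return Node("mul", inputs=(a, b))


def run(node, feed):
    """Evaluate a node only now, once concrete inputs are supplied."""
    if node.op == "placeholder":
        return feed[node.name]
    if node.op == "mul":
        left, right = (run(n, feed) for n in node.inputs)
        return left * right
    raise ValueError("unknown op: %s" % node.op)


# Step 1: declare the whole computation. Nothing is computed yet.
x = placeholder("x")
m = mul(x, x)
declared = repr(m)

# Step 2: run it with a concrete value.
result = run(m, {"x": 2.0})
print(declared, result)  # Node(op='mul') 4.0
```

Printing `m` shows a description of the computation rather than a value; the number only appears once `run` is called with a feed.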

Graphs are...

Optimizable

-automatic buffer reuse

-constant folding

-inter-op parallelism

-automatic trade-off between compute and memory

Deployable

-the graph is an intermediate representation for a model.

Rewritable

-experiment with automatic device placement or quantization.
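Constant folding, one of the optimizations listed above, is possible because the whole graph is known before it runs. A minimal sketch, assuming a made-up expression format (a number, a variable name, or an `(op, left, right)` tuple) that is not TensorFlow's graph representation:

```python
# Hypothetical mini expression format, not TensorFlow's graph representation:
# an expression is a number, a variable name (str), or a tuple (op, left, right).
import operator

OPS = {"add": operator.add, "mul": operator.mul}


def fold_constants(expr):
    """Replace every sub-expression whose inputs are all constants by its value."""
    if not isinstance(expr, tuple):
        return expr  # a constant or a variable name: nothing to fold
    op, left, right = expr
    left, right = fold_constants(left), fold_constants(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)  # computed once, before the graph is run
    return (op, left, right)


# x * (2 + 3): the constant part is folded before anything is run.
graph = ("mul", "x", ("add", 2, 3))
folded = fold_constants(graph)
print(folded)  # ('mul', 'x', 5)
```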

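But graphs are also...

Difficult to debug

-errors surface only long after the graph has been constructed.

-graph execution cannot be debugged with pdb or print statements.

Un-Pythonic

-writing a TensorFlow program is an exercise in metaprogramming.

-TensorFlow control flow differs considerably from Python's.

-graph construction cannot easily be mixed with ordinary data structures.

To address these problems, the TensorFlow developers introduced eager execution.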

"A NumPy-like library for numerical computation with support for GPU acceleration and automatic differentiation, and a flexible platform for machine learning research and experimentation."

----the eager execution user guide

A demo that enables eager execution:

$ python
>>> import tensorflow as tf  # TensorFlow 1.x, version >= 1.5
>>> import tensorflow.contrib.eager as tfe
>>> tfe.enable_eager_execution()

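Key advantages:

Compatible with Python debugging tools

You can finally use pdb.set_trace()!

Provides immediate error feedback
Lets you use Python data structures
Lets you use Python control flow, such as if statements, for loops, and recursion.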

Eager execution makes your code more concise

You no longer need to worry about...

1. placeholders

2. sessions

3. control dependencies

4. lazy loading

5. {name, variable, op} scopes

Some comparisons

Before eager execution:

Here we multiply a matrix by itself.

x = tf.placeholder(tf.float32, shape=[1, 1])
m = tf.matmul(x, x)

print(m)
# Tensor("MatMul:0", shape=(1, 1), dtype=float32)

with tf.Session() as sess:
  m_out = sess.run(m, feed_dict={x: [[2.]]})
print(m_out)
# [[4.]]

With eager execution:

x = [[2.]]  # No need for placeholders!
m = tf.matmul(x, x)

print(m)  # No sessions!
# tf.Tensor([[4.]], shape=(1, 1), dtype=float32)

With eager execution, three lines of code are enough to do the same job. There is no placeholder and no session, which greatly simplifies the code.

On lazy loading:

x = tf.random_uniform([2, 2])

with tf.Session() as sess:
  for i in range(x.shape[0]):
    for j in range(x.shape[1]):
      print(sess.run(x[i, j]))

In this code, every iteration adds a new op to the graph. With eager execution there is no graph and no need to run an op repeatedly, so the code becomes simpler, as follows:

x = tf.random_uniform([2, 2])

for i in range(x.shape[0]):
  for j in range(x.shape[1]):
    print(x[i, j])
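The graph growth described above for lazy loading can be made visible with a toy model. `Graph` and `slice_op` below are hypothetical stand-ins, not TensorFlow APIs, and the sketch assumes the loop over the tensor is repeated several times, as it would be in a training loop.

```python
# Toy model only: `Graph` and `slice_op` are hypothetical names, not TensorFlow APIs.

class Graph:
    def __init__(self):
        self.ops = []

    def add_op(self, description):
        self.ops.append(description)
        return len(self.ops) - 1  # handle of the newly added op


def slice_op(graph, i, j):
    """Stand-in for writing x[i, j] in graph mode: it defines a new op."""
    return graph.add_op("slice[%d, %d]" % (i, j))


# Lazy loading: the op is defined inside the loop, so the graph keeps growing.
lazy = Graph()
for _ in range(3):            # e.g. three passes over the same 2x2 tensor
    for i in range(2):
        for j in range(2):
            slice_op(lazy, i, j)

# Defining each op once up front and reusing its handle avoids the growth.
reused = Graph()
handles = {(i, j): slice_op(reused, i, j) for i in range(2) for j in range(2)}
for _ in range(3):
    for i in range(2):
        for j in range(2):
            handles[(i, j)]   # reuse; nothing new is added

print(len(lazy.ops), len(reused.ops))  # 12 4
```

Under eager execution neither bookkeeping strategy is needed, because no graph is being built.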

Here is one more small trick: Tensors can be treated much like NumPy arrays. A short example:

import numpy as np

x = tf.constant([1.0, 2.0, 3.0])

# Tensors are backed by NumPy arrays
assert type(x.numpy()) == np.ndarray
squared = np.square(x) # Tensors are compatible with NumPy functions

# Tensors are iterable!
for i in x:
  print(i)

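Gradients

Automatic differentiation is built into eager execution.

Under the hood...

Ops are recorded on a tape
The tape is played back to compute gradients (this is reverse-mode differentiation, i.e. backpropagation).

When eager execution is enabled, the ops that are executed get recorded on a tape, which can then be played back to compute gradients. If you are familiar with the autograd package, the API is very similar.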
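The tape idea described above can be sketched in a few lines of plain Python. This is a minimal illustration of tape-based reverse-mode differentiation, not TensorFlow's implementation; `Value`, `Tape`, and its methods are hypothetical names.

```python
# Minimal sketch of tape-based reverse-mode differentiation.
# `Value` and `Tape` are hypothetical names for illustration.

class Value:
    def __init__(self, data):
        self.data = data
        self.grad = 0.0


class Tape:
    def __init__(self):
        self.records = []  # (output, [(input, local_derivative), ...])

    def mul(self, a, b):
        out = Value(a.data * b.data)
        # d(out)/da = b and d(out)/db = a
        self.records.append((out, [(a, b.data), (b, a.data)]))
        return out

    def sub(self, a, b):
        out = Value(a.data - b.data)
        self.records.append((out, [(a, 1.0), (b, -1.0)]))
        return out

    def gradient(self, target):
        """Play the tape backwards, applying the chain rule at each record."""
        target.grad = 1.0
        for out, inputs in reversed(self.records):
            for value, local in inputs:
                value.grad += out.grad * local


tape = Tape()
x = Value(3.0)
y = tape.mul(x, x)      # forward pass: executes now and is recorded
tape.gradient(y)        # backward pass: replay in reverse
print(y.data, x.grad)   # 9.0 6.0
```

Because records are appended in execution order, walking them in reverse visits every op after all of its consumers, which is what the chain rule requires.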

For example, using the tfe API:

def square(x):
  return x ** 2

grad = tfe.gradients_function(square)

print(square(3.))    # tf.Tensor(9., shape=(), dtype=float32)
print(grad(3.))      # [tf.Tensor(6., shape=(), dtype=float32)]

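Here, tfe.gradients_function() takes a Python function and returns a new function that computes its derivatives with respect to its arguments. Another example: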

x = tfe.Variable(2.0)
def loss(y):
  return (y - x ** 2) ** 2

grad = tfe.implicit_gradients(loss)

print(loss(7.))  # tf.Tensor(9., shape=(), dtype=float32)
print(grad(7.))  # [(<tf.Tensor: -24.0, shape=(), dtype=float32>,
                 #   <tf.Variable 'Variable:0' shape=()
                 #    dtype=float32, numpy=2.0>)]

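With eager execution, variables should be declared with tfe.Variable. Likewise, tfe.implicit_gradients() computes gradients with respect to the variables the function uses.

The following APIs can be used to compute gradients even when eager execution is not enabled.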
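The values 9 and -24 in the example above can be checked by hand without TensorFlow. The sketch below treats the variable `x` as an ordinary float and compares the analytic derivative with a central-difference estimate; the step size `h` is an arbitrary choice.

```python
def loss(x, y=7.0):
    return (y - x ** 2) ** 2


def central_difference(f, x, h=1e-5):
    """Numerical estimate of df/dx; h is an arbitrary small step."""
    return (f(x + h) - f(x - h)) / (2 * h)


def analytic_grad(x, y=7.0):
    # d/dx (y - x^2)^2 = 2 * (y - x^2) * (-2x)
    return -4.0 * x * (y - x ** 2)


print(loss(2.0))           # 9.0
print(analytic_grad(2.0))  # -24.0
estimate = central_difference(loss, 2.0)
print(estimate)            # close to -24, up to floating-point error
```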

tfe.gradients_function()
tfe.value_and_gradients_function()
tfe.implicit_gradients()
tfe.implicit_value_and_gradients()

Huber regression with eager execution

Compared with non-eager mode, there is not much difference.
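For readers unfamiliar with Huber regression, here is a minimal sketch in plain Python rather than TensorFlow. It is not the course's demo: the data, the threshold `delta=1.0`, the learning rate, and the step count are all assumptions made for illustration.

```python
def huber(residual, delta=1.0):
    """Quadratic near zero, linear in the tails; delta is the switch point."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * residual ** 2
    return delta * (a - 0.5 * delta)


def huber_grad(residual, delta=1.0):
    """Derivative of huber() with respect to the residual."""
    if abs(residual) <= delta:
        return residual
    return delta if residual > 0 else -delta


def fit(xs, ys, steps=2000, lr=0.05):
    """Fit y ~ w * x + b by gradient descent on the mean Huber loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grads = [huber_grad(w * x + b - y) for x, y in zip(xs, ys)]
        w -= lr * sum(g * x for g, x in zip(grads, xs)) / n
        b -= lr * sum(grads) / n
    return w, b


# Made-up data on the line y = 2x + 1, with the last point replaced by an outlier.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 30.0]
w, b = fit(xs, ys)
print(w, b)
```

The linear tails are the point of the Huber loss: the outlier contributes a bounded gradient, so it pulls the fit less than it would under squared error.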

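A collection of operations

TensorFlow = Operation Kernels + Execution

Graph construction: execute compositions of ops with Sessions.

Eager execution: execute compositions of ops with Python.

One way to think of TensorFlow is as a collection of operations (math, linear algebra, image processing, code for generating TensorBoard visualizations, and so on) plus a way of executing compositions of them. Sessions provide one way to execute these ops. Under eager execution, Python executes them directly.

The underlying operations are the same in both modes, so the APIs are largely the same as well.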

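In general, the TensorFlow APIs can be used whether or not eager execution is enabled. But when eager execution is enabled...

Prefer tfe.Variable for defining variables; it stays compatible with graph construction.
Manage your own variable storage; variable collections are not supported.
Use tf.contrib.summary
Use tfe.Iterator to iterate over datasets under eager execution.
Prefer object-oriented layers (e.g. tf.layers.Dense)

-functional layers (e.g. tf.layers.dense) only work when wrapped in tfe.make_template.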

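What if I like graphs?

Imperative to declarative and back.

The model only needs to be defined once.

-the same code can execute ops in one Python process and construct a graph in another.

Checkpoints are compatible.

-Train eagerly, checkpoint, load in a graph, or vice-versa.

Creating graphs under eager execution

-tfe.defun: "compile" a computation into a graph and then execute it.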
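One way to picture how the same Python code can either run op by op or be turned into a graph is tracing: call the function once on a stand-in object that records each op instead of computing it. The sketch below is a toy under that assumption; `Tracer`, `trace`, and `execute` are hypothetical names, and tfe.defun's real mechanism is more involved.

```python
# Toy tracer: `Tracer`, `trace`, and `execute` are hypothetical names,
# not part of TensorFlow.

class Tracer:
    """Stands in for a tensor while a function is being traced."""

    def __init__(self, graph, index):
        self.graph = graph
        self.index = index

    def _record(self, op, other):
        other_ref = other.index if isinstance(other, Tracer) else ("const", other)
        self.graph.append((op, self.index, other_ref))
        return Tracer(self.graph, len(self.graph) - 1)

    def __mul__(self, other):
        return self._record("mul", other)

    def __add__(self, other):
        return self._record("add", other)


def trace(fn):
    """Run fn once on a Tracer to capture its ops as a list (the 'graph')."""
    graph = [("input",)]
    out = fn(Tracer(graph, 0))
    return graph, out.index


def execute(graph, out_index, x):
    """Replay the captured graph on a concrete input."""
    values = []
    for node in graph:
        if node[0] == "input":
            values.append(x)
            continue
        op, left, right = node
        a = values[left]
        b = right[1] if isinstance(right, tuple) else values[right]
        values.append(a * b if op == "mul" else a + b)
    return values[out_index]


def model(x):          # ordinary Python, also runnable eagerly on plain numbers
    return x * x + 1


graph, out_index = trace(model)
print(model(3.0), execute(graph, out_index, 3.0))  # 10.0 10.0
```

The model is written once; calling it directly plays the role of eager execution, while `trace` plus `execute` plays the role of building and running a graph.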

So, when should I use eager execution?

If you are a researcher who wants a flexible framework, are developing a new machine learning model, or are new to TensorFlow, we recommend using eager execution.
