Fine-tuning with hooks in TensorFlow Estimator

Source: 程序员文章站, 2022-10-11 20:42:16


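There are two ways to implement fine-tuning.

The first is to assign the pretrained values directly inside model_fn, once the model has been defined: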

 import tensorflow as tf  # TensorFlow 1.x API

 def model_fn(features, labels, mode, params):
     # ... build the model ...
     # Fine-tune: warm-start only if a pretrained checkpoint is given and
     # model_dir does not yet hold a checkpoint from this training run.
     if params.checkpoint_path and (not tf.train.latest_checkpoint(params.model_dir)):
         checkpoint_path = None
         if tf.gfile.IsDirectory(params.checkpoint_path):
             checkpoint_path = tf.train.latest_checkpoint(params.checkpoint_path)
         else:
             checkpoint_path = params.checkpoint_path

         tf.train.init_from_checkpoint(
             ckpt_dir_or_file=checkpoint_path,
             assignment_map={params.checkpoint_scope: params.checkpoint_scope}  # e.g. 'OptimizeLoss/': 'OptimizeLoss/'
         )
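To make the scope-to-scope form of assignment_map more concrete, here is a small pure-Python sketch of how variable names in the current graph are paired with tensor names in the checkpoint. It does not call TensorFlow. The helper map_scope and all variable names are hypothetical, and the matching rule reflects my understanding of how init_from_checkpoint treats keys ending in '/', so treat it as an illustration rather than the library's implementation.

```python
def map_scope(ckpt_names, graph_names, ckpt_scope, graph_scope):
    """Pair graph variables under graph_scope with checkpoint tensors under ckpt_scope.

    Hypothetical helper: it mimics the name matching only, no tensors are loaded.
    """
    ckpt_names = set(ckpt_names)
    pairs = {}
    for name in graph_names:
        if not name.startswith(graph_scope):
            continue  # variables outside the scope keep their normal initializer
        candidate = ckpt_scope + name[len(graph_scope):]
        if candidate not in ckpt_names:
            raise ValueError("tensor %s not found in checkpoint" % candidate)
        pairs[name] = candidate
    return pairs


# Made-up names for illustration.
ckpt = ["encoder/conv1/weights", "encoder/conv1/biases", "logits/weights"]
graph = ["encoder/conv1/weights", "encoder/conv1/biases", "new_head/weights"]
restored = map_scope(ckpt, graph, "encoder/", "encoder/")
print(sorted(restored))
```

In this sketch only the variables under 'encoder/' are paired with checkpoint tensors, while 'new_head/weights' is left alone. That is the behaviour fine-tuning usually wants for a freshly added output layer.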

The second is to use hooks.

Hooks can be passed through the train_monitors parameter when constructing tf.contrib.learn.Experiment:

 # Define the experiment
 experiment = tf.contrib.learn.Experiment(
     estimator=estimator,  # Estimator
     train_input_fn=train_input_fn,  # First-class function
     eval_input_fn=eval_input_fn,  # First-class function
     train_steps=params.train_steps,  # Minibatch steps
     min_eval_frequency=params.eval_min_frequency,  # Eval frequency
     # train_monitors=[],  # Hooks for training
     # eval_hooks=[eval_input_hook],  # Hooks for evaluation
     eval_steps=params.eval_steps  # Use evaluation feeder until it is empty
 )

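They can also be passed through the training_chief_hooks parameter when constructing tf.estimator.EstimatorSpec.

Personally, though, I think it is better to define hooks in the Estimator, so that the Experiment only has to control how the experiment runs (number of training steps, number of evaluations, and so on).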

def model_fn(features, labels, mode, params):

    # ....

    return tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train_op,
        eval_metric_ops=eval_metric_ops,
        # scaffold=get_scaffold(),
        # training_chief_hooks=None
    )

A brief aside on what the tf.estimator.EstimatorSpec object does. It describes every aspect of a model, including:

The current mode:

mode: A ModeKeys. Specifies if this is training, evaluation or prediction.

The computation graph:

predictions: Predictions Tensor or dict of Tensor.

loss: Training loss Tensor. Must be either scalar, or with shape [1].

train_op: Op for the training step.

eval_metric_ops: Dict of metric results keyed by name. The values of the dict are the results of calling a metric function, namely a (metric_tensor, update_op) tuple. metric_tensor should be evaluated without any impact on state (typically it is a pure computation result based on variables). For example, it should not trigger the update_op or require any input fetching.

The export strategy:

export_outputs: Describes the output signatures to be exported to SavedModel and used during serving. A dict {name: output} where:

name: An arbitrary name for this output.

output: An ExportOutput object such as ClassificationOutput, RegressionOutput, or PredictOutput. Single-headed models only need to specify one entry in this dictionary. Multi-headed models should specify one entry for each head, one of which must be named using signature_constants.DEFAULT_SERVING_SIGNATURE_DEF_KEY.

Chief hooks, which run only on the chief during training, such as the model-saving hook CheckpointSaverHook or hooks that restore a model:

training_chief_hooks: Iterable of tf.train.SessionRunHook objects to run on the chief worker during training.

Worker hooks, which monitor training, such as NanTensorHook and LoggingTensorHook:

training_hooks: Iterable of tf.train.SessionRunHook objects to run on all workers during training.

Initialization and the Saver:

scaffold: A tf.train.Scaffold object that can be used to set initialization, saver, and more to be used in training.

Evaluation hooks:

evaluation_hooks: Iterable of tf.train.SessionRunHook objects to run during evaluation.
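All of the hook parameters above take tf.train.SessionRunHook objects, so it helps to know the order in which their callbacks fire. The following is a pure-Python simulation, not TensorFlow code. RecordingHook and simulate_training are hypothetical stand-ins that copy only the method names and the call order I would expect from a monitored training session.

```python
class RecordingHook(object):
    """Stand-in with the same method names as tf.train.SessionRunHook."""

    def __init__(self):
        self.calls = []

    def begin(self):
        self.calls.append("begin")

    def after_create_session(self, session, coord):
        self.calls.append("after_create_session")

    def before_run(self, run_context):
        self.calls.append("before_run")

    def after_run(self, run_context, run_values):
        self.calls.append("after_run")

    def end(self, session):
        self.calls.append("end")


def simulate_training(hooks, num_steps):
    """Hypothetical driver that mimics the callback order only."""
    for h in hooks:
        h.begin()  # the graph can still be modified at this point
    session = object()  # placeholder for the real session
    for h in hooks:
        h.after_create_session(session, None)  # the graph is finalized by now
    for _ in range(num_steps):
        for h in hooks:
            h.before_run(None)
        # session.run(train_op) would happen here
        for h in hooks:
            h.after_run(None, None)
    for h in hooks:
        h.end(session)


hook = RecordingHook()
simulate_training([hook], num_steps=2)
print(hook.calls)
```

The ordering matters for fine-tuning: anything that adds ops to the graph, such as a Saver, has to be created in begin, while the actual restore needs a live session and so belongs in after_create_session.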

The custom hook looks like this:

import os

import tensorflow as tf  # TensorFlow 1.x API

slim = tf.contrib.slim


class RestoreCheckpointHook(tf.train.SessionRunHook):
    def __init__(self,
                 checkpoint_path,
                 exclude_scope_patterns,
                 include_scope_patterns
                 ):
        tf.logging.info("Create RestoreCheckpointHook.")
        # super(RestoreCheckpointHook, self).__init__()
        self.checkpoint_path = checkpoint_path

        # Comma-separated pattern strings become lists; an empty value means "no filter".
        self.exclude_scope_patterns = None if (not exclude_scope_patterns) else exclude_scope_patterns.split(',')
        self.include_scope_patterns = None if (not include_scope_patterns) else include_scope_patterns.split(',')

    def begin(self):
        # You can add ops to the graph here.
        print('before starting the session.')

        # 1. Create saver

        # exclusions = []
        # if self.checkpoint_exclude_scopes:
        #     exclusions = [scope.strip()
        #                   for scope in self.checkpoint_exclude_scopes.split(',')]
        #
        # variables_to_restore = []
        # for var in slim.get_model_variables():  # tf.global_variables():
        #     excluded = False
        #     for exclusion in exclusions:
        #         if var.op.name.startswith(exclusion):
        #             excluded = True
        #             break
        #     if not excluded:
        #         variables_to_restore.append(var)
        # inclusions
        # [var for var in tf.trainable_variables() if var.op.name.startswith('InceptionResnetV1')]

        variables_to_restore = tf.contrib.framework.filter_variables(
            slim.get_model_variables(),
            include_patterns=self.include_scope_patterns,  # ['conv'],
            exclude_patterns=self.exclude_scope_patterns,  # ['biases', 'logits'],

            # If True (default), performs re.search to find matches
            # (i.e. pattern can match any substring of the variable name).
            # If False, performs re.match (i.e. regexp should match from the beginning of the variable name).
            reg_search=True
        )
        self.saver = tf.train.Saver(variables_to_restore)

    def after_create_session(self, session, coord):
        # When this is called, the graph is finalized and
        # ops can no longer be added to the graph.

        print('session created.')

        tf.logging.info('fine-tuning from %s' % self.checkpoint_path)
        self.saver.restore(session, os.path.expanduser(self.checkpoint_path))
        tf.logging.info('end fine-tuning from %s' % self.checkpoint_path)

    def before_run(self, run_context):
        # print('before calling session.run().')
        return None  # SessionRunArgs(self.your_tensor)

    def after_run(self, run_context, run_values):
        # print('done running one step. the value of my tensor: %s', run_values.results)
        # if you-need-to-stop-loop:
        #     run_context.request_stop()
        pass

    def end(self, session):
        # print('done with the session.')
        pass
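The include/exclude patterns passed to the hook decide which variables get restored. As a rough illustration of that filtering, here is a string-only sketch built on the standard re module. filter_names and split_patterns are hypothetical helpers that operate on plain names rather than TensorFlow variables, and the variable names are made up.

```python
import re


def split_patterns(patterns):
    """Turn 'a,b' into ['a', 'b']; an empty string or None becomes None (no filtering)."""
    return None if not patterns else patterns.split(',')


def filter_names(names, include_patterns=None, exclude_patterns=None, reg_search=True):
    """Hypothetical string-only analogue of the include/exclude filtering."""
    # re.search matches anywhere in the name, re.match only at its start.
    matcher = re.search if reg_search else re.match
    if include_patterns is not None:
        names = [n for n in names if any(matcher(p, n) for p in include_patterns)]
    if exclude_patterns is not None:
        names = [n for n in names if not any(matcher(p, n) for p in exclude_patterns)]
    return list(names)


names = [
    "net/conv1/weights",
    "net/conv1/biases",
    "net/logits/weights",
    "net/logits/biases",
]
kept = filter_names(names, split_patterns("conv"), split_patterns("biases,logits"))
print(kept)

# With reg_search=False the pattern must match from the start of the name.
anchored = filter_names(names, include_patterns=["conv"], reg_search=False)
print(anchored)
```

With substring matching, 'conv' selects the convolution variables and the exclude list then drops the biases. With matching anchored at the start, 'conv' selects nothing here because every name begins with 'net/'.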

