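Keras: A Detailed Guide to Using the Embedding Layer
I have recently been doing NLP work, using the word embeddings provided by the Embedding layer in Keras.
This article introduces the Embedding layer in Keras.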
Of the layer's parameters, the important ones are input_dim and output_dim, plus the optional input_length.
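To make the roles of these three parameters concrete, here is a minimal, dependency-free sketch of what an embedding lookup does. It is not the Keras implementation: the helper name `embedding_lookup` and the toy sizes are assumptions chosen only for illustration.

```python
import random


def embedding_lookup(table, batch):
    """Replace every integer index in `batch` with its row of `table`."""
    return [[table[idx] for idx in sequence] for sequence in batch]


input_dim = 5      # vocabulary size: valid indices are 0 .. input_dim - 1
output_dim = 3     # length of each word vector
input_length = 4   # number of indices in each input sequence

random.seed(0)
# The layer's only weight: a table of shape (input_dim, output_dim).
table = [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
         for _ in range(input_dim)]

batch = [[0, 1, 2, 3], [4, 4, 0, 1]]     # shape (batch, input_length)
output = embedding_lookup(table, batch)  # shape (batch, input_length, output_dim)

print(len(output), len(output[0]), len(output[0][0]))  # 2 4 3
```

In other words, input_dim and output_dim fix the shape of the weight table, while input_length only describes the shape of the incoming index sequences.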
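The initializer settings are summarized separately later on.
The demo uses pretrained word vectors (word2vec trained on a Baidu Baike corpus).
Demo of building the embedding matrix: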
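import numpy as np

# Assumption: set this to the vector size of your pretrained model.
embedding_dim = 300


def create_embedding(word_index, num_words, word2vec_model):
    # word_index: dictionary mapping each word to its integer index
    # num_words: dictionary length + 1
    # word2vec_model: the pretrained word-vector model
    embedding_matrix = np.zeros((num_words, embedding_dim))
    for word, i in word_index.items():
        try:
            embedding_matrix[i] = word2vec_model[word]
        except (KeyError, IndexError):
            # Word missing from the pretrained model, or index out of
            # range: leave the row as zeros.
            continue
    return embedding_matrix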
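The behaviour of this function is easiest to see on a tiny vocabulary. The sketch below restates the same idea without NumPy or gensim; `build_embedding_matrix` and the plain dict standing in for the word2vec model are hypothetical names used only for this illustration.

```python
def build_embedding_matrix(word_index, num_words, vectors, embedding_dim):
    """Row i holds the pretrained vector of the word whose index is i."""
    matrix = [[0.0] * embedding_dim for _ in range(num_words)]
    for word, i in word_index.items():
        # Words absent from the pretrained vectors keep their zero row.
        if i < num_words and word in vectors:
            matrix[i] = list(vectors[word])
    return matrix


word_index = {"keras": 1, "embedding": 2, "unseen": 3}
# Stand-in for a loaded word2vec model (assumption: 2-dimensional vectors).
pretrained = {"keras": [0.1, 0.2], "embedding": [0.3, 0.4]}

num_words = len(word_index) + 1  # +1 because index 0 is not assigned to any word
matrix = build_embedding_matrix(word_index, num_words, pretrained, 2)

for row in matrix:
    print(row)
```

Row 0 (no word) and row 3 (a word the pretrained vectors do not cover) stay all zeros; every other row is a copy of the pretrained vector.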
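Loading the word-vector model:
import gensim


def pre_load_embedding_model(model_file):
    # Alternative for a full model saved with gensim's own save():
    # model = gensim.models.Word2Vec.load(model_file)
    # Alternative for vectors stored in the binary word2vec format:
    # model = gensim.models.KeyedVectors.load_word2vec_format(model_file, binary=True)
    model = gensim.models.KeyedVectors.load_word2vec_format(model_file)
    return model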
Setting up the Embedding layer in the model (note the arguments, the input fed from the Input layer, and the initialization method):
from keras.initializers import Constant
from keras.layers import Embedding, Input

embedding_matrix = create_embedding(word_index, num_words, word2vec_model)
embedding_layer = Embedding(num_words,
                            embedding_dim,
                            embeddings_initializer=Constant(embedding_matrix),
                            input_length=max_sequence_length,
                            trainable=False)
sequence_input = Input(shape=(max_sequence_length,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
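The snippet above assumes that `word_index` and integer sequences of length `max_sequence_length` already exist. The following is a rough sketch of how they could be produced by hand. It assumes whitespace tokenization, indices assigned by descending frequency starting at 1, and zero padding on the left; the helper names `build_word_index` and `to_padded_sequences` are hypothetical, and a real pipeline would normally use a proper tokenizer.

```python
from collections import Counter


def build_word_index(texts):
    """Map each word to an integer index, most frequent word first.

    Index 0 is deliberately left unused so it can serve as padding.
    """
    counts = Counter(word for text in texts for word in text.split())
    return {word: i
            for i, (word, _) in enumerate(counts.most_common(), start=1)}


def to_padded_sequences(texts, word_index, max_sequence_length):
    """Convert texts to index lists of exactly max_sequence_length items."""
    padded = []
    for text in texts:
        ids = [word_index[w] for w in text.split() if w in word_index]
        ids = ids[-max_sequence_length:]  # keep the last tokens if too long
        padded.append([0] * (max_sequence_length - len(ids)) + ids)
    return padded


texts = ["the cat sat", "the dog"]
word_index = build_word_index(texts)
num_words = len(word_index) + 1
sequences = to_padded_sequences(texts, word_index, max_sequence_length=4)

print(word_index)
print(sequences)
```

Each row of `sequences` has exactly `max_sequence_length` entries, which is the shape the Input layer above expects.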
Initialization settings for the Embedding layer
Two ways to set the initial values of a Keras Embedding
Randomly initialized Embedding
from keras.models import Sequential
from keras.layers import Embedding
import numpy as np

model = Sequential()
model.add(Embedding(1000, 64, input_length=10))
# The model will take as input an integer matrix of size (batch, input_length).
# The largest integer (i.e. word index) in the input should be no larger
# than 999 (vocabulary size).
# Now model.output_shape == (None, 10, 64), where None is the batch dimension.

input_array = np.random.randint(1000, size=(32, 10))

model.compile('rmsprop', 'mse')
output_array = model.predict(input_array)
print(output_array)
assert output_array.shape == (32, 10, 64)
Specifying the Embedding's initial values with the weights argument
import numpy as np
import keras

m = keras.models.Sequential()
# The initial weights can be given through the `weights` argument.
# The embedding lookup is not differentiable with respect to its integer
# inputs, so gradients stop at this layer. Placing an Embedding in the
# middle of a network is therefore pointless; it can only be the first layer.
# Note that the path from `weights` to `embeddings` is fairly involved,
# and that `weights` must be a list.
embedding = keras.layers.Embedding(input_dim=3, output_dim=2, input_length=1,
                                   weights=[np.arange(3 * 2).reshape((3, 2))],
                                   mask_zero=True)
m.add(embedding)  # add() automatically triggers the Embedding's build()
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
print(m.predict(np.array([1, 2, 2, 1, 2, 0])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))
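The example above passes mask_zero=True without comment. With that flag, index 0 is treated as a padding marker: the layer produces a boolean mask that is False wherever the input is 0, so that mask-aware layers further down can skip those positions. A consequence is that index 0 cannot stand for a real word, which is why the vocabulary size is usually given as the dictionary length plus one. The sketch below only illustrates the masking rule in plain Python; `compute_mask` here is a stand-in written for this example, not the Keras method.

```python
def compute_mask(batch):
    """True for real tokens, False for padding (index 0)."""
    return [[idx != 0 for idx in sequence] for sequence in batch]


batch = [[0, 0, 1, 4],
         [0, 1, 2, 3]]
mask = compute_mask(batch)

for row in mask:
    print(row)
# [False, False, True, True]
# [False, True, True, True]
```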
The second way to give the Embedding its initial values: use an initializer
import numpy as np
import keras

m = keras.models.Sequential()
# The embedding lookup is not differentiable with respect to its integer
# inputs, so gradients stop at this layer. Placing an Embedding in the
# middle of a network is therefore pointless; it can only be the first layer.
# This is the second way to set the Embedding's values: a Constant initializer
# instead of the `weights` argument.
embedding = keras.layers.Embedding(
    input_dim=3, output_dim=2, input_length=1,
    embeddings_initializer=keras.initializers.Constant(
        np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
m.add(embedding)
print(keras.backend.get_value(embedding.embeddings))
m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
print(m.predict(np.array([1, 2, 2, 1, 2])))
print(m.get_layer(index=0).get_weights())
print(keras.backend.get_value(embedding.embeddings))
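The tricky part is working out how weights actually ends up in the Embedding.embeddings tensor.
Embedding is a layer that inherits from Layer. Layer accepts a weights argument, which is a list whose elements are all NumPy arrays. When the Layer constructor runs, the weights argument is stored in the _initial_weights attribute.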
base_layer.py: the Layer class
if 'weights' in kwargs:
    self._initial_weights = kwargs['weights']
else:
    self._initial_weights = None
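When the Embedding layer is added to a model and connected to the previous layer, layer(previous_layer_output) is called, where layer is the Embedding instance. Embedding is a subclass of Layer and does not override __call__(); Layer implements __call__().
The parent class's Layer.__call__() invokes the subclass's call() method to compute the result.
So what ultimately runs is Layer.__call__(). Inside this method, the layer automatically checks whether it has already been built, using the boolean attribute self.built.
The Layer.__call__ function is very important.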
def __call__(self, inputs, **kwargs):
    """Wrapper around self.call(), for handling internal references.

    If a Keras tensor is passed:
        - We call self._add_inbound_node().
        - If necessary, we `build` the layer to match
            the _keras_shape of the input(s).
        - We update the _keras_shape of every input tensor with
            its new shape (obtained via self.compute_output_shape).
            This is done as part of _add_inbound_node().
        - We update the _keras_history of the output tensor(s)
            with the current layer.
            This is done as part of _add_inbound_node().

    # Arguments
        inputs: Can be a tensor or list/tuple of tensors.
        **kwargs: Additional keyword arguments to be passed to `call()`.

    # Returns
        Output of the layer's `call` method.

    # Raises
        ValueError: in case the layer is missing shape information
            for its `build` call.
    """
    if isinstance(inputs, list):
        inputs = inputs[:]
    with K.name_scope(self.name):
        # Handle layer building (weight creating, input spec locking).
        if not self.built:  # not built yet: run build() before calling call()
            # Raise exceptions in case the input is not compatible
            # with the input_spec specified in the layer constructor.
            self.assert_input_compatibility(inputs)

            # Collect input shapes to build layer.
            input_shapes = []
            for x_elem in to_list(inputs):
                if hasattr(x_elem, '_keras_shape'):
                    input_shapes.append(x_elem._keras_shape)
                elif hasattr(K, 'int_shape'):
                    input_shapes.append(K.int_shape(x_elem))
                else:
                    raise ValueError('You tried to call layer "' +
                                     self.name +
                                     '". This layer has no information'
                                     ' about its expected input shape, '
                                     'and thus cannot be built. '
                                     'You can build it manually via: '
                                     '`layer.build(batch_input_shape)`')
            self.build(unpack_singleton(input_shapes))
            # Somewhat redundant: self.build() has already set built to True.
            self.built = True

            # Load weights that were specified at layer instantiation.
            # If weights were passed, assign them to the variables; this
            # overwrites the values assigned in self.build() above.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)

        # Raise exceptions in case the input is not compatible
        # with the input_spec set at build time.
        self.assert_input_compatibility(inputs)

        # Handle mask propagation.
        previous_mask = _collect_previous_mask(inputs)
        user_kwargs = copy.copy(kwargs)
        if not is_all_none(previous_mask):
            # The previous layer generated a mask.
            if has_arg(self.call, 'mask'):
                if 'mask' not in kwargs:
                    # If mask is explicitly passed to __call__,
                    # we should override the default mask.
                    kwargs['mask'] = previous_mask
        # Handle automatic shape inference (only useful for Theano).
        input_shape = _collect_input_shape(inputs)

        # Actually call the layer,
        # collecting output(s), mask(s), and shape(s).
        output = self.call(inputs, **kwargs)
        output_mask = self.compute_mask(inputs, previous_mask)

        # If the layer returns tensors from its inputs, unmodified,
        # we copy them to avoid loss of tensor metadata.
        output_ls = to_list(output)
        inputs_ls = to_list(inputs)
        output_ls_copy = []
        for x in output_ls:
            if x in inputs_ls:
                x = K.identity(x)
            output_ls_copy.append(x)
        output = unpack_singleton(output_ls_copy)

        # Inferring the output shape is only relevant for Theano.
        if all([s is not None for s in to_list(input_shape)]):
            output_shape = self.compute_output_shape(input_shape)
        else:
            if isinstance(input_shape, list):
                output_shape = [None for _ in input_shape]
            else:
                output_shape = None

        if (not isinstance(output_mask, (list, tuple)) and
                len(output_ls) > 1):
            # Augment the mask to match the length of the output.
            output_mask = [output_mask] * len(output_ls)

        # Add an inbound node to the layer, so that it keeps track
        # of the call and of all new variables created during the call.
        # This also updates the layer history of the output tensor(s).
        # If the input tensor(s) had not previous Keras history,
        # this does nothing.
        self._add_inbound_node(input_tensors=inputs,
                               output_tensors=output,
                               input_masks=previous_mask,
                               output_masks=output_mask,
                               input_shapes=input_shape,
                               output_shapes=output_shape,
                               arguments=user_kwargs)

        # Apply activity regularizer if any:
        if (hasattr(self, 'activity_regularizer') and
                self.activity_regularizer is not None):
            with K.name_scope('activity_regularizer'):
                regularization_losses = [
                    self.activity_regularizer(x)
                    for x in to_list(output)]
            self.add_loss(regularization_losses,
                          inputs=to_list(inputs))
    return output
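Stripped of mask handling, shape inference and bookkeeping, the control flow of __call__ comes down to "build once, then delegate to call()". The toy classes below are a heavily simplified sketch of that pattern; `ToyLayer` and `Doubler` are hypothetical names and none of this is Keras code.

```python
class ToyLayer:
    """Minimal stand-in for a base layer with lazy building."""

    def __init__(self):
        self.built = False
        self.build_calls = 0  # only here to show how often build() runs

    def build(self, input_shape):
        self.build_calls += 1
        self.built = True

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        # Build on first use only, then hand over to the subclass's call().
        if not self.built:
            self.build((len(inputs),))
        return self.call(inputs)


class Doubler(ToyLayer):
    """Subclass that overrides call() but not __call__()."""

    def call(self, inputs):
        return [2 * x for x in inputs]


layer = Doubler()
first = layer([1, 2, 3])
second = layer([4, 5])
print(first, second, layer.build_calls)  # [2, 4, 6] [8, 10] 1
```

The subclass never touches __call__, yet its call() runs on every invocation and build() runs exactly once.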
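If the layer has not been built yet, the Embedding class's build() function is called automatically. Embedding.build() pays no attention to weights: it creates the embeddings variable from self.embeddings_initializer, which falls back to random initialization when no initializer was passed.
If an initializer was passed, the embeddings already hold their intended values after this step.
If both embeddings_initializer and weights are passed, the weights argument later overwrites Embedding.embeddings.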
embeddings.py: the build function of the Embedding class
def build(self, input_shape):
    self.embeddings = self.add_weight(
        shape=(self.input_dim, self.output_dim),
        initializer=self.embeddings_initializer,
        name='embeddings',
        regularizer=self.embeddings_regularizer,
        constraint=self.embeddings_constraint,
        dtype=self.dtype)
    self.built = True
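The order of operations described above (build() applies the initializer first, then the stored weights are written over the result) can be reproduced in a few lines. This is a toy model of the mechanism under simplified assumptions, not the Keras source; `ToyEmbedding`, `constant` and `random_uniform` are hypothetical helpers written for this illustration.

```python
import random


def random_uniform(input_dim, output_dim):
    """Fallback initializer used when none is given."""
    return [[random.uniform(-0.05, 0.05) for _ in range(output_dim)]
            for _ in range(input_dim)]


def constant(matrix):
    """Initializer that always returns a copy of `matrix`."""
    def init(input_dim, output_dim):
        return [list(row) for row in matrix]
    return init


class ToyEmbedding:
    def __init__(self, input_dim, output_dim,
                 embeddings_initializer=None, weights=None):
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.embeddings_initializer = embeddings_initializer or random_uniform
        self._initial_weights = weights  # only stored here, not applied yet
        self.embeddings = None
        self.built = False

    def build(self):
        # build() knows nothing about `weights`; it only runs the initializer.
        self.embeddings = self.embeddings_initializer(self.input_dim,
                                                      self.output_dim)
        self.built = True

    def set_weights(self, weights):
        self.embeddings = [list(row) for row in weights[0]]

    def __call__(self, indices):
        if not self.built:
            self.build()
            # Applied after build(), so it overwrites the initializer's result.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)
        return [self.embeddings[i] for i in indices]


from_init = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
from_weights = [[9.0, 9.0], [8.0, 8.0], [7.0, 7.0]]

only_init = ToyEmbedding(3, 2, embeddings_initializer=constant(from_init))
both = ToyEmbedding(3, 2, embeddings_initializer=constant(from_init),
                    weights=[from_weights])

print(only_init([1]))  # [[2.0, 3.0]]
print(both([1]))       # [[8.0, 8.0]]
```

When both arguments are supplied, the lookup returns rows from `from_weights`, mirroring the override described in the text.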
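In summary, assigning values to a layer's variables through weights is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer and leaving weights alone wherever possible.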