
Caffe: Notes on Initializing Deconvolution Layers

程序员文章站 2022-04-24 09:46:06

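FCNs frequently use Deconvolution layers for upsampling. In Caffe, the parameters of an upsampling Deconvolution layer are usually specified directly rather than learned.

There are two ways to specify these parameters. Interestingly, they are not fully equivalent, and each has its uses.

Method 1: set the weights through net surgery.

This approach first appeared in FCN: https://github.com/shelhamer/fcn.berkeleyvision.org/blob/master/voc-fcn32s/solve.py
The code is as follows: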

import caffe
import surgery, score

import numpy as np
import os
import sys

try:
    import setproctitle
    setproctitle.setproctitle(os.path.basename(os.getcwd()))
except:
    pass

weights = '../ilsvrc-nets/vgg16-fcn.caffemodel'

# init
caffe.set_device(int(sys.argv[1]))
caffe.set_mode_gpu()

solver = caffe.SGDSolver('solver.prototxt')
solver.net.copy_from(weights)

# surgeries (this is where the deconvolution layer parameters are initialized)
interp_layers = [k for k in solver.net.params.keys() if 'up' in k]
surgery.interp(solver.net, interp_layers)

# scoring
val = np.loadtxt('../data/segvalid11.txt', dtype=str)

for _ in range(25):
    solver.step(4000)
    score.seg_tests(solver, False, val, layer='score')

The upsampling functions:

import numpy as np

# make a bilinear interpolation kernel
def upsample_filt(size):
    factor = (size + 1) // 2
    if size % 2 == 1:
        center = factor - 1
    else:
        center = factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * \
           (1 - abs(og[1] - center) / factor)

# set parameters s.t. deconvolutional layers compute bilinear interpolation
# N.B. this is for deconvolution without groups
def interp_surgery(net, layers):
    for l in layers:
        print(l)
        # only the weights are modified; the bias stays at 0
        m, k, h, w = net.params[l][0].data.shape
        print("deconv shape:", m, k, h, w)
        if m != k and k != 1:
            raise ValueError('input + output channels need to be the same or |output| == 1')
        if h != w:
            raise ValueError('filters need to be square')
        filt = upsample_filt(h)
        print(filt)
        # fill only the matching (input, output) channel pairs; all other slices stay 0
        net.params[l][0].data[range(m), range(k), :, :] = filt
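To make the effect of the surgery concrete, here is a minimal sketch that runs on a plain NumPy array instead of a real network. The `weights` array is a hypothetical stand-in for `net.params[l][0].data`, and the 4x4 kernel size is chosen only to keep the printout small.

```python
import numpy as np


def upsample_filt(size):
    """Return a (size, size) bilinear interpolation kernel."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


# Hypothetical stand-in for net.params[l][0].data: 2 input and 2 output channels.
weights = np.zeros((2, 2, 4, 4), dtype=np.float32)
filt = upsample_filt(4)

# Same indexing as the surgery code: only [0, 0] and [1, 1] receive the kernel.
weights[range(2), range(2), :, :] = filt

print(filt)
print(weights[0, 1])  # cross-channel slice, left at zero
```

The paired `range(m), range(k)` index selects only the diagonal channel pairs, which is what keeps the channels independent.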

Method 2: specify a weight_filler directly in the Deconvolution layer.

The configuration is as follows:

layer {
  name: "fc8_upsample"
  type: "Deconvolution"
  bottom: "fc8"
  top: "fc8_upsample"
  param {
    lr_mult: 0
    decay_mult: 0
  }
  param {
    lr_mult: 0
    decay_mult: 0
  }
  convolution_param {
    num_output: 1
    kernel_size: 16
    stride: 8
    pad: 3
    weight_filler {   # equivalent to the direct assignment above
      type: "bilinear"
    }
  }
}

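Initializing weight_filler as bilinear is equivalent to assigning the weights directly as shown above.

The two methods look the same, but they actually differ. The main difference appears when num_output > 1.

For example, suppose the input has 2 channels and we want to upsample it; naturally, we want each map to be enlarged separately. With a Deconvolution layer, the weight blob has shape (2, 2, 16, 16), assuming a 16x16 kernel and ignoring the bias term.

Initialized as above, the first method produces the following weights: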
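The equivalence for a single kernel can be checked numerically. The sketch below is a NumPy approximation of what I understand Caffe's bilinear filler to compute (`f = ceil(width / 2)`, `c = (width - 1) / (2 f)`); that formula is an assumption from my reading of `filler.hpp`, so verify it against your Caffe version. `bilinear_filler` is a hypothetical helper name, not a Caffe API.

```python
import math

import numpy as np


def upsample_filt(size):
    """Bilinear kernel as built by the FCN surgery code."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


def bilinear_filler(shape):
    """Assumed filler behaviour: every [n, c, :, :] slice gets the same kernel."""
    n, c, h, w = shape
    if h != w:
        raise ValueError("filter must be square")
    f = math.ceil(w / 2.0)
    center = (w - 1) / (2.0 * f)
    line = 1 - np.abs(np.arange(w) / f - center)
    kernel = np.outer(line, line)
    return np.broadcast_to(kernel, shape).copy()


filled = bilinear_filler((1, 1, 16, 16))
print(np.allclose(filled[0, 0], upsample_filt(16)))
```

Note that the filler has no notion of channel pairs: it writes the same kernel into every slice, which only matters once there is more than one channel.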
[0,0,:,:]:

[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
[0,1,:,:]:

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,0,:,:]:

[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
[1,1,:,:]:

[[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05859375 0.17578125 0.29296875 0.41015625 0.52734375 0.64453125
0.76171875 0.87890625 0.87890625 0.76171875 0.64453125 0.52734375
0.41015625 0.29296875 0.17578125 0.05859375]
[ 0.05078125 0.15234375 0.25390625 0.35546875 0.45703125 0.55859375
0.66015625 0.76171875 0.76171875 0.66015625 0.55859375 0.45703125
0.35546875 0.25390625 0.15234375 0.05078125]
[ 0.04296875 0.12890625 0.21484375 0.30078125 0.38671875 0.47265625
0.55859375 0.64453125 0.64453125 0.55859375 0.47265625 0.38671875
0.30078125 0.21484375 0.12890625 0.04296875]
[ 0.03515625 0.10546875 0.17578125 0.24609375 0.31640625 0.38671875
0.45703125 0.52734375 0.52734375 0.45703125 0.38671875 0.31640625
0.24609375 0.17578125 0.10546875 0.03515625]
[ 0.02734375 0.08203125 0.13671875 0.19140625 0.24609375 0.30078125
0.35546875 0.41015625 0.41015625 0.35546875 0.30078125 0.24609375
0.19140625 0.13671875 0.08203125 0.02734375]
[ 0.01953125 0.05859375 0.09765625 0.13671875 0.17578125 0.21484375
0.25390625 0.29296875 0.29296875 0.25390625 0.21484375 0.17578125
0.13671875 0.09765625 0.05859375 0.01953125]
[ 0.01171875 0.03515625 0.05859375 0.08203125 0.10546875 0.12890625
0.15234375 0.17578125 0.17578125 0.15234375 0.12890625 0.10546875
0.08203125 0.05859375 0.03515625 0.01171875]
[ 0.00390625 0.01171875 0.01953125 0.02734375 0.03515625 0.04296875
0.05078125 0.05859375 0.05859375 0.05078125 0.04296875 0.03515625
0.02734375 0.01953125 0.01171875 0.00390625]]
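With the second method, by contrast, every [i, j, :, :] slice is the same matrix as [0,0,:,:] above.

Of the two, the first is the correct one. Deconvolution works much like convolution: only the first set of weights upsamples each map separately, while the second produces two identical maps, because each output combines the information from both input maps.

Conclusion: for a Deconvolution with multiple input and output channels, use method 1; for a single channel, methods 1 and 2 are interchangeable.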
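The difference can be reproduced without Caffe. The following is a minimal sketch using a naive NumPy transposed convolution (no padding, no bias); `deconv2d` is a hypothetical helper written for this illustration, and the 4x4 kernel with stride 2 is chosen only to keep it small.

```python
import numpy as np


def upsample_filt(size):
    """Return a (size, size) bilinear interpolation kernel."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


def deconv2d(x, weights, stride):
    """Naive transposed convolution.

    x: (C_in, H, W); weights: (C_in, C_out, K, K); returns (C_out, H_out, W_out).
    """
    c_in, h, w = x.shape
    _, c_out, k, _ = weights.shape
    out = np.zeros((c_out, stride * (h - 1) + k, stride * (w - 1) + k))
    for ci in range(c_in):
        for i in range(h):
            for j in range(w):
                r, c = i * stride, j * stride
                # Each input pixel stamps a scaled kernel into every output channel.
                out[:, r:r + k, c:c + k] += x[ci, i, j] * weights[ci]
    return out


filt = upsample_filt(4)

# Method 1 (surgery): kernel only on the diagonal channel pairs.
surgery_w = np.zeros((2, 2, 4, 4))
surgery_w[range(2), range(2)] = filt

# Method 2 (filler without groups): the same kernel in every slice.
filler_w = np.broadcast_to(filt, (2, 2, 4, 4)).copy()

# Two clearly different input maps.
x = np.stack([np.ones((3, 3)), 2 * np.ones((3, 3))])

y_surgery = deconv2d(x, surgery_w, stride=2)
y_filler = deconv2d(x, filler_w, stride=2)

print(np.allclose(y_surgery[1], 2 * y_surgery[0]))  # channels stay separate
print(np.allclose(y_filler[0], y_filler[1]))        # both outputs are identical
```

With the filler weights, each output channel is the sum of both upsampled inputs, so the two output maps coincide.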

Attached is the official Deconvolution code:

[Image not preserved]

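Note:
The statement above is slightly flawed. Caffe has in fact already solved this problem, and I had not paid close attention to it before. The key is the group option.
If num_output > 1, set group: c (where c is the number of channels) together with weight_filler: { type: "bilinear" } to complete the initialization.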
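Why group fixes the problem can be sketched in NumPy. The assumption here, which you should check against your Caffe version, is that a Deconvolution layer with c input channels, num_output: c, and group: c stores its weights as a (c, 1, k, k) blob, so each channel has exactly one kernel and there are no cross-channel slices for the filler to fill.

```python
import numpy as np


def upsample_filt(size):
    """Return a (size, size) bilinear interpolation kernel."""
    factor = (size + 1) // 2
    center = factor - 1 if size % 2 == 1 else factor - 0.5
    og = np.ogrid[:size, :size]
    return (1 - abs(og[0] - center) / factor) * (1 - abs(og[1] - center) / factor)


channels, k = 2, 16
filt = upsample_filt(k)

# Assumed grouped layout: one kernel per channel, all filled identically by the filler.
grouped_w = np.broadcast_to(filt, (channels, 1, k, k)).copy()

# Expand the grouped weights into the dense (c, c, k, k) layout: group g only
# connects input channel g to output channel g.
dense_w = np.zeros((channels, channels, k, k))
for g in range(channels):
    dense_w[g, g] = grouped_w[g, 0]

# Weights produced by the surgery approach for comparison.
surgery_w = np.zeros((channels, channels, k, k))
surgery_w[range(channels), range(channels)] = filt

print(np.array_equal(dense_w, surgery_w))
```

Under that assumption, the grouped filler and the surgery approach describe the same per-channel upsampling.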