(转)深度学习1——Maxout网络学习
程序员文章站
2023-12-30 19:38:22
...
原
深度学习(二十三)Maxout网络学习
2015年12月27日 22:45:16
hjimce
阅读数 36944更多
所属专栏:
深度学习
</div>
</div>
<div class="operating">
</div>
</div>
</div>
</div>
<article class="baidu_pl">
<div id="article_content" class="article_content clearfix" data-report-view="{"mod":"popu_307","dest":"https://blog.csdn.net/hjimce/article/details/50414467"}" data-report-click="{"mod":"popu_307","dest":"https://blog.csdn.net/hjimce/article/details/50414467"}" style="height: 1779px; overflow: hidden;">
<div class="article-copyright">
版权声明:本文为博主原创文章,欢迎转载,转载请注明原文地址、作者信息。 https://blog.csdn.net/hjimce/article/details/50414467 </div>
<link rel="stylesheet" href="https://csdnimg.cn/release/phoenix/template/css/ck_htmledit_views-3019150162.css">
<link rel="stylesheet" href="https://csdnimg.cn/release/phoenix/template/css/ck_htmledit_views-3019150162.css">
<div class="htmledit_views" id="content_views">
<p style="text-align:center;"><span style="font-family:Arial;"><span style="font-size:24px;"><strong>Maxout网络学习</strong></span><br></span></p><p><span style="font-family:Arial;font-size:18px;"><strong>原文地址</strong>:<a href="http://blog.csdn.net/hjimce/article/details/50414467" rel="nofollow">http://blog.csdn.net/hjimce/article/details/50414467</a></span></p><p><span style="font-family:Arial;font-size:18px;"><strong>作者</strong>:hjimce</span></p><p><span style="font-family:Arial;font-size:18px;"><strong>一、相关理论</strong></span></p><p><span style="font-family:Arial;font-size:18px;"> 本篇博文主要讲解2013年,ICML上的一篇文献:《Maxout Networks》,这个算法我目前也很少用到,个人感觉最主要的原因应该是这个算法参数个数会成k倍增加(k是maxout的一个参数),不过没关系,对于我们来说知识积累才是最重要的,指不定某一天我们就需要用到这个算法,技多不压身。</span><span style="font-family:Arial;font-size:18px;">个人感觉Maxout网络和Dropout有很多相似的地方。</span></p><p><span style="font-family:Arial;font-size:18px;"> 本篇博文将从什么是maxout网络讲起,先解答maxout的源码层实现,因为很多人最感兴趣的还是算法要怎么实现,当然我也是这样的。我看文献,一般最在意的还是源码的实现,有的文献理论公式推导了十几页,结果5行代码搞定,我看到想哭,这也许就是我讨厌做学术研究的原因吧。知道了源码怎么实现后,我们简单啰嗦一下maxout相关的理论意义。</span></p><p></p><p><span style="font-family:Arial;font-size:18px;"><strong>二、Maxout算法流程</strong></span></p><p><span style="font-family:Arial;font-size:18px;"><strong>1、算法概述</strong></span></p><p><span style="font-family:Arial;font-size:18px;">开始前我们先讲解什么叫maxout networks,等我们明白了什么叫maxout 网络后,再对maxout的相理论意义做出解释。Maxout是深度学习网络中的一层网络,就像池化层、卷积层一样等,我们可以把maxout 看成是网络的**函数层,这个后面再讲解,本部分我们要先知道什么是maxout。我们假设网络某一层的输入特征向量为:X=(x1,x2,……xd),也就是我们输入是d个神经元。Maxout隐藏层每个神经元的计算公式如下:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"> <img src="https://img-blog.csdn.net/20160102195335153?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""></span></p><p><span style="font-family:Arial;font-size:18px;">上面的公式就是maxout隐藏层神经元i的计算公式。其中,k就是maxout层所需要的参数了,由我们人为设定大小。就像dropout一样,也有自己的参数p(每个神经元dropout概率),maxout的参数是k。公式中Z的计算公式为:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"> <img src="https://img-blog.csdn.net/20160102195353046?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""></span></p><p><span style="font-family:Arial;font-size:18px;">权重w是一个大小为(d,m,k)三维矩阵,b是一个大小为(m,k)的二维矩阵,这两个就是我们需要学习的参数。如果我们设定参数k=1,那么这个时候,网络就类似于以前我们所学普通的MLP网络。</span></p><p><span style="font-family:Arial;font-size:18px;">我们可以这么理解,本来传统的MLP算法在第i层到第i+1层,参数只有一组,然而现在我们不怎么干了,我们在这一层同时训练n组参数,然后选择**值最大的作为下一层神经元的**值。下面还是用一个例子进行讲解,比较容易搞懂。</span></p><p><span style="font-family:Arial;font-size:18px;">为了简单起见,假设我们网络第i层有2个神经元x1、x2,第i+1层的神经元个数为1个,如下图所示:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><img src="https://img-blog.csdn.net/20160103170530445?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""><br></span></p><p><span style="font-family:Arial;font-size:18px;">(1)以前MLP的方法。我们要计算第i+1层,那个神经元的**值的时候,传统的MLP计算公式就是:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;">z=W*X+b</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;">out=f(z)</span></p><p><span style="font-family:Arial;font-size:18px;">其中f就是我们所谓的**函数,比如Sigmod、Relu、Tanh等。</span></p><p><span style="font-family:Arial;font-size:18px;">(2)Maxout 的方法。如果我们设置maxout的参数k=5,maxout层就如下所示:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><img src="https://img-blog.csdn.net/20160103165828920?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""><br></span></p><p style="text-align:left;"><span style="font-family:Arial;font-size:18px;">相当于在每个输出神经元前面又多了一层。这一层有5个神经元,此时maxout网络的输出计算公式为:</span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">z1=w1*x+b1</span><br></span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">z2=w2*x+b2</span><br></span></span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">z3=w3*x+b3</span><br></span></span></span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">z4=w4*x+b4</span><br></span></span></span></span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">z5=w5*x+b5</span><br></span></span></span></span></span></p><p style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;"><span style="font-family:Arial;font-size:18px;">out=max(z1,z2,z3,z4,z5)</span></span></span></span></span></span></p><p style="text-align:left;"><span style="font-size:18px;">所以这就是为什么采用maxout的时候,参数个数成k倍增加的原因。本来我们只需要一组参数就够了,采用maxout后,就需要有k组参数。</span></p><p><span style="font-family:Arial;font-size:18px;"><strong>三、源码实现</strong></span></p><p><span style="font-family:Arial;font-size:18px;">o</span><span style="font-size:18px;"><span style="font-family:Arial;">k,为了学习maxout源码的实现过程,我这边引用keras的源码maxout的实现,进行讲解。keras的网站为:<a href="http://keras.io/" rel="nofollow">http://keras.io/</a> 。</span><span style="font-family:Arial;">项目源码网站为:</span><a href="https://github.com/fchollet/keras" rel="nofollow" style="font-family:Arial;">https://github.com/fchollet/keras</a>。下面是keras关于maxout网络层的实现函数:</span></p><p><span style="font-size:18px;"></span></p><pre><code class="hljs ruby"><ol class="hljs-ln" style="width:866px"><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="1"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-comment">#maxout 网络层类的定义</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="2"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-class"><span class="hljs-keyword">class</span> <span class="hljs-title">MaxoutDense</span>(<span class="hljs-title">Layer</span>):</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="3"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-comment"># 网络输入数据矩阵大小为(nb_samples, input_dim)</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="4"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-comment"># 网络输出数据矩阵大小为(nb_samples, output_dim)</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="5"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> input_ndim = <span class="hljs-number">2</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="6"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-comment">#nb_feature就是我们前面说的k的个数了,这个是maxout层特有的参数</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="7"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">__init__</span><span class="hljs-params"><span class="hljs-params">(<span class="hljs-keyword">self</span>, output_dim, nb_feature=<span class="hljs-number">4</span>,</span></span></span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="8"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-params"> init=<span class="hljs-string">'glorot_uniform'</span>, weights=None,</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="9"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-params"> W_regularizer=None, b_regularizer=None, activity_regularizer=None,</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="10"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"><span class="hljs-params"> W_constraint=None, b_constraint=None, input_dim=None, **kwargs)</span>:</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="11"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.output_dim = output_dim</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="12"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.nb_feature = nb_feature</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="13"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.init = initializations.get(init)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="14"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="15"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.W_regularizer = regularizers.get(W_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="16"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.b_regularizer = regularizers.get(b_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="17"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.activity_regularizer = regularizers.get(activity_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="18"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="19"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.W_constraint = constraints.get(W_constraint)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="20"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.b_constraint = constraints.get(b_constraint)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="21"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.constraints = [<span class="hljs-keyword">self</span>.W_constraint, <span class="hljs-keyword">self</span>.b_constraint]</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="22"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="23"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.initial_weights = weights</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="24"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.input_dim = input_dim</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="25"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">if</span> <span class="hljs-keyword">self</span>.<span class="hljs-symbol">input_dim:</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="26"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> kwargs[<span class="hljs-string">'input_shape'</span>] = (<span class="hljs-keyword">self</span>.input_dim,)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="27"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.input = K.placeholder(ndim=<span class="hljs-number">2</span>)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="28"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">super</span>(MaxoutDense, <span class="hljs-keyword">self</span>).__init_<span class="hljs-number">_</span>(**kwargs)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="29"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-comment">#参数初始化部分</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="30"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">build</span><span class="hljs-params">(<span class="hljs-keyword">self</span>)</span></span>:</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="31"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> input_dim = <span class="hljs-keyword">self</span>.input_shape[<span class="hljs-number">1</span>]</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="32"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="33"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.W = <span class="hljs-keyword">self</span>.init((<span class="hljs-keyword">self</span>.nb_feature, input_dim, <span class="hljs-keyword">self</span>.output_dim))<span class="hljs-comment">#nb_feature是我们上面说的k。</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="34"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.b = K.zeros((<span class="hljs-keyword">self</span>.nb_feature, <span class="hljs-keyword">self</span>.output_dim))</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="35"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="36"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.params = [<span class="hljs-keyword">self</span>.W, <span class="hljs-keyword">self</span>.b]</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="37"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.regularizers = []</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="38"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="39"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">if</span> <span class="hljs-keyword">self</span>.<span class="hljs-symbol">W_regularizer:</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="40"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.W_regularizer.set_param(<span class="hljs-keyword">self</span>.W)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="41"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.regularizers.append(<span class="hljs-keyword">self</span>.W_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="42"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="43"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">if</span> <span class="hljs-keyword">self</span>.<span class="hljs-symbol">b_regularizer:</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="44"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.b_regularizer.set_param(<span class="hljs-keyword">self</span>.b)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="45"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.regularizers.append(<span class="hljs-keyword">self</span>.b_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="46"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="47"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">if</span> <span class="hljs-keyword">self</span>.<span class="hljs-symbol">activity_regularizer:</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="48"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.activity_regularizer.set_layer(<span class="hljs-keyword">self</span>)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="49"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.regularizers.append(<span class="hljs-keyword">self</span>.activity_regularizer)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="50"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="51"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">if</span> <span class="hljs-keyword">self</span>.initial_weights is <span class="hljs-keyword">not</span> <span class="hljs-symbol">None:</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="52"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">self</span>.set_weights(<span class="hljs-keyword">self</span>.initial_weights)</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="53"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> del <span class="hljs-keyword">self</span>.initial_weights</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="54"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> </div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="55"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">get_output</span><span class="hljs-params">(<span class="hljs-keyword">self</span>, train=False)</span></span>:</div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="56"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> X = <span class="hljs-keyword">self</span>.get_input(train)<span class="hljs-comment">#需要切记这个x的大小是(nsamples,input_num) </span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="57"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-comment"># -- don't need activation since it's just linear.</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="58"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> output = K.max(K.dot(X, <span class="hljs-keyword">self</span>.W) + <span class="hljs-keyword">self</span>.b, axis=<span class="hljs-number">1</span>)<span class="hljs-comment">#maxout**函数</span></div></div></li><li><div class="hljs-ln-numbers"><div class="hljs-ln-line hljs-ln-n" data-line-number="59"></div></div><div class="hljs-ln-code"><div class="hljs-ln-line"> <span class="hljs-keyword">return</span> output</div></div></li></ol></code><div class="hljs-button {2}" data-title="复制" onclick="hljs.copyCode(event)"></div></pre><span style="font-family:Arial;font-size:18px;">看上面的代码的话,其实只需要看get_output()函数,就知道maxout的实现了。所以说有的时候,一篇文献的代码,其实就只有几行代码,maxout就仅仅只有一行代码而已:</span><p></p><p><span style="font-family:Arial;font-size:18px;"></span></p><pre><code class="hljs swift">output = <span class="hljs-type">K</span>.<span class="hljs-built_in">max</span>(<span class="hljs-type">K</span>.dot(<span class="hljs-type">X</span>, <span class="hljs-keyword">self</span>.<span class="hljs-type">W</span>) + <span class="hljs-keyword">self</span>.b, axis=<span class="hljs-number">1</span>)#maxout**函数</code><div class="hljs-button {2}" data-title="复制" onclick="hljs.copyCode(event)"></div></pre><p></p><p><span style="font-family:Arial;font-size:18px;">下面在简单啰嗦一下相关的理论,毕竟文献的作者写了那么多页,我们总得看一看才行。Maxout可以看成是一个**函数 ,然而它与原来我们以前所学的**函数又有所不同。传统的**函数:</span></p><div style="text-align:center;"><span style="font-family:Arial;font-size:18px;"><img src="https://img-blog.csdn.net/20160102210447410?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""></span></div><p></p><p><span style="font-family:Arial;"><span style="font-size:18px;">比如阈值函数、S函数等。maxout**函数,它具有如下性质:</span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">1、maxout**函数并不是一个固定的函数,不像Sigmod、Relu、Tanh等函数,是一个固定的函数方程</span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">2、它是一个可学习的**函数,因为我们W参数是学习变化的。</span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">3、它是一个分段线性函数:</span></span></p><p style="text-align:center;"><span style="font-family:Arial;"><span style="font-size:18px;"><img src="https://img-blog.csdn.net/20160102203555432?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""><br></span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">然而任何一个凸函数,都可以由线性分段函数进行逼近近似。其实我们可以把以前所学到的**函数:relu、abs**函数,看成是分成两段的线性函数,如下示意图所示:<br></span></span></p><div style="text-align:center;"><span style="font-family:Arial;"><span style="font-size:18px;"><img src="https://img-blog.csdn.net/20160102210809727?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""></span></span></div><div><span style="font-size:18px;"> maxout的拟合能力是非常强的,它可以拟合任意的的凸函数。最直观的解释就是任意的凸函数都可以由分段线性函数以任意精度拟合(学过高等数学应该能明白),而maxout又是取k个隐隐含层节点的最大值,这些”隐隐含层"节点也是线性的,所以在不同的取值范围下,最大值也可以看做是分段线性的(分段的个数与k值有关)-本段摘自:<a href="http://www.cnblogs.com/tornadomeet/p/3428843.html" rel="nofollow">http://www.cnblogs.com/tornadomeet/p/3428843.html</a><br></span></div><p></p><p style="text-align:left;"><span style="font-size:18px;"><span style="font-family:Arial;">maxout是一个函数逼近器<strong>,</strong></span><span style="font-family:Arial;">对于一个标准的MLP网络来说,如果隐藏层的神经元足够多,那么理论上我们是可以逼近任意的函数的。类似的,对于maxout 网络也是一个函数逼近器。</span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">定理1:对于任意的一个连续分段线性函数g(v),我们可以找到两个凸的分段线性函数h1(v)、h2(v),使得这两个凸函数的差值为g(v):</span></span></p><p style="text-align:center;"><span style="font-family:Arial;"><span style="font-size:18px;"><img src="https://img-blog.csdn.net/20160103165034312?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""><br></span></span></p><p style="text-align:center;"><span style="font-family:Arial;"><span style="font-size:18px;"><img src="https://img-blog.csdn.net/20160103165143327?watermark/2/text/aHR0cDovL2Jsb2cuY3Nkbi5uZXQv/font/5a6L5L2T/fontsize/400/fill/I0JBQkFCMA==/dissolve/70/gravity/Center" alt=""><br></span></span></p><p style="text-align:left;"><span style="font-size:18px;">参考文献:</span></p><p style="text-align:left;"><span style="font-size:18px;">1、<span style="font-family:Arial;">《Maxout Networks》</span></span></p><p style="text-align:left;"><span style="font-family:Arial;"><span style="font-size:18px;">2、<a href="http://www.cnblogs.com/tornadomeet/p/3428843.html" rel="nofollow">http://www.cnblogs.com/tornadomeet/p/3428843.html</a></span></span></p><p style="text-align:left;"><span style="font-size:18px;"><span style="color:rgb(51,51,51);font-family:Arial;line-height:26px;">**********************作者:hjimce 时间:2015.12.20 联系QQ:1393852684 </span><span style="color:rgb(51,51,51);font-family:Arial;font-size:18px;line-height:26px;">原创文章,转载请保留原文地址、作者等信息***************</span><span style="color:rgb(51,51,51);font-family:Arial;line-height:26px;"></span></span></p> </div>
</div>
</article>