
Implementing a Neural Network in Python

The previous post, python利用梯度下降求多元线性回归 (multivariate linear regression via gradient descent in Python), explained how to implement multivariate linear regression with gradient descent, but it can only fit linear functions. This post builds on it by adding a nonlinear unit to implement the simplest possible neural network.

1. The simplest neural network

The linear regression from the previous post was $y = w_0 x_0 + w_1 x_1 + \dots + w_n x_n$; on top of it we add the nonlinear sigmoid function:

$$f(x) = \frac{1}{1 + e^{-x}} \qquad f'(x) = f(x)\,\big(1 - f(x)\big)$$
$$y = f(w_0 x_0 + w_1 x_1 + \dots + w_n x_n)$$
Taking $n = 1$ as an example, let us implement this simplest neural network (no hidden layer) in Python: $y = f(w_0 x_0 + w_1 x_1)$.
Cost function: $\frac{1}{2}\big(f(w_0 x_0 + w_1 x_1) - y\big)^2$
By the chain rule, writing $F = f(w_0 x_0 + w_1 x_1)$:
$$\frac{\partial \text{cost}}{\partial w_0} = (F - y)\, F(1 - F)\, x_0$$
and similarly for $w_1$.
Vector form:
$$Y = f(XW) \qquad \frac{\partial \text{cost}}{\partial W} = X^T \cdot \big[(f(XW) - Y) \odot f(XW) \odot (1 - f(XW))\big]$$
where $\odot$ denotes element-wise multiplication.
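Note that $f'$ never needs the pre-activation once the output is known: if $F = f(x)$, then $f'(x) = F(1 - F)$. The nonlin helper below exploits this by accepting the already-activated value when deriv=True. A quick numeric check (a minimal sketch, not part of the original code):

import numpy as np

f = lambda x: 1 / (1 + np.exp(-x))
F = f(0.7)
eps = 1e-6
numeric = (f(0.7 + eps) - f(0.7 - eps)) / (2 * eps)  # finite-difference derivative
print(F * (1 - F), numeric)  # both ~0.2217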

import numpy as np
# Nonlinearity: deriv=False returns f(x); deriv=True returns f'(x),
# where the input is assumed to already be the sigmoid output f(x)
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
mu, sigma = 0, 0.1  # mean and standard deviation
w = np.random.normal(mu, sigma, (3, 1))
iter_size = 1000
lr = 1
for i in range(iter_size):
    # (data_num, weight_num)
    L0 = x
    # (data_num, weight_num) * (weight_num, 1) = (data_num, 1)
    L1 = nonlin(L0.dot(w))
    # (data_num, 1)
    L1_loss = L1 - y
    # (data_num, 1)
    L1_delta = L1_loss * nonlin(L1, True)
    # (weight_num, data_num) * (data_num, 1) = (weight_num, 1)
    grad = L0.T.dot(L1_delta) * lr
    w -= grad
print(L1)
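The analytic gradient can be verified against a finite-difference approximation. A minimal sketch (not in the original post) that reuses nonlin, x, y, and w from the code above; the helper name numerical_grad is hypothetical:

def numerical_grad(w, eps=1e-5):
    # finite-difference gradient of cost = 0.5 * sum((f(x.w) - y)^2)
    grad = np.zeros_like(w)
    for j in range(w.shape[0]):
        w_plus, w_minus = w.copy(), w.copy()
        w_plus[j] += eps
        w_minus[j] -= eps
        cost_plus = 0.5 * np.sum((nonlin(x.dot(w_plus)) - y) ** 2)
        cost_minus = 0.5 * np.sum((nonlin(x.dot(w_minus)) - y) ** 2)
        grad[j] = (cost_plus - cost_minus) / (2 * eps)
    return grad

F = nonlin(x.dot(w))
analytic = x.T.dot((F - y) * nonlin(F, True))
print(np.max(np.abs(analytic - numerical_grad(w))))  # should be tiny, ~1e-9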

2. A neural network with a hidden layer

$L_0, X$: (data_num, weight_num_1)  $W_0$: (weight_num_1, weight_num_2)  $L_1$: (data_num, weight_num_2)
$W_1$: (weight_num_2, 1)  $L_2$: (data_num, 1)
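These shapes can be sanity-checked with plain numpy before any training. A minimal sketch (the dimensions 3 and 5 are taken from the code below):

import numpy as np
X = np.zeros((4, 3))                        # (data_num, weight_num_1)
W0 = np.zeros((3, 5))                       # (weight_num_1, weight_num_2)
W1 = np.zeros((5, 1))                       # (weight_num_2, 1)
assert X.dot(W0).shape == (4, 5)            # L1: (data_num, weight_num_2)
assert X.dot(W0).dot(W1).shape == (4, 1)    # L2: (data_num, 1)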

$$Y = f\big(f(XW_0)\,W_1\big) \qquad \text{let } L_0 = X,\ L_1 = f(L_0 W_0),\ L_2 = f(L_1 W_1),\ Y = L_2$$
$$\text{cost} = \frac{1}{2}\big(f(f(XW_0)W_1) - Y\big)^2$$
$$\frac{\partial \text{cost}}{\partial W_1} = \big(f(f(XW_0)W_1) - Y\big)\, f(f(XW_0)W_1)\big(1 - f(f(XW_0)W_1)\big)\, f(XW_0)$$
In matrix form (with $\odot$ denoting element-wise multiplication):
$$\frac{\partial \text{cost}}{\partial W_1} = L_1^T \cdot \big[(L_2 - Y) \odot L_2 \odot (1 - L_2)\big]$$
$$\frac{\partial \text{cost}}{\partial W_0} = L_0^T \cdot \Big[\big((L_2 - Y) \odot L_2 \odot (1 - L_2)\big)\, W_1^T \odot L_1 \odot (1 - L_1)\Big]$$

import numpy as np
# Nonlinearity: deriv=False returns f(x); deriv=True returns f'(x),
# where the input is assumed to already be the sigmoid output f(x)
def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)
    return 1 / (1 + np.exp(-x))
x = np.array([[0, 0, 1],
              [1, 1, 1],
              [1, 0, 1],
              [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
mu, sigma = 0, 0.1  # mean and standard deviation
w0 = np.random.normal(mu, sigma, (3, 5))
w1 = np.random.normal(mu, sigma, (5, 1))
iter_size = 10000
lr = 1
for i in range(iter_size):
    # (data_num, weight_num_0)
    L0 = x
    # (data_num, weight_num_0) * (weight_num_0, weight_num_1) = (data_num, weight_num_1)
    L1 = nonlin(L0.dot(w0))
    # (data_num, weight_num_1) * (weight_num_1, 1) = (data_num, 1)
    L2 = nonlin(L1.dot(w1))
    # (data_num, 1)
    L2_loss = L2 - y
    # (data_num, 1)
    L2_delta = L2_loss * nonlin(L2, True)
    # (data_num, 1) * (1, weight_num_1) = (data_num, weight_num_1)
    # how much L1 contributed to L2_loss: the error flows back weighted by w1
    # (computed before w1 is updated this iteration)
    L1_loss = L2_delta.dot(w1.T)
    # (data_num, weight_num_1)
    L1_delta = L1_loss * nonlin(L1, True)
    # (weight_num_1, data_num) * (data_num, 1) = (weight_num_1, 1)
    grad1 = L1.T.dot(L2_delta)
    w1 -= grad1 * lr
    # (weight_num_0, data_num) * (data_num, weight_num_1) = (weight_num_0, weight_num_1)
    grad0 = L0.T.dot(L1_delta)
    w0 -= grad0 * lr
print(L2)
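Once trained, the network is used with a forward pass only, thresholding the sigmoid output at 0.5. A minimal sketch (the predict helper is hypothetical; it reuses nonlin, w0, and w1 from above):

def predict(x_new):
    # inference: forward pass only, no loss or gradients needed
    L1 = nonlin(x_new.dot(w0))
    L2 = nonlin(L1.dot(w1))
    return (L2 > 0.5).astype(int)

print(predict(np.array([[1, 0, 1]])))  # expected [[1]] for this dataset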

3. Adding dropout

......
    L1 = nonlin(L0.dot(w0))
    # insert right after computing L1:
    if do_dropout:
        L1 *= np.random.binomial([np.ones((len(x), w1_dim))], 1 - dropout_percent)[0] * (1.0 / (1 - dropout_percent))
    # (data_num, weight_num_1) * (weight_num_1, 1) = (data_num, 1)
    L2 = nonlin(L1.dot(w1))
......
Explanation of the code above:
L1: (data_num, w1_dim)
np.random.binomial([np.ones((len(x), w1_dim))], 1 - dropout_percent)[0]
[np.ones((len(x), w1_dim))]: shape (data_num, w1_dim)
return_value = np.random.binomial(n, p, size=None) samples a binomial distribution: if a bag holds black and white balls and the probability of drawing a black ball is p, then drawing n times with replacement yields return_value black balls; size is how many such experiments to run and can usually be omitted. Here n is an array of ones and p = 1 - dropout_percent, so each entry is an independent 0/1 mask that keeps a hidden unit with probability 1 - dropout_percent; the surviving activations are then scaled by 1/(1 - dropout_percent) (inverted dropout) so their expected value is unchanged.
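Putting the pieces together, here is a minimal runnable sketch of the hidden-layer network with dropout. It is not verbatim from the original post: do_dropout, dropout_percent, and w1_dim are assumed names consistent with the snippet above, and the values chosen for them are illustrative. Dropout is applied only during training; the test-time forward pass uses no mask and no rescaling:

import numpy as np

def nonlin(x, deriv=False):
    if deriv:
        return x * (1 - x)  # x is assumed to already be the sigmoid output
    return 1 / (1 + np.exp(-x))

x = np.array([[0, 0, 1], [1, 1, 1], [1, 0, 1], [0, 1, 1]])
y = np.array([[0, 1, 1, 0]]).T
w1_dim = 5                                  # hidden-layer width (assumed)
w0 = np.random.normal(0, 0.1, (3, w1_dim))
w1 = np.random.normal(0, 0.1, (w1_dim, 1))
do_dropout, dropout_percent = True, 0.2     # assumed values
lr = 1
for i in range(10000):
    L0 = x
    L1 = nonlin(L0.dot(w0))
    if do_dropout:
        # 0/1 mask keeps each hidden unit with probability 1 - dropout_percent;
        # survivors are scaled by 1/(1 - dropout_percent) (inverted dropout)
        L1 *= np.random.binomial([np.ones((len(x), w1_dim))], 1 - dropout_percent)[0] * (1.0 / (1 - dropout_percent))
    L2 = nonlin(L1.dot(w1))
    L2_delta = (L2 - y) * nonlin(L2, True)
    L1_delta = L2_delta.dot(w1.T) * nonlin(L1, True)
    w1 -= L1.T.dot(L2_delta) * lr
    w0 -= L0.T.dot(L1_delta) * lr
# test time: full forward pass, no dropout mask, no rescaling
print(nonlin(nonlin(x.dot(w0)).dot(w1)))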

Thanks to the excellent posts 11行Python实现神经网络 (A Neural Network in 11 Lines of Python) and 3行Python实现dropout (Dropout in 3 Lines of Python).

