
UFLDL Tutorial Code Analysis

I haven't had much exposure to code before. A while ago a friend mentioned caffe; I wanted to see how to use it, but I'm too green and couldn't figure it out... Realizing I had little background in this area, I decided to start from the basics. The code in this post comes from the UFLDL tutorial.

1. Function Analysis

    MATLAB code, corresponding to the UFLDL Tutorial. The code calls minFunc to do the optimization; you may want to read Part 2 first and then come back here.


minFunc: unconstrained optimizer using a line search strategy. (Note: although it is not stated here, the method can only solve unconstrained optimization problems.)

    The function uses a descent method to find the minimum; see Section 3 of Convex Optimization and Machine Learning, although Boyd's Convex Optimization is clearly the better choice if you have the time.

Inputs: funObj, x0, options, varargin

    funObj provides the cost function and the gradient

    x0 is the initial value for the iteration

    options passes the function's parameters

    varargin holds the extra arguments needed by funObj.

Outputs: x, f, exitflag, output

    x is the result of the iteration, i.e., the minimizer

    f is the value of the cost function at the minimum
    exitflag is the status on exit
    output contains information about the run

Input parameters (options)

DerivativeCheck: when enabled, the gradient supplied by funObj is checked against a numerical (finite-difference) derivative.

verbose & verboseI & debug & doPlot are controlled by DISPLAY and determine how much information is printed during the run.

method is controlled by METHOD and specifies the descent method used. The values of LS_init, LS_type, LS_interp, LS_multi, Fref, Damped, HessianIter, and c2 depend on the chosen method, and some of them can also be set manually.

Other parameters, including maxFunEvals, maxIter, optTol, progTol, ..., can be left at their defaults or set explicitly, e.g. as in the sketch below.
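For illustration, one way to set a few of these options before calling minFunc might look like the following. This is a minimal sketch; the field names follow minFunc's documented conventions, but the exact names and default values should be verified against minFunc.m.

% Sketch: configuring a minFunc options struct (verify field names against minFunc.m)
options = struct();
options.Method  = 'sd';      % descent METHOD: steepest descent
options.MaxIter = 200;       % maximum number of iterations
options.Display = 'iter';    % how much information to print (DISPLAY)
options.optTol  = 1e-6;      % tolerance on first-order optimality
options.progTol = 1e-9;      % tolerance on progress in parameters / objective

% funObj, x0 and the extra arguments are as described above:
% [x, f, exitflag, output] = minFunc(funObj, x0, options, varargin{:});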

Solving convex optimization problems: descent methods

    See Section 3 of Convex Optimization and Machine Learning; again, Boyd's Convex Optimization is clearly the better choice if you have the time.

Processing flow

(Taking steepest descent as an example, since it is relatively simple; mainly because I don't understand the other methods yet...)

    Variable meanings

    x: the current position

    d: the descent direction

    t: the step size

1. Preprocessing

2. Since SD < NEWTON (in minFunc's internal method codes), the Hessian matrix does not need to be computed. funObj is used to compute f (the cost function value) and g (the gradient at x).

3. Loop until the maximum number of iterations is reached, or the step size / change in the cost function falls below the tolerance.

    For steepest descent, the descent direction is the negative gradient, so d = -g.

    A line search strategy is then used to choose the step size; backtracking line search is the default.

    Initialize t = 1 and call the function ArmijoBacktrack to compute it.

4. Check and validate the result accordingly. A minimal sketch of this loop is given below.
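To make steps 1-4 concrete, here is a minimal sketch of such a loop. It is an illustration only, not minFunc's actual code; the function name steepest_descent_sketch and the helper armijo_backtrack_sketch (sketched after the ArmijoBacktrack notes below) are hypothetical.

function x = steepest_descent_sketch(funObj, x0, maxIter, optTol)
% Simplified steepest descent loop (illustration only, not minFunc itself).
x = x0;
[f, g] = funObj(x);            % step 2: cost and gradient at the starting point
for iter = 1:maxIter
    d = -g;                    % step 3: descent direction = negative gradient
    t = armijo_backtrack_sketch(funObj, x, f, g, d);  % line search (sketched below)
    x = x + t*d;               % take the step
    [f, g] = funObj(x);        % re-evaluate cost and gradient
    if norm(t*d) < optTol      % step 4: stop when the step becomes tiny
        break;
    end
end
end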


ArmijoBacktrack

    The standard backtracking line search. See the comments in the source for the meaning of the parameters.


    In this program, c1 plays the role of alpha (default value 1e-4), and t is halved (t = t/2) on each pass through the loop.
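Under the assumption that a fixed step-length reduction is used, the core of such a backtracking search can be sketched as follows. This is an illustration, not the actual ArmijoBacktrack code; armijo_backtrack_sketch is the hypothetical helper referenced in the sketch above.

function t = armijo_backtrack_sketch(funObj, x, f, g, d)
% Backtracking line search (sketch): halve t until the Armijo
% sufficient-decrease condition f(x + t*d) <= f(x) + c1*t*g'*d holds.
c1 = 1e-4;                     % the "alpha" constant mentioned above
t  = 1;                        % initial step size
while funObj(x + t*d) > f + c1*t*(g'*d)
    t = t/2;                   % shrink the step
end
end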

2. Code

Note: at the beginning of a MATLAB script (not a function), it is best to add the following line to clear the command window, clear the workspace variables, and close all figures.

clc;clear all;close all;

Linear Regression

    ex1a_linreg.m line 47

theta = minFunc(@linear_regression, theta, options, train.X, train.y);

    Here options is a struct that sets minFunc's parameters, train.X and train.y are the input (training) data set, and linear_regression is the function we write. The theta in the argument list is the initial value, and the returned theta is the learned linear regression weight vector, i.e.

\[\theta = \arg\min_{\theta} \left\{ \left\| \theta^{T}\,\mathrm{train.X} - \mathrm{train.y} \right\|^{2} \right\}\]

    minFunc does the job of solving the expression above.

    linear_regression (comments omitted)

function [f,g] = linear_regression(theta, X, y)
  f = 0;
  g = zeros(size(theta));
  yEst = theta'*X;               % predictions, 1 x m
  f = 0.5*(y-yEst)*(y-yEst)';    % sum of squared errors (1/2 factor keeps g consistent)
  g = X*(yEst-y)';               % gradient with respect to theta
end

     During its computation, minFunc needs the value of the cost function and its gradient; the code above provides exactly that.
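Because minFunc relies on the gradient returned by funObj, it can be worth checking the analytic gradient against a finite-difference approximation before training. A minimal check might look like the following; this is an illustration only, not the tutorial's own grad_check script, and the small random data is purely for demonstration.

% Numerical gradient check for linear_regression (illustration; small data for speed)
n = 5; m = 20;
X = randn(n, m); y = randn(1, m); theta = randn(n, 1);
[f, g] = linear_regression(theta, X, y);
epsilonFD = 1e-6;
gNum = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = epsilonFD;
    gNum(i) = (linear_regression(theta+e, X, y) - linear_regression(theta-e, X, y)) / (2*epsilonFD);
end
fprintf('max |g - gNum| = %g\n', max(abs(g - gNum)));   % should be very small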

Logistic Regression

function [f,g] = logistic_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A column vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the ith coordinate of the jth example.
  %   y - The label for each example.  y(j) is the jth example's label.
  %
  m = size(X,2);

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  %%% YOUR CODE HERE %%%
  h = 1./(1 + exp(-theta'*X));              % sigmoid hypothesis, 1 x m
  f = -y*log(h)' + (y-1)*log(1-h)';         % negative log-likelihood
  g = X*(h-y)';                             % gradient with respect to theta
end

Softmax Regression (for reference)

function [f,g] = softmax_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %       In minFunc, theta is reshaped to a long vector.  So we need to
  %       resize it to an n-by-(num_classes-1) matrix.
  %       Recall that we assume theta(:,num_classes) = 0.
  %
  %   X - The examples stored in a matrix.
  %       X(i,j) is the ith coordinate of the jth example.
  %   y - The label for each example.  y(j) is the jth example's label.
  %
  m = size(X,2);
  n = size(X,1);

  % theta is a vector;  need to reshape to n x num_classes.
  theta = reshape(theta, n, []);
  num_classes = size(theta,2)+1;

  % initialize objective value and gradient.
  %
  % TODO:  Compute the softmax objective function and gradient using vectorized code.
  %        Store the objective function value in f, and the gradient in g.
  %        Before returning g, make sure you form it back into a vector with g=g(:);
  %
  %%% YOUR CODE HERE %%%
  f = 0;
  g = zeros(size(theta));

  a = theta'*X;                              % class scores, (num_classes-1) x m
  a = [a; zeros(1,size(a,2))];               % append the fixed last class (theta = 0)
  a = exp(a);
  aSum = sum(a);                             % normalization constants, 1 x m
  h = log(a./repmat(aSum,num_classes,1));    % log class probabilities, num_classes x m

  compareMatrix = repmat((1:num_classes)', 1, m);
  judMatrix = abs(compareMatrix - repmat(y,num_classes,1));
  A = judMatrix;
  A(judMatrix > 0) = 0;                      % A(k,j) = 1 iff example j has label k
  A(judMatrix == 0) = 1;

  B = A*h';
  f = -sum(diag(B));                         % negative log-likelihood
  g = -X*(A - a./repmat(aSum,num_classes,1))';
  g = g(:,1:num_classes-1);                  % drop the column of the fixed last class
  g = g(:);                                  % make gradient a vector for minFunc
end

PCA Whitening (for this one, just following the tutorial is enough)

%%================================================================
%% Step 0a: Load data
%  Here we provide the code to load natural image data into x.
%  x will be a 784 * 600000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data from the kth 12x12 image patch sampled.
%  You do not need to change the code below.

clear all; close all; clc;
x = loadMNISTImages('train-images-idx3-ubyte');
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

%%% YOUR CODE HERE %%%
xMeanRow = mean(x);
x = x - repmat(xMeanRow,size(x,1),1);

%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.

%%% YOUR CODE HERE %%%
xCorr = x*x'/size(x,2);
[U, S, V] = svd(xCorr);
xRot = U'*x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

%%% YOUR CODE HERE %%%
covar = xRot*xRot'/size(xRot,2);

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

%%% YOUR CODE HERE %%%
var = sum(diag(covar));
varMin = 0.99*var;
varSum = 0;
k = 0;
A = diag(covar);
for i = 1:length(A)
    varSum = varSum + A(i);
    if (varSum >= varMin && k == 0)
        k = i;
    end
end

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 144, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

%%% YOUR CODE HERE %%%
xRot = U'*x;
xTilde = U(:,1:k)'*x;
xHat = U*[xTilde; zeros(size(x,1)-k, size(x,2))];

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.
figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite.

epsilon = 1e-1;
%%% YOUR CODE HERE %%%
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * xRot;
covar = xPCAWhite*xPCAWhite'/size(xPCAWhite,2);

%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

%%% YOUR CODE HERE %%%

% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

%%% YOUR CODE HERE %%%
xZCAWhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;

% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

 

That's all for now; once I've read and written the remaining code, I'll add it here as well...
