
UFLDL Tutorial Code Analysis

I haven't had much exposure to code before. A while ago a friend mentioned caffe; I wanted to see how to use it, but I'm too green and couldn't figure it out... Realizing I had little background in this area, I decided to start from the basics. The code in this post comes from the UFLDL tutorial.

1. Function Analysis

    MATLAB code, corresponding to the UFLDL Tutorial. The code calls minFunc to do the optimization; you may want to read Part 2 first and then come back here.


minFunc: unconstrained optimizer using a line search strategy. (Note: although it is not stated here, the method can only solve unconstrained optimization problems.)

    The function uses a descent method to find the minimum; see Section 3 of Convex Optimization and Machine Learning, although Boyd's Convex Optimization is clearly the better choice if you have the time.

Inputs: funObj, x0, options, varargin

    funObj provides the cost function and the gradient

    x0 is the initial value for the iteration

    options passes the function's parameters

    varargin holds the extra arguments needed by funObj.

Outputs: x, f, exitflag, output

    x is the result of the iteration, i.e., the minimizer

    f is the value of the cost function at the minimum
    exitflag is the status on exit
    output contains information about the run

Input parameters (options)

DerivativeCheck: when enabled, the gradient supplied by funObj is checked against a numerical (finite-difference) derivative.

verbose & verboseI & debug & doPlot are controlled by DISPLAY and determine how much information is printed during the run.

method is controlled by METHOD and specifies the descent method used. The values of LS_init, LS_type, LS_interp, LS_multi, Fref, Damped, HessianIter, and c2 depend on the chosen method, and some of them can also be set manually.

Other parameters, including maxFunEvals, maxIter, optTol, progTol, ..., can be left at their defaults or set explicitly, e.g. as in the sketch below.
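For illustration, one way to set a few of these options before calling minFunc might look like the following. This is a minimal sketch; the field names follow minFunc's documented conventions, but the exact names and default values should be verified against minFunc.m.

% Sketch: configuring a minFunc options struct (verify field names against minFunc.m)
options = struct();
options.Method  = 'sd';      % descent METHOD: steepest descent
options.MaxIter = 200;       % maximum number of iterations
options.Display = 'iter';    % how much information to print (DISPLAY)
options.optTol  = 1e-6;      % tolerance on first-order optimality
options.progTol = 1e-9;      % tolerance on progress in parameters / objective

% funObj, x0 and the extra arguments are as described above:
% [x, f, exitflag, output] = minFunc(funObj, x0, options, varargin{:});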

Solving convex optimization problems: descent methods

    See Section 3 of Convex Optimization and Machine Learning; again, Boyd's Convex Optimization is clearly the better choice if you have the time.

Processing flow

(Taking steepest descent as an example, since it is relatively simple; mainly because I don't understand the other methods yet...)

    Variable meanings

    x: the current position

    d: the descent direction

    t: the step size

1. Preprocessing

2. Since SD < NEWTON (in minFunc's internal method codes), the Hessian matrix does not need to be computed. funObj is used to compute f (the cost function value) and g (the gradient at x).

3. Loop until the maximum number of iterations is reached, or the step size / change in the cost function falls below the tolerance.

    For steepest descent, the descent direction is the negative gradient, so d = -g.

    A line search strategy is then used to choose the step size; backtracking line search is the default.

    Initialize t = 1 and call the function ArmijoBacktrack to compute it.

4. Check and validate the result accordingly. A minimal sketch of this loop is given below.
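To make steps 1-4 concrete, here is a minimal sketch of such a loop. It is an illustration only, not minFunc's actual code; the function name steepest_descent_sketch and the helper armijo_backtrack_sketch (sketched after the ArmijoBacktrack notes below) are hypothetical.

function x = steepest_descent_sketch(funObj, x0, maxIter, optTol)
% Simplified steepest descent loop (illustration only, not minFunc itself).
x = x0;
[f, g] = funObj(x);            % step 2: cost and gradient at the starting point
for iter = 1:maxIter
    d = -g;                    % step 3: descent direction = negative gradient
    t = armijo_backtrack_sketch(funObj, x, f, g, d);  % line search (sketched below)
    x = x + t*d;               % take the step
    [f, g] = funObj(x);        % re-evaluate cost and gradient
    if norm(t*d) < optTol      % step 4: stop when the step becomes tiny
        break;
    end
end
end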


ArmijoBacktrack

    The standard backtracking line search. See the comments in the source for the meaning of the parameters.


    In this program, c1 plays the role of alpha (default value 1e-4), and t is halved (t = t/2) on each pass through the loop.
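Under the assumption that a fixed step-length reduction is used, the core of such a backtracking search can be sketched as follows. This is an illustration, not the actual ArmijoBacktrack code; armijo_backtrack_sketch is the hypothetical helper referenced in the sketch above.

function t = armijo_backtrack_sketch(funObj, x, f, g, d)
% Backtracking line search (sketch): halve t until the Armijo
% sufficient-decrease condition f(x + t*d) <= f(x) + c1*t*g'*d holds.
c1 = 1e-4;                     % the "alpha" constant mentioned above
t  = 1;                        % initial step size
while funObj(x + t*d) > f + c1*t*(g'*d)
    t = t/2;                   % shrink the step
end
end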

2. Code

Note: at the beginning of a MATLAB script (not a function), it is best to add the following line to clear the command window, clear the workspace variables, and close all figures.

clc;clear all;close all;

Linear Regression

    ex1a_linreg.m line 47

theta = minFunc(@linear_regression, theta, options, train.X, train.y);

    Here options is a struct that sets minFunc's parameters, train.X and train.y are the input (training) data set, and linear_regression is the function we write. The theta in the argument list is the initial value, and the returned theta is the learned linear regression weight vector, i.e.

\[\theta = \arg\min_{\theta} \left\{ \left\| \theta^{T}\,\mathrm{train.X} - \mathrm{train.y} \right\|^{2} \right\}\]

    minFunc does the job of solving the expression above.

    linear_regression (comments omitted)

function [f,g] = linear_regression(theta, X, y)
  f = 0;
  g = zeros(size(theta));
  yEst = theta'*X;               % predictions, 1 x m
  f = 0.5*(y-yEst)*(y-yEst)';    % sum of squared errors (1/2 factor keeps g consistent)
  g = X*(yEst-y)';               % gradient with respect to theta
end

     During its computation, minFunc needs the value of the cost function and its gradient; the code above provides exactly that.
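Because minFunc relies on the gradient returned by funObj, it can be worth checking the analytic gradient against a finite-difference approximation before training. A minimal check might look like the following; this is an illustration only, not the tutorial's own grad_check script, and the small random data is purely for demonstration.

% Numerical gradient check for linear_regression (illustration; small data for speed)
n = 5; m = 20;
X = randn(n, m); y = randn(1, m); theta = randn(n, 1);
[f, g] = linear_regression(theta, X, y);
epsilonFD = 1e-6;
gNum = zeros(size(theta));
for i = 1:numel(theta)
    e = zeros(size(theta)); e(i) = epsilonFD;
    gNum(i) = (linear_regression(theta+e, X, y) - linear_regression(theta-e, X, y)) / (2*epsilonFD);
end
fprintf('max |g - gNum| = %g\n', max(abs(g - gNum)));   % should be very small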

Logistic Regression

function [f,g] = logistic_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A column vector containing the parameter values to optimize.
  %   X - The examples stored in a matrix.
  %       X(i,j) is the ith coordinate of the jth example.
  %   y - The label for each example.  y(j) is the jth example's label.
  %
  m = size(X,2);

  % initialize objective value and gradient.
  f = 0;
  g = zeros(size(theta));

  %%% YOUR CODE HERE %%%
  h = 1./(1 + exp(-theta'*X));              % sigmoid hypothesis, 1 x m
  f = -y*log(h)' + (y-1)*log(1-h)';         % negative log-likelihood
  g = X*(h-y)';                             % gradient with respect to theta
end

Softmax Regression (for reference)

function [f,g] = softmax_regression(theta, X, y)
  %
  % Arguments:
  %   theta - A vector containing the parameter values to optimize.
  %       In minFunc, theta is reshaped to a long vector.  So we need to
  %       resize it to an n-by-(num_classes-1) matrix.
  %       Recall that we assume theta(:,num_classes) = 0.
  %
  %   X - The examples stored in a matrix.
  %       X(i,j) is the ith coordinate of the jth example.
  %   y - The label for each example.  y(j) is the jth example's label.
  %
  m = size(X,2);
  n = size(X,1);

  % theta is a vector;  need to reshape to n x num_classes.
  theta = reshape(theta, n, []);
  num_classes = size(theta,2)+1;

  % initialize objective value and gradient.
  %
  % TODO:  Compute the softmax objective function and gradient using vectorized code.
  %        Store the objective function value in f, and the gradient in g.
  %        Before returning g, make sure you form it back into a vector with g=g(:);
  %
  %%% YOUR CODE HERE %%%
  f = 0;
  g = zeros(size(theta));

  a = theta'*X;                              % class scores, (num_classes-1) x m
  a = [a; zeros(1,size(a,2))];               % append the fixed last class (theta = 0)
  a = exp(a);
  aSum = sum(a);                             % normalization constants, 1 x m
  h = log(a./repmat(aSum,num_classes,1));    % log class probabilities, num_classes x m

  compareMatrix = repmat((1:num_classes)', 1, m);
  judMatrix = abs(compareMatrix - repmat(y,num_classes,1));
  A = judMatrix;
  A(judMatrix > 0) = 0;                      % A(k,j) = 1 iff example j has label k
  A(judMatrix == 0) = 1;

  B = A*h';
  f = -sum(diag(B));                         % negative log-likelihood
  g = -X*(A - a./repmat(aSum,num_classes,1))';
  g = g(:,1:num_classes-1);                  % drop the column of the fixed last class
  g = g(:);                                  % make gradient a vector for minFunc
end

PCA Whitening (for this one, just following the tutorial is enough)

%%================================================================
%% Step 0a: Load data
%  Here we provide the code to load natural image data into x.
%  x will be a 784 * 600000 matrix, where the kth column x(:, k) corresponds to
%  the raw image data from the kth 12x12 image patch sampled.
%  You do not need to change the code below.

clear all; close all; clc;
x = loadMNISTImages('train-images-idx3-ubyte');
figure('name','Raw images');
randsel = randi(size(x,2),200,1); % A random selection of samples for visualization
display_network(x(:,randsel));

%%================================================================
%% Step 0b: Zero-mean the data (by row)
%  You can make use of the mean and repmat/bsxfun functions.

%%% YOUR CODE HERE %%%
xMeanRow = mean(x);
x = x - repmat(xMeanRow,size(x,1),1);

%%================================================================
%% Step 1a: Implement PCA to obtain xRot
%  Implement PCA to obtain xRot, the matrix in which the data is expressed
%  with respect to the eigenbasis of sigma, which is the matrix U.

%%% YOUR CODE HERE %%%
xCorr = x*x'/size(x,2);
[U, S, V] = svd(xCorr);
xRot = U'*x;

%%================================================================
%% Step 1b: Check your implementation of PCA
%  The covariance matrix for the data expressed with respect to the basis U
%  should be a diagonal matrix with non-zero entries only along the main
%  diagonal. We will verify this here.
%  Write code to compute the covariance matrix, covar.
%  When visualised as an image, you should see a straight line across the
%  diagonal (non-zero entries) against a blue background (zero entries).

%%% YOUR CODE HERE %%%
covar = xRot*xRot'/size(xRot,2);

% Visualise the covariance matrix. You should see a line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 2: Find k, the number of components to retain
%  Write code to determine k, the number of components to retain in order
%  to retain at least 99% of the variance.

%%% YOUR CODE HERE %%%
var = sum(diag(covar));
varMin = 0.99*var;
varSum = 0;
k = 0;
A = diag(covar);
for i = 1:length(A)
    varSum = varSum + A(i);
    if (varSum >= varMin && k == 0)
        k = i;
    end
end

%%================================================================
%% Step 3: Implement PCA with dimension reduction
%  Now that you have found k, you can reduce the dimension of the data by
%  discarding the remaining dimensions. In this way, you can represent the
%  data in k dimensions instead of the original 144, which will save you
%  computational time when running learning algorithms on the reduced
%  representation.
%
%  Following the dimension reduction, invert the PCA transformation to produce
%  the matrix xHat, the dimension-reduced data with respect to the original basis.
%  Visualise the data and compare it to the raw data. You will observe that
%  there is little loss due to throwing away the principal components that
%  correspond to dimensions with low variation.

%%% YOUR CODE HERE %%%
xRot = U'*x;
xTilde = U(:,1:k)'*x;
xHat = U*[xTilde; zeros(size(x,1)-k, size(x,2))];

% Visualise the data, and compare it to the raw data
% You should observe that the raw and processed data are of comparable quality.
% For comparison, you may wish to generate a PCA reduced image which
% retains only 90% of the variance.
figure('name',['PCA processed images ',sprintf('(%d / %d dimensions)', k, size(x, 1)),'']);
display_network(xHat(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

%%================================================================
%% Step 4a: Implement PCA with whitening and regularisation
%  Implement PCA with whitening and regularisation to produce the matrix
%  xPCAWhite.

epsilon = 1e-1;
%%% YOUR CODE HERE %%%
xPCAWhite = diag(1./sqrt(diag(S) + epsilon)) * xRot;
covar = xPCAWhite*xPCAWhite'/size(xPCAWhite,2);

%% Step 4b: Check your implementation of PCA whitening
%  Check your implementation of PCA whitening with and without regularisation.
%  PCA whitening without regularisation results a covariance matrix
%  that is equal to the identity matrix. PCA whitening with regularisation
%  results in a covariance matrix with diagonal entries starting close to
%  1 and gradually becoming smaller. We will verify these properties here.
%  Write code to compute the covariance matrix, covar.
%
%  Without regularisation (set epsilon to 0 or close to 0),
%  when visualised as an image, you should see a red line across the
%  diagonal (one entries) against a blue background (zero entries).
%  With regularisation, you should see a red line that slowly turns
%  blue across the diagonal, corresponding to the one entries slowly
%  becoming smaller.

%%% YOUR CODE HERE %%%

% Visualise the covariance matrix. You should see a red line across the
% diagonal against a blue background.
figure('name','Visualisation of covariance matrix');
imagesc(covar);

%%================================================================
%% Step 5: Implement ZCA whitening
%  Now implement ZCA whitening to produce the matrix xZCAWhite.
%  Visualise the data and compare it to the raw data. You should observe
%  that whitening results in, among other things, enhanced edges.

%%% YOUR CODE HERE %%%
xZCAWhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;

% Visualise the data, and compare it to the raw data.
% You should observe that the whitened images have enhanced edges.
figure('name','ZCA whitened images');
display_network(xZCAWhite(:,randsel));
figure('name','Raw images');
display_network(x(:,randsel));

 

That's all for now; once I've read and written the remaining code, I'll add it here as well...
