Wang Jiajia (translation), Zhang Yue'e (neural network sections), Ding Xuan (SVM sections)
6
Classification (II) – Neural Network and SVM
Introduction
Most research has shown that support vector machines (SVM) and neural networks (NN) are
powerful classification tools, which can be applied to several different areas. Unlike tree-based
or probabilistic-based methods that were mentioned in the previous chapter, the process of
how support vector machines and neural networks transform from input to output is less clear
and can be hard to interpret. As a result, both support vector machines and neural networks
are referred to as black box methods.
The development of a neural network is inspired by human brain activities. As such, this type
of network is a computational model that mimics the pattern of the human mind. In contrast
to this, support vector machines first map input data into a high dimension feature space
defined by the kernel function, and find the optimum hyperplane that separates the training
data by the maximum margin. In short, we can think of support vector machines as a linear
algorithm in a high dimensional space.
Both these methods have advantages and disadvantages in solving classification problems.
For example, support vector machine solutions are the global optimum, while neural networks
may suffer from multiple local optimums. Thus, choosing between either depends on the
characteristics of the dataset source. In this chapter, we will illustrate the following:
How to train a support vector machine
Observing how the choice of cost can affect the SVM classifier
Visualizing the SVM fit
Predicting the labels of a testing dataset based on the model trained by SVM
Tuning the SVM
In the neural network section, we will cover:
How to train a neural network
How to visualize a neural network model
Predicting the labels of a testing dataset based on a model trained by neuralnet
Finally, we will show how to train a neural network with nnet , and how to use it to
predict the labels of a testing dataset
Classifying data with a support vector machine
The two most well-known and popular support vector machine tools are libsvm and
SVMlight. For R users, you can find an implementation of libsvm in the e1071 package and
an interface to SVMlight in the klaR package. Therefore, you can use the implemented functions of these
two packages to train support vector machines. In this recipe, we will focus on using the svm
function (the libsvm implemented version) from the e1071 package to train a support vector
machine based on the telecom customer churn training dataset.
Getting ready
In this recipe, we will continue to use the telecom churn dataset as the input data source to
train the support vector machine. For those who have not prepared the dataset, please refer
to Chapter 5, Classification (I) – Tree, Lazy, and Probabilistic, for details.
How to do it...
Perform the following steps to train the SVM:
1. Load the e1071 package:
> library(e1071)
2. Train the support vector machine using the svm function with trainset as the input dataset, and use churn as the classification category:
> model = svm(churn~., data = trainset, kernel="radial", cost=1,
gamma = 1/ncol(trainset))
3. Finally, you can obtain overall information about the built model with summary:
> summary(model)
Call:
svm(formula = churn ~ ., data = trainset, kernel = "radial", cost
= 1, gamma = 1/ncol(trainset))
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 1
gamma: 0.05882353
Number of Support Vectors: 691
( 394 297 )
Number of Classes: 2
Levels:
yes no
How it works...
The support vector machine constructs a hyperplane (or set of hyperplanes) that maximize the margin width between two classes in a high dimensional space. In these, the cases that define the hyperplane are support vectors, as shown in the following figure:
Figure 1: Support Vector Machine
Support vector machine starts from constructing a hyperplane that maximizes the margin width. Then, it extends the definition to a nonlinear separable problem. Lastly, it maps the data to a high dimensional space where the data can be more easily separated with a linear boundary.
The advantage of using SVM is that it builds a highly accurate model through an engineering problem-oriented kernel. Also, it makes use of the regularization term to avoid over-fitting, and it does not suffer from local optima or multicollinearity. The main limitation of SVM is its speed and size at training and testing time, so it is not suitable or efficient enough for constructing classification models on data that is large in size. Also, since an SVM is hard to interpret, determining an appropriate kernel and setting the regularization term are further problems that we need to tackle.
In this recipe, we continue to use the telecom churn dataset as our example data source. We begin training a support vector machine using libsvm provided in the e1071 package. Within the training function, svm, one can specify the kernel function, cost, and gamma. For the kernel argument, the default value is radial, and one can set the kernel to linear, polynomial, radial basis, or sigmoid. As for the gamma argument, the default value is equal to (1 / data dimension), and it controls the shape of the separating hyperplane. Increasing the gamma argument usually increases the number of support vectors.
As for the cost, the default value is set to 1, which indicates that the regularization term is constant, and the larger the value, the smaller the margin is. We will discuss more on how the cost can affect the SVM classifier in the next recipe. Once the support vector machine is built, the summary function can be used to obtain information, such as calls, parameters, number of classes, and the types of label.
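To see these effects concretely, the following sketch (an illustration, not part of the original recipe; it assumes e1071 is loaded and trainset is prepared as above) trains two radial-kernel SVMs that differ only in gamma and compares their support vector counts via the tot.nSV attribute of the fitted object:

```r
# Illustrative sketch: a larger gamma usually yields more support vectors.
# Assumes library(e1071) and the trainset from this recipe.
model.g.small = svm(churn~., data = trainset, kernel = "radial", cost = 1, gamma = 0.01)
model.g.large = svm(churn~., data = trainset, kernel = "radial", cost = 1, gamma = 1)
model.g.small$tot.nSV   # number of support vectors with a small gamma
model.g.large$tot.nSV   # typically a larger count with a large gamma
```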
See also
Another popular support vector machine tool is SVMlight. Unlike the e1071 package, which provides the full implementation of libsvm, the klaR package simply provides an interface to SVMlight. To use SVMlight, one can perform the following steps:
1. Install the klaR package:
> install.packages("klaR")
> library(klaR)
2. Download the SVMlight source code and binaries for your platform from http://svmlight.joachims.org/. For example, if your guest OS is Windows 64-bit, you should download the file from http://download.joachims.org/svm_light/current/svm_light_windows64.zip
3. Then, you should unzip the file and put the workable binary in the working directory; you may check your working directory by using the getwd function:
> getwd()
4. Train the support vector machine using the svmlight function:
> model.light = svmlight(churn~., data = trainset,
kernel="radial", cost=1, gamma = 1/ncol(trainset))
Choosing the cost of a support vector machine
The support vector machines create an optimum hyperplane that separates the training data by the maximum margin. However, sometimes we would like to allow some misclassifications while separating categories. The SVM model has a cost function, which controls training errors and margins. For example, a small cost creates a large margin (a soft margin) and allows more misclassifications. On the other hand, a large cost creates a narrow margin (a hard margin) and permits fewer misclassifications. In this recipe, we will illustrate how the large and small cost will affect the SVM classifier.
Getting ready
In this recipe, we will use the iris dataset as our example data source.
How to do it...
Perform the following steps to generate two different classification examples with different costs:
1. Subset the iris dataset with the columns named Sepal.Length, Sepal.Width, and Species, keeping the species setosa and virginica:
> iris.subset = subset(iris, select=c("Sepal.Length", "Sepal.Width",
"Species"), Species %in% c("setosa","virginica"))
2. Then, you can generate a scatter plot with Sepal.Length as the x-axis and Sepal.Width as the y-axis:
> plot(x=iris.subset$Sepal.Length,y=iris.subset$Sepal.Width,
col=iris.subset$Species, pch=19)
Figure 2: Scatter plot of Sepal.Length and Sepal.Width with subset of iris dataset
3. Next, you can train an SVM based on iris.subset with the cost equal to 1:
> svm.model = svm(Species ~ ., data=iris.subset, kernel="linear",
cost=1, scale=FALSE)
4. Then, we can circle the support vectors with blue circles:
> points(iris.subset[svm.model$index,c(1,2)],col="blue",cex=2)
Figure 3: Circling support vectors with blue ring
5. Lastly, we can add a separation line on the plot:
> w = t(svm.model$coefs) %*% svm.model$SV
> b = -svm.model$rho
> abline(a=-b/w[1,2], b=-w[1,1]/w[1,2], col="red", lty=5)
6. In addition to this, we create another SVM classifier where cost = 10,000:
> plot(x=iris.subset$Sepal.Length,y=iris.subset$Sepal.Width,
col=iris.subset$Species, pch=19)
> svm.model = svm(Species ~ ., data=iris.subset, type="C-classification",
kernel="linear", cost=10000, scale=FALSE)
> points(iris.subset[svm.model$index,c(1,2)],col="blue",cex=2)
> w = t(svm.model$coefs) %*% svm.model$SV
> b = -svm.model$rho
> abline(a=-b/w[1,2], b=-w[1,1]/w[1,2], col="red", lty=5)
Figure 5: A classification example with large cost
How it works...
In this recipe, we demonstrate how different costs can affect the SVM classifier. First, we create an iris subset with the columns Sepal.Length, Sepal.Width, and Species, containing the species setosa and virginica. Then, in order to create a soft margin and allow some misclassification, we use an SVM with a small cost (where cost = 1) to train the support vector machine. Next, we circle the support vectors with blue circles and add the separation line. As per Figure 4, one of the green points (virginica) is misclassified (it is classified as setosa) to the other side of the separation line due to the choice of the small cost.
In addition to this, we would like to determine how a large cost can affect the SVM classifier. Therefore, we choose a large cost (where cost = 10,000 ). From Figure 5, we can see that the margin created is narrow (a hard margin) and no misclassification cases are present. As a result, the two examples show that the choice of different costs may affect the margin created and also affect the possibilities of misclassification.
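The contrast between the two fits can also be checked numerically. The following sketch (not part of the original steps; it reuses iris.subset and the expression for w from step 5) compares the support vector counts and margin widths of the soft and hard margin classifiers:

```r
# Sketch: compare the soft (cost = 1) and hard (cost = 10000) margin fits.
margin.width = function(fit) {
  w = t(fit$coefs) %*% fit$SV  # weight vector of the linear fit
  2 / sqrt(sum(w^2))           # margin width = 2 / ||w||
}
svm.soft = svm(Species ~ ., data = iris.subset, kernel = "linear", cost = 1, scale = FALSE)
svm.hard = svm(Species ~ ., data = iris.subset, kernel = "linear", cost = 10000, scale = FALSE)
svm.soft$tot.nSV; margin.width(svm.soft)  # more support vectors, wider margin
svm.hard$tot.nSV; margin.width(svm.hard)  # fewer support vectors, narrower margin
```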
See also
The idea of soft margin, which allows misclassified examples, was suggested by Corinna Cortes and Vladimir N. Vapnik in 1995 in the following paper: Cortes, C., and Vapnik, V. (1995). Support-vector networks. Machine learning, 20(3), 273-297.
Visualizing an SVM fit
To visualize the built model, one can first use the plot function to generate a scatter plot of the input data and the SVM fit. In this plot, support vectors and classes are highlighted through color and symbols. In addition to this, one can draw a contour-filled plot of the class regions to easily identify misclassified samples from the plot.
Getting ready
In this recipe, we will use two datasets: the iris dataset and the telecom churn dataset. For the telecom churn dataset, one needs to have completed the previous recipe by training a support vector machine with SVM, and to have saved the SVM fit model.
How to do it...
Perform the following steps to visualize the SVM fit object:
1. Use svm to train the support vector machine based on the iris dataset, and use the plot function to visualize the fitted model:
> data(iris)
> model.iris = svm(Species~., iris)
> plot(model.iris, iris, Petal.Width ~ Petal.Length, slice =
list(Sepal.Width = 3, Sepal.Length = 4))
2. Visualize the SVM fit object, model, using the plot function with the dimensions of total_day_minutes and total_intl_charge:
> plot(model, trainset, total_day_minutes ~ total_intl_charge)
Figure 7: The SVM classification plot of trained SVM fit based on churn dataset
How it works...
In this recipe, we demonstrate how to use the plot function to visualize the SVM fit. In the first plot, we train a support vector machine using the iris dataset. Then, we use the plot function to visualize the fitted SVM.
In the argument list, we specify the fitted model in the first argument and the dataset (this should be the same data used to build the model) as the second parameter. The third parameter indicates the dimension used to generate the classification plot. By default, the plot function can only generate a scatter plot based on two dimensions (for the x-axis and y-axis). Therefore, we select the variables, Petal.Length and Petal.Width as the two dimensions to generate the scatter plot.
From Figure 6, we find Petal.Length assigned to the x-axis, Petal.Width assigned to the y-axis, and data points with X and O symbols scattered on the plot. Within the scatter plot, the X symbol shows the support vector and the O symbol represents the data points. These two symbols can be altered through the configuration of the svSymbol and dataSymbol options. Both the support vectors and true classes are highlighted and colored depending on their label (green refers to viginica, red refers to versicolor, and black refers to setosa). The last argument, slice , is set when there are more than two variables. Therefore, in this example, we use the additional variables, Sepal.width and Sepal.length , by assigning a constant of 3 and 4.
Next, we take the same approach to draw the SVM fit based on customer churn data. In this example, we use total_day_minutes and total_intl_charge as the two dimensions used to plot the scatterplot. As per Figure 7, the support vectors and data points in red and black are scattered closely together in the central region of the plot, and there is no simple way to separate them.
See also
There are other parameters, such as fill, grid, symbolPalette, and so on, that can be configured to change the layout of the plot. You can use the help function to view the following document for further information:
> ?plot.svm
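As a hedged sketch of what such customization might look like (the specific argument values here are illustrative; consult the help page above for the authoritative list), one could restyle the iris plot from the first step as follows:

```r
# Illustrative restyling of the classification plot from the earlier step.
# svSymbol, dataSymbol, and symbolPalette are documented plot arguments;
# the chosen values are arbitrary.
plot(model.iris, iris, Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width = 3, Sepal.Length = 4),
     svSymbol = "S", dataSymbol = "o",  # symbols for support vectors / data points
     symbolPalette = rainbow(3))        # one symbol color per class
```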
Predicting labels based on a model trained by a support vector machine
In the previous recipe, we trained an SVM based on the training dataset. The training process finds the optimum hyperplane that separates the training data by the maximum margin. We can then utilize the SVM fit to predict the label (category) of new observations. In this recipe, we will demonstrate how to use the predict function to predict values based on a model trained by SVM.
Getting ready
You need to have completed the previous recipe by generating a fitted SVM, and save the fitted model in model.
How to do it...
Perform the following steps to predict the labels of the testing dataset:
1. Predict the label of the testing dataset based on the fitted SVM and the attributes of the testing dataset:
> svm.pred = predict(model, testset[, !names(testset) %in%
c("churn")])
2. Then, you can use the table function to generate a classification table with the prediction result and labels of the testing dataset:
> svm.table=table(svm.pred, testset$churn)
> svm.table
svm.pred yes no
yes 70 12
no 71 865
3. Next, you can use classAgreement to calculate the coefficients of agreement of the classification table:
> classAgreement(svm.table)
$diag
[1] 0.9184676
$kappa
[1] 0.5855903
$rand
[1] 0.850083
$crand
[1] 0.5260472
4. Now, you can use confusionMatrix to measure the prediction performance based on the classification table:
> library(caret)
> confusionMatrix(svm.table)
Confusion Matrix and Statistics
svm.pred yes no
yes 70 12
no 71 865
Accuracy : 0.9185
95% CI : (0.8999, 0.9345)
No Information Rate : 0.8615
P-Value [Acc > NIR] : 1.251e-08
Kappa : 0.5856
Mcnemar's Test P-Value : 1.936e-10
Sensitivity : 0.49645
Specificity : 0.98632
Pos Pred Value : 0.85366
Neg Pred Value : 0.92415
Prevalence : 0.13851
Detection Rate : 0.06876
Detection Prevalence : 0.08055
Balanced Accuracy : 0.74139
'Positive' Class : yes
How it works...
In this recipe, we first used the predict function to obtain the predicted labels of the testing dataset. Next, we used the table function to generate the classification table based on the predicted labels of the testing dataset. So far, the evaluation procedure is very similar to the evaluation process mentioned in the previous chapter.
We then introduced a new function, classAgreement, which computes several coefficients of agreement between the columns and rows of a two-way contingency table. The coefficients include diag, kappa, rand, and crand. The diag coefficient represents the percentage of data points in the main diagonal of the classification table; kappa refers to diag corrected for agreement by chance (the probability of random agreement); rand represents the Rand index, which measures the similarity between two data clusters; and crand indicates the Rand index adjusted for the chance grouping of elements.
Finally, we used confusionMatrix from the caret package to measure the performance of the classification model. The accuracy of 0.9185 shows that the trained support vector machine can correctly classify most of the observations. However, accuracy alone is not a good measurement of a classification model. One should also reference sensitivity and specificity.
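As a quick check on these numbers, the headline statistics can be reproduced directly from svm.table (a sketch; it assumes the table built in step 2, with yes as the positive class):

```r
# Recompute accuracy, sensitivity, and specificity from the classification table.
TP = svm.table["yes", "yes"]  # true positives (70)
FN = svm.table["no",  "yes"]  # false negatives (71)
FP = svm.table["yes", "no"]   # false positives (12)
TN = svm.table["no",  "no"]   # true negatives (865)
sum(diag(svm.table)) / sum(svm.table)  # accuracy (the diag coefficient): ~0.9185
TP / (TP + FN)                         # sensitivity: ~0.4964
TN / (TN + FP)                         # specificity: ~0.9863
```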
There‘s more...
Besides using SVM to predict the category of new observations, you can use SVM to predict continuous values. In other words, one can use SVM to perform regression analysis.
In the following example, we will show how to perform a simple regression prediction based on a fitted SVM with the type specified as eps-regression.
Perform the following steps to train a regression model with SVM:
1. Train a support vector machine based on the Quartet dataset:
> library(car)
> data(Quartet)
> model.regression = svm(Quartet$y1~Quartet$x,type="eps-regression")
2. Use the predict function to obtain prediction results:
> predict.y = predict(model.regression, Quartet$x)
> predict.y
       1        2        3        4        5        6        7        8
8.196894 7.152946 8.807471 7.713099 8.533578 8.774046 6.186349 5.763689
       9       10       11
8.726925 6.621373 5.882946
3. Plot the predicted points as squares and the training data points as circles on the same plot:
> plot(Quartet$x, Quartet$y1, pch=19)
> points(Quartet$x, predict.y, pch=15, col="red")
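To quantify how well the regression fit tracks the data, one could (as a small addition to the recipe) compute the root mean squared error between the observed and predicted values:

```r
# Sketch: RMSE of the eps-regression fit on the Quartet data.
rmse = sqrt(mean((Quartet$y1 - predict.y)^2))
rmse
```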
Tuning a support vector machine
Besides using different feature sets and the kernel function in support vector machines, one
trick that you can use to tune its performance is to adjust the gamma and cost configured in
the argument. One possible approach to test the performance of different gamma and cost
combination values is to write a for loop to generate all the combinations of gamma and
cost as inputs to train different support vector machines. Fortunately, e1071 provides a tuning
function, tune.svm, which makes the tuning much easier. In this recipe, we will demonstrate
how to tune a support vector machine through the use of tune.svm.
Getting ready
You need to have completed the previous recipe by preparing a training dataset, trainset .
How to do it...
Perform the following steps to tune the support vector machine:
1. First, tune the support vector machine using tune.svm:
> tuned = tune.svm(churn~., data = trainset, gamma = 10^(-6:-1),
cost = 10^(1:2))
2. Next, you can use the summary function to obtain the tuning result:
> summary(tuned)
Parameter tuning of ‘svm‘:
- sampling method: 10-fold cross validation
- best parameters:
gamma cost
0.01 100
- best performance: 0.08077885
- Detailed performance results:
gamma cost error dispersion
1 1e-06 10 0.14774780 0.02399512
2 1e-05 10 0.14774780 0.02399512
3 1e-04 10 0.14774780 0.02399512
4 1e-03 10 0.14774780 0.02399512
5 1e-02 10 0.09245223 0.02046032
6 1e-01 10 0.09202306 0.01938475
7 1e-06 100 0.14774780 0.02399512
8 1e-05 100 0.14774780 0.02399512
9 1e-04 100 0.14774780 0.02399512
10 1e-03 100 0.11794484 0.02368343
11 1e-02 100 0.08077885 0.01858195
12 1e-01 100 0.12356135 0.01661508
3. After retrieving the best performance parameters from the tuning result, you can
retrain the support vector machine with the best performance parameters:
> model.tuned = svm(churn~., data = trainset, gamma = tuned$best.parameters$gamma,
cost = tuned$best.parameters$cost)
> summary(model.tuned)
Call:
svm(formula = churn ~ ., data = trainset, gamma = 10^-2, cost
= 100)
Parameters:
SVM-Type: C-classification
SVM-Kernel: radial
cost: 100
gamma: 0.01
Number of Support Vectors: 547
( 304 243 )
Number of Classes: 2
Levels:
yes no
4. Then, you can use the predict function to predict labels based on the fitted SVM:
> svm.tuned.pred = predict(model.tuned, testset[, !names(testset)
%in% c("churn")])
5. Next, generate a classification table based on the predicted and original labels of the
testing dataset:
> svm.tuned.table=table(svm.tuned.pred, testset$churn)
> svm.tuned.table
svm.tuned.pred yes no
yes 95 24
no 46 853
6. Also, generate a class agreement to measure the performance:
> classAgreement(svm.tuned.table)
$diag
[1] 0.9312377
$kappa
[1] 0.691678
$rand
[1] 0.871806
$crand
[1] 0.6303615
7. Finally, you can use a confusion matrix to measure the performance of the
retrained model:
> confusionMatrix(svm.tuned.table)
Confusion Matrix and Statistics
svm.tuned.pred yes no
yes 95 24
no 46 853
Accuracy : 0.9312
95% CI : (0.9139, 0.946)
No Information Rate : 0.8615
P-Value [Acc > NIR] : 1.56e-12
Kappa : 0.6917
Mcnemar's Test P-Value : 0.01207
Sensitivity : 0.67376
Specificity : 0.97263
Pos Pred Value : 0.79832
Neg Pred Value : 0.94883
Prevalence : 0.13851
Detection Rate : 0.09332
Detection Prevalence : 0.11690
Balanced Accuracy : 0.82320
'Positive' Class : yes
How it works...
To tune the support vector machine, you can use a trial and error method to find the best
gamma and cost parameters. In other words, one has to generate a variety of combinations of
gamma and cost for the purpose of training different support vector machines.
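The trial-and-error search described above can be sketched as an explicit double loop (an illustration of what tune.svm automates; it assumes trainset from the earlier recipes, and uses the cross argument of svm, which reports a cross-validated accuracy in tot.accuracy):

```r
# Manual grid search over gamma and cost; tune.svm does this for you.
results = data.frame()
for (g in 10^(-6:-1)) {
  for (c in 10^(1:2)) {
    fit = svm(churn~., data = trainset, gamma = g, cost = c, cross = 10)
    results = rbind(results,
                    data.frame(gamma = g, cost = c, accuracy = fit$tot.accuracy))
  }
}
results[which.max(results$accuracy), ]  # best gamma/cost combination
```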
In this example, we generate different gamma values from 10^-6 to 10^-1, and cost with a
value of either 10 or 100. Therefore, you can use the tuning function, tune.svm, to generate
12 sets of parameters. The function then performs 10-fold cross-validation and outputs the error
dispersion of each combination. As a result, the combination with the least error dispersion
is regarded as the best parameter set. From the summary table, we found that gamma with
a value of 0.01 and cost with a value of 100 are the best parameters for the SVM fit.
After obtaining the best parameters, we can then train a new support vector machine with
gamma equal to 0.01 and cost equal to 100. Additionally, we can obtain a classification
table based on the predicted labels and labels of the testing dataset. We can also obtain a
confusion matrix from the classification table. From the output of the confusion matrix, you
can determine the accuracy of the newly trained model in comparison to the original model.
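For instance, the two accuracies can be placed side by side from the classification tables built in this recipe and the earlier prediction recipe (a sketch; it assumes both tables exist in the session):

```r
# Accuracy of the original fit versus the tuned fit.
sum(diag(svm.table)) / sum(svm.table)              # original model: ~0.9185
sum(diag(svm.tuned.table)) / sum(svm.tuned.table)  # tuned model:    ~0.9312
```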
See also
For more information about how to tune SVM with tune.svm, you can use the help
function to access this document:
> ?tune.svm
Training a neural network with neuralnet
The neural network is constructed with an interconnected group of nodes, which involves the
input, connected weights, processing element, and output. Neural networks can be applied to
many areas, such as classification, clustering, and prediction. To train a neural network in R,
you can use neuralnet, which is built to train multilayer perceptrons in the context of regression
analysis, and contains many flexible functions for training feed-forward neural networks. In this recipe,
we will introduce how to use neuralnet to train a neural network.
Getting ready
In this recipe, we will use an iris dataset as our example dataset. We will first split the iris
dataset into a training and testing datasets, respectively.
How to do it...
Perform the following steps to train a neural network with neuralnet:
1. First, load the iris dataset and split the data into training and testing datasets:
> data(iris)
> ind = sample(2, nrow(iris), replace = TRUE, prob=c(0.7, 0.3))
> trainset = iris[ind == 1,]
> testset = iris[ind == 2,]
2. Then, install and load the neuralnet package:
> install.packages("neuralnet")
> library(neuralnet)
3. Add the columns setosa, virginica, and versicolor based on the matched value of the Species column:
> trainset$setosa = trainset$Species == "setosa"
> trainset$virginica = trainset$Species == "virginica"
> trainset$versicolor = trainset$Species == "versicolor"
4. Next, train the neural network with the neuralnet function, using a single hidden
layer of three neurons. Notice that the results may vary with each training run, so you
might not get the same result. However, you can use set.seed at the beginning so that
you can get the same result in every training process:
> network = neuralnet(versicolor + virginica + setosa ~ Sepal.Length +
Sepal.Width + Petal.Length + Petal.Width, trainset, hidden=3)
> network
Call: neuralnet(formula = versicolor + virginica + setosa ~ Sepal.Length +
Sepal.Width + Petal.Length + Petal.Width, data = trainset, hidden = 3)
1 repetition was calculated.
Error Reached Threshold Steps
1 0.8156100175 0.009994274769 11063
5. Now, you can view the summary information by accessing the result.matrix
attribute of the built neural network model:
> network$result.matrix
error 0.815610017474
reached.threshold 0.009994274769
steps 11063.000000000000
Intercept.to.1layhid1 1.686593311644
Sepal.Length.to.1layhid1 0.947415215237
Sepal.Width.to.1layhid1 -7.220058260187
Petal.Length.to.1layhid1 1.790333443486
Petal.Width.to.1layhid1 9.943109233330
Intercept.to.1layhid2 1.411026063895
Sepal.Length.to.1layhid2 0.240309549505
Sepal.Width.to.1layhid2 0.480654059973
Petal.Length.to.1layhid2 2.221435192437
Petal.Width.to.1layhid2 0.154879347818
Intercept.to.1layhid3 24.399329878242
Sepal.Length.to.1layhid3 3.313958088512
Sepal.Width.to.1layhid3 5.845670010464
Petal.Length.to.1layhid3 -6.337082722485
Petal.Width.to.1layhid3 -17.990352566695
Intercept.to.versicolor -1.959842102421
1layhid.1.to.versicolor 1.010292389835
1layhid.2.to.versicolor 0.936519720978
1layhid.3.to.versicolor 1.023305801833
Intercept.to.virginica -0.908909982893
1layhid.1.to.virginica -0.009904635231
1layhid.2.to.virginica 1.931747950462
1layhid.3.to.virginica -1.021438938226
Intercept.to.setosa 1.500533827729
1layhid.1.to.setosa -1.001683936613
1layhid.2.to.setosa -0.498758815934
1layhid.3.to.setosa -0.001881935696
6. Lastly, you can view the generalized weights by accessing them in the network:
> head(network$generalized.weights[[1]])
How it works...
The neural network is a network made up of artificial neurons (or nodes). There are three
types of neurons within the network: input neurons, hidden neurons, and output neurons.
In the network, neurons are connected; the connection strength between neurons is called
weights. If the weight is greater than zero, it is in an excitation status. Otherwise, it is in an
inhibition status. Input neurons receive the input information; the higher the input value, the
greater the activation. Then, the activation value is passed through the network in regard to
weights and transfer functions in the graph. The hidden neurons (or output neurons) then
sum up the activation values and modify the summed values with the transfer function. The
activation value then flows through hidden neurons and stops when it reaches the output
nodes. As a result, one can use the output value from the output neurons to classify the data.
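The flow of activations described above can be sketched in a few lines. The following is a language-agnostic illustration in Python rather than R, with made-up weights and a logistic transfer function; it is not the network trained by neuralnet, only a sketch of how activation values travel from input to output neurons:

```python
import math

def sigmoid(x):
    # Logistic transfer function (neuralnet's default activation)
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    # Each hidden neuron sums its weighted inputs plus an intercept,
    # then squashes the sum with the transfer function...
    hidden = [sigmoid(b + sum(w * x for w, x in zip(ws, inputs)))
              for ws, b in zip(hidden_w, hidden_b)]
    # ...and each output neuron does the same with the hidden activations.
    return [sigmoid(b + sum(w * h for w, h in zip(ws, hidden)))
            for ws, b in zip(out_w, out_b)]

# A tiny 2-2-1 network with made-up weights, purely for illustration
out = forward([0.5, 1.0],
              hidden_w=[[1.0, -0.5], [0.3, 0.8]], hidden_b=[0.1, -0.2],
              out_w=[[2.0, -1.0]], out_b=[0.0])
print(out)
```

A weight greater than zero excites the downstream neuron, while a negative weight (such as the -0.5 and -1.0 above) inhibits it, matching the excitation/inhibition description.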
The advantages of a neural network are: first, it can detect nonlinear relationships between
the dependent and independent variable. Second, one can efficiently train large datasets
using the parallel architecture. Third, it is a nonparametric model so that one can eliminate
errors in the estimation of parameters. The main disadvantages of a neural network are that
it often converges to the local minimum rather than the global minimum. Also, it might over-fit
when the training process goes on for too long.
In this recipe, we demonstrate how to train a neural network. First, we split the iris dataset
into training and testing datasets, and then install the neuralnet package and load the
library into an R session. Next, we add the columns versicolor , setosa , and virginica
based on the name matched value in the Species column, respectively. We then use the
neuralnet function to train the network model. Besides specifying the label (the column
where the name equals to versicolor, virginica, and setosa) and training attributes in the
function, we also configure the number of hidden neurons (vertices) as three in each layer.
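The step of adding the versicolor, virginica, and setosa label columns amounts to building one indicator column per class. Below is a minimal Python sketch of the same idea (in R the recipe effectively does trainset$setosa = trainset$Species == "setosa", and so on); the species sample is made up:

```python
# Made-up sample of the Species column
species = ["setosa", "versicolor", "virginica", "setosa"]

# One indicator column per class label, mirroring the columns
# added to trainset before training
columns = {label: [int(s == label) for s in species]
           for label in ("versicolor", "virginica", "setosa")}

print(columns["setosa"])  # [1, 0, 0, 1]
```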
Then, we examine the basic information about the training process and the trained network
saved in the network. From the output message, it shows the training process needed
11,063 steps until all the absolute partial derivatives of the error function were lower than
0.01 (specified in the threshold). The error refers to the likelihood of calculating the Akaike
Information Criterion (AIC). To see detailed information on this, you can access the
result.matrix of the built neural network to see the estimated weights. The output reveals that
the estimated weights range from -18 to 24.40; the intercepts of the first hidden layer are 1.69,
1.41, and 24.40, and the four weights leading to the first hidden neuron are estimated as 0.95
( Sepal.Length ), -7.22 ( Sepal.Width ), 1.79 ( Petal.Length ), and 9.94 ( Petal.Width ).
We can lastly determine that the trained neural network information includes generalized
weights, which express the effect of each covariate. In this recipe, the model generates
12 generalized weights, which are the combination of four covariates ( Sepal.Length ,
Sepal.Width , Petal.Length , Petal.Width ) to three responses ( setosa , virginica ,
versicolor ).
See also
For a more detailed introduction on neuralnet, one can refer to the following paper:
Günther, F., and Fritsch, S. (2010). neuralnet: Training of neural networks. The R
journal, 2(1), 30-38
Visualizing a neural network trained by neuralnet
The package, neuralnet , provides the plot function to visualize a built neural network and
the gwplot function to visualize generalized weights. In the following recipe, we will cover how to
use these two functions.
Getting ready
You need to have completed the previous recipe by training a neural network and have all
basic information saved in the network.
How to do it...
Perform the following steps to visualize the neural network and the generalized weights:
1. You can visualize the trained neural network with the plot function:
> plot(network)
2. Furthermore, you can use gwplot to visualize the generalized weights:
> par(mfrow=c(2,2))
> gwplot(network,selected.covariate="Petal.Width")
> gwplot(network,selected.covariate="Sepal.Width")
> gwplot(network,selected.covariate="Petal.Length")
> gwplot(network,selected.covariate="Sepal.Length")
How it works...
In this recipe, we demonstrate how to visualize the trained neural network and the generalized
weights of each trained attribute. As per Figure 10, the plot displays the network topology of
the trained neural network. Also, the plot includes the estimated weight, intercepts and basic
information about the training process. At the bottom of the figure, one can find the overall
error and number of steps required to converge.
Figure 11 presents the generalized weight plot in regard to network$generalized.weights .
The four plots in Figure 11 display the four covariates: Petal.Width , Sepal.Width ,
Petal.Length , and Sepal.Length , in regard to the versicolor response. If all the generalized weights
are close to zero on the plot, it means the covariate has little effect. However, if the overall
variance is greater than one, it means the covariate has a nonlinear effect.
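The rule of thumb above can be expressed as a small check: generalized weights near zero for every observation suggest little effect, while an overall variance greater than one suggests a nonlinear effect. The following Python sketch uses made-up numbers and a made-up near-zero cutoff of 0.1; it only illustrates the interpretation, not anything computed by the neuralnet package:

```python
from statistics import variance

def covariate_effect(gw):
    # Generalized weights close to zero for every observation suggest the
    # covariate has little effect; an overall variance above one suggests
    # a nonlinear effect on the response.
    if all(abs(w) < 0.1 for w in gw):
        return "little effect"
    return "nonlinear effect" if variance(gw) > 1 else "roughly linear effect"

print(covariate_effect([0.01, -0.02, 0.03]))     # little effect
print(covariate_effect([0.5, 0.8, 0.6, 0.7]))    # roughly linear effect
print(covariate_effect([-2.5, 0.3, 2.8, -1.9]))  # nonlinear effect
```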
See also
For more information about gwplot , one can use the help function to access the
following document:
> ?gwplot
Predicting labels based on a model trained by neuralnet
Similar to other classification methods, we can predict the labels of new observations based
on trained neural networks. Furthermore, we can validate the performance of these networks
through the use of a confusion matrix. In the following recipe, we will introduce how to use
the compute function in a neural network to obtain a probability matrix of the testing dataset
labels, and use a table and confusion matrix to measure the prediction performance.
Getting ready
You need to have completed the previous recipe by generating the training dataset, trainset ,
and the testing dataset, testset . The trained neural network needs to be saved in the network.
How to do it...
Perform the following steps to measure the prediction performance of the trained neural
network:
1. First, generate a prediction probability matrix based on a trained neural network and
the testing dataset, testset :
> net.predict = compute(network, testset[-5])$net.result
2. Then, obtain other possible labels by finding the column with the greatest probability:
> net.prediction = c("versicolor", "virginica", "setosa")
[apply(net.predict, 1, which.max)]
3. Generate a classification table based on the predicted labels and the labels of the
testing dataset:
> predict.table = table(testset$Species, net.prediction)
> predict.table
prediction
setosa versicolor virginica
setosa 20 0 0
versicolor 0 19 1
virginica 0 2 16
4. Next, generate classAgreement from the classification table:
> classAgreement(predict.table)
$diag
[1] 0.9444444444
$kappa
[1] 0.9154488518
$rand
[1] 0.9224318658
$crand
[1] 0.8248251737
5. Finally, use confusionMatrix to measure the prediction performance:
> confusionMatrix(predict.table)
Confusion Matrix and Statistics
prediction
setosa versicolor virginica
setosa 20 0 0
versicolor 0 19 1
virginica 0 2 16
Overall Statistics
Accuracy : 0.9482759
95% CI : (0.8561954, 0.9892035)
No Information Rate : 0.362069
P-Value [Acc > NIR] : < 0.00000000000000022204
Kappa : 0.922252
Mcnemar's Test P-Value : NA
Statistics by Class:
Class: setosa Class: versicolor Class:
virginica
Sensitivity 1.0000000 0.9047619
0.9411765
Specificity 1.0000000 0.9729730
0.9512195
Pos Pred Value 1.0000000 0.9500000
0.8888889
Neg Pred Value 1.0000000 0.9473684
0.9750000
Prevalence 0.3448276 0.3620690
0.2931034
Detection Rate 0.3448276 0.3275862
0.2758621
Detection Prevalence 0.3448276 0.3448276
0.3103448
Balanced Accuracy 1.0000000 0.9388674
0.9461980
How it works...
In this recipe, we demonstrate how to predict labels based on a model trained by neuralnet.
Initially, we use the compute function to create an output probability matrix based on the
trained neural network and the testing dataset. Then, to convert the probability matrix to class
labels, we use the which.max function to determine the class label by selecting the column
with the maximum probability within the row. Next, we use a table to generate a classification
matrix based on the labels of the testing dataset and the predicted labels. As we have
created the classification table, we can employ a confusion matrix to measure the prediction
performance of the built neural network.
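The pipeline above (which.max per row, then accuracy and kappa from the classification table) can be reproduced numerically. Below is a Python sketch using the classification table from this recipe; predict_labels and kappa are hypothetical helper names, and kappa follows the standard Cohen's kappa formula, which is the statistic confusionMatrix reports:

```python
def predict_labels(prob_matrix, classes):
    # Mirrors which.max per row: pick the class whose column has the
    # highest probability in each row of the output matrix.
    return [classes[max(range(len(row)), key=row.__getitem__)]
            for row in prob_matrix]

def kappa(table):
    # Cohen's kappa from a square classification table
    # (rows = actual, columns = predicted)
    n = sum(sum(row) for row in table)
    po = sum(table[i][i] for i in range(len(table))) / n
    pe = sum(sum(table[i]) * sum(r[i] for r in table)
             for i in range(len(table))) / n ** 2
    return (po - pe) / (1 - pe)

# The classification table produced in this recipe
table = [[20, 0, 0],
         [0, 19, 1],
         [0, 2, 16]]
accuracy = sum(table[i][i] for i in range(3)) / 58
print(round(accuracy, 7))      # 0.9482759
print(round(kappa(table), 6))  # 0.922252
```

The two printed values match the Accuracy and Kappa lines of the confusionMatrix output above.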
See also
In this recipe, we use the net.result attribute, which is the overall result of
the neural network, used to predict the labels of the testing dataset. Apart from
examining the overall result by accessing net.result , the compute function also
generates the output from neurons in each layer. You can examine the output of
neurons to get a better understanding of how compute works:
> compute(network, testset[-5])
Training a neural network with nnet
The nnet package is another package that can deal with artificial neural networks. This
package provides the functionality to train feed-forward neural networks with traditional
back propagation. As you can find most of the neural network functions implemented in
the neuralnet package, in this recipe we provide a short overview of how to train neural
networks with nnet .
Getting ready
In this recipe, we do not use the trainset and testset generated from the previous step;
please reload the iris dataset again.
How to do it...
Perform the following steps to train the neural network with nnet :
1. First, install and load the nnet package:
> install.packages("nnet")
> library(nnet)
2. Next, split the dataset into training and testing datasets:
> data(iris)
> set.seed(2)
> ind = sample(2, nrow(iris), replace = TRUE, prob=c(0.7, 0.3))
> trainset = iris[ind == 1,]
> testset = iris[ind == 2,]
3. Then, train the neural network with nnet :
> iris.nn = nnet(Species ~ ., data = trainset, size = 2, rang =
0.1, decay = 5e-4, maxit = 200)
# weights: 19
initial value 165.086674
iter 10 value 70.447976
iter 20 value 69.667465
iter 30 value 69.505739
iter 40 value 21.588943
iter 50 value 8.691760
iter 60 value 8.521214
iter 70 value 8.138961
iter 80 value 7.291365
iter 90 value 7.039209
iter 100 value 6.570987
iter 110 value 6.355346
iter 120 value 6.345511
iter 130 value 6.340208
iter 140 value 6.337271
iter 150 value 6.334285
iter 160 value 6.333792
iter 170 value 6.333578
iter 180 value 6.333498
final value 6.333471
converged
4. Use the summary to obtain information about the trained neural network:
> summary(iris.nn)
a 4-2-3 network with 19 weights
options were - softmax modelling decay=0.0005
b->h1 i1->h1 i2->h1 i3->h1 i4->h1
-0.38 -0.63 -1.96 3.13 1.53
b->h2 i1->h2 i2->h2 i3->h2 i4->h2
8.95 0.52 1.42 -1.98 -3.85
b->o1 h1->o1 h2->o1
3.08 -10.78 4.99
b->o2 h1->o2 h2->o2
-7.41 6.37 7.18
b->o3 h1->o3 h2->o3
4.33 4.42 -12.16
How it works...
In this recipe, we demonstrate steps to train a neural network model with the nnet package.
We first use nnet to train the neural network. With this function, we can set the classification
formula, source of data, number of hidden units in the size parameter, initial random
weight in the rang parameter, parameter for weight decay in the decay parameter, and the
maximum iteration in the maxit parameter. As we set maxit to 200, the training process
repeatedly runs till the value of the fitting criterion plus the decay term converge. Finally, we
use the summary function to obtain information about the built neural network, which reveals
that the model is built with 4-2-3 networks with 19 weights. Also, the model shows a list of
weight transitions from one node to another at the bottom of the printed message.
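As a quick sanity check on the "# weights: 19" line, the parameter count of a fully connected feed-forward network with biases can be computed directly. The sketch below is in Python and the function name is made up; it only verifies the arithmetic behind nnet's printed weight count:

```python
def nnet_weight_count(n_input, n_hidden, n_output):
    # One weight per connection plus one bias ("b->") per hidden and
    # output unit, matching the "# weights" line nnet prints.
    return (n_input + 1) * n_hidden + (n_hidden + 1) * n_output

print(nnet_weight_count(4, 2, 3))  # 19 for the 4-2-3 iris model
```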
See also
For those who are interested in the background theory of nnet and how it is made, please
refer to the following articles:
Ripley, B. D. (1996). Pattern Recognition and Neural Networks. Cambridge
Venables, W. N., and Ripley, B. D. (2002). Modern Applied Statistics with S. Fourth
edition. Springer
Predicting labels based on a model trained by nnet
As we have trained a neural network with nnet in the previous recipe, we can now predict
the labels of the testing dataset based on the trained neural network.
Furthermore,we can assess the model with a confusion matrix adapted from the caret package.
Getting ready
You need to have completed the previous recipe by generating the training dataset, trainset,
and the testing dataset, testset, from the iris dataset.
The trained neural network also needs to be saved as iris.nn.
How to do it...
Perform the following steps to predict labels based on the trained neural network:
1. Generate the predictions of the testing dataset based on the model, iris.nn:
> iris.predict = predict(iris.nn, testset, type="class")
2. Generate a classification table based on the predicted labels and labels of the
testing dataset:
> nn.table = table(testset$Species, iris.predict)
iris.predict
setosa versicolor virginica
setosa 17 0 0
versicolor 0 14 0
virginica 0 1 14
3. Lastly, generate a confusion matrix based on the classification table:
> confusionMatrix(nn.table)
Confusion Matrix and Statistics
iris.predict
setosa versicolor virginica
setosa 17 0 0
versicolor 0 14 0
virginica 0 1 14
Overall Statistics
Accuracy : 0.9782609
95% CI : (0.8847282, 0.9994498)
No Information Rate : 0.3695652
P-Value [Acc > NIR] : < 0.00000000000000022204
Kappa : 0.9673063
Mcnemar's Test P-Value : NA
Statistics by Class:
Class: setosa Class: versicolor
Sensitivity 1.0000000 0.9333333
Specificity 1.0000000 1.0000000
Pos Pred Value 1.0000000 1.0000000
Neg Pred Value 1.0000000 0.9687500
Prevalence 0.3695652 0.3260870
Detection Rate 0.3695652 0.3043478
Detection Prevalence 0.3695652 0.3043478
Balanced Accuracy 1.0000000 0.9666667
Class: virginica
Sensitivity 1.0000000
Specificity 0.9687500
Pos Pred Value 0.9333333
Neg Pred Value 1.0000000
Prevalence 0.3043478
Detection Rate 0.3043478
Detection Prevalence 0.3260870
Balanced Accuracy 0.9843750
How it works...
Similar to other classification methods, one can also predict labels based on the neural
networks trained by nnet. First, we use the predict function to generate the predicted labels
based on a testing dataset, testset. Within the predict function, we specify the type argument
as class, so the output will be class labels instead of a probability matrix. Next, we use the
table function to generate a classification table based on the predicted labels and the labels
written in the testing dataset. Finally, as we have created the classification table, we can
employ a confusion matrix from the caret package to measure the prediction performance of
the trained neural network.
See also
For the predict function, if the type argument is not specified as class, by default, it will
generate a probability matrix as a prediction result, which is very similar to net.result
generated from the compute function within the neuralnet package:
> head(predict(iris.nn, testset))