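Applying the Gaussian Distribution to Anomaly Detection (Part 2)
(Original article; please credit the source when reposting.)
Part 1, "Applying the Gaussian Distribution to Anomaly Detection (Part 1)", described how the Gaussian distribution can be used for anomaly detection. This part implements, in R, the two models described there: a model built from multiple univariate Gaussian distributions, and a single multivariate Gaussian distribution model.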
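For reference, the two models can be written as the densities below. The notation is an assumption made here (n features, per-feature mean \mu_j and standard deviation \sigma_j, mean vector \mu, covariance matrix \Sigma); it follows the standard textbook form and is not copied from Part 1.

```latex
% Model 1: product of n independent univariate Gaussian densities
p(x) = \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi}\,\sigma_j}
       \exp\!\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)

% Model 2: a single n-dimensional multivariate Gaussian density
p(x) = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}
       \exp\!\left(-\frac{1}{2}(x-\mu)^{\top}\Sigma^{-1}(x-\mu)\right)
```

When \Sigma is diagonal with entries \sigma_j^2, the second expression reduces to the first.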
I. The multiple univariate Gaussian distributions model
## Parameters:
##   x             - a vector holding the feature values of one new sample.
##   X             - a matrix of training samples (one row per sample).
##   parameterFile - path of the parameter file, which stores the
##                   parameters of the MultiUnivariate Norm model.
##   isTraining    - flag: TRUE triggers training,
##                   FALSE skips training and loads the parameters from the file.
funMultiUnivariateNorm <- function(x, X = NULL, parameterFile = ".MultiUnivariateNorm", isTraining = FALSE)
{
    if (isTraining) {
        if (is.null(X)) {
            cat("X is NULL, MultiUnivariateNorm model can't be trained\n")
            ## return() needs parentheses; a bare `return` does not exit the function
            return(invisible(NULL))
        }
        numOfFeatures <- ncol(X)

        vectrMean <- colMeans(X)
        vectrSD <- apply(X, 2, sd)   # sample standard deviation of each feature

        ## write the parameters to the file
        ## 1st line: the means, separated by one blank
        ## 2nd line: the SDs, separated by one blank
        matrixMeanSD <- matrix(c(vectrMean, vectrSD), ncol = numOfFeatures, byrow = TRUE)
        # validation of parameterFile is left to write.table
        write.table(x = matrixMeanSD, file = parameterFile, row.names = FALSE, col.names = FALSE, sep = " ")
    } else {
        matrixMeanSD <- as.matrix(read.table(file = parameterFile))
        vectrMean <- matrixMeanSD[1, ]
        vectrSD <- matrixMeanSD[2, ]
    }

    ## density of each feature of the new sample under its own univariate normal
    vectrProbabilityNewSample <- dnorm(x, mean = vectrMean, sd = vectrSD, log = FALSE)
    prod(vectrProbabilityNewSample)  # joint density of the new sample
}
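The same computation can be sketched in memory, without the parameter file, to show how the resulting density might be used to flag an anomaly. The synthetic data, the helper name `density_of`, and the threshold `epsilon` are all hypothetical and chosen only for illustration; in practice the threshold would be tuned on labelled validation data.

```r
## Minimal sketch: per-feature Gaussians fitted on synthetic data.
set.seed(42)
X <- cbind(rnorm(200, mean = 10, sd = 2),   # feature 1 (synthetic)
           rnorm(200, mean = 50, sd = 5))   # feature 2 (synthetic)

mu    <- colMeans(X)          # per-feature means
sigma <- apply(X, 2, sd)      # per-feature sample standard deviations

## Hypothetical helper: product of the per-feature densities for one sample.
density_of <- function(x) prod(dnorm(x, mean = mu, sd = sigma))

p_typical <- density_of(c(10, 50))   # close to the centre of the data
p_outlier <- density_of(c(25, 10))   # far from the centre on both features

epsilon <- 1e-4                      # assumed threshold, not from the article
is_anomaly <- c(typical = p_typical < epsilon,
                outlier = p_outlier < epsilon)
print(is_anomaly)
```

A sample is flagged when its density falls below the threshold, so the choice of `epsilon` controls the trade-off between missed anomalies and false alarms.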
II. The single multivariate Gaussian distribution model
## The mvtnorm package must be installed before using this function.
## Install it with install.packages("mvtnorm")
## and load it into the R workspace with library(mvtnorm).
##
## Parameters:
##   x             - a vector: the data of one sample whose output is to be
##                   calculated by the Multivariate Norm model; or
##                   a matrix: each row is one such sample.
##   X             - a matrix of training samples (one row per sample).
##   parameterFile - path of the parameter file, which stores the
##                   parameters of the Multivariate Norm model.
##   isTraining    - flag: TRUE triggers training,
##                   FALSE skips training and loads the parameters from the file.
funMultivariateNorm <- function(x, X = NULL, parameterFile = ".MultivariateNorm", isTraining = FALSE)
{
    if (isTraining) {
        if (is.null(X)) {
            cat("X is NULL, MultivariateNorm model can't be trained\n")
            ## return() needs parentheses; a bare `return` does not exit the function
            return(invisible(NULL))
        }

        vectrMean <- colMeans(X)
        matrixSigma <- cov(X)
        ## write the parameters to the file
        ## 1st line: the means, separated by one blank
        ## 2nd line to the last line: rows of the covariance matrix, separated by one blank
        matrixMeanCov <- rbind(vectrMean, matrixSigma)
        # validation of parameterFile is left to write.table
        write.table(x = matrixMeanCov, file = parameterFile, row.names = FALSE, col.names = FALSE, sep = " ")
    } else {
        ## unname() drops the row and column names added by read.table
        matrixMeanCov <- unname(as.matrix(read.table(file = parameterFile)))
        vectrMean <- matrixMeanCov[1, ]
        matrixSigma <- matrixMeanCov[-1, , drop = FALSE]
    }

    dmvnorm(x, mean = vectrMean, sigma = matrixSigma, log = FALSE)  # density of the new sample(s)
}
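To make the relationship between the two models concrete, the sketch below evaluates the multivariate normal density by hand using only base R, so it runs without mvtnorm. The helper name `mvn_density` and all numeric values are hypothetical; it is an illustration of the density formula, not a replacement for `dmvnorm`.

```r
## Hypothetical helper: multivariate normal density for a single sample x.
mvn_density <- function(x, mu, Sigma) {
  k  <- length(mu)
  d2 <- mahalanobis(x, center = mu, cov = Sigma)  # squared Mahalanobis distance
  exp(-0.5 * d2) / sqrt((2 * pi)^k * det(Sigma))
}

mu  <- c(0, 0)
sds <- c(1, 2)
x   <- c(0.5, -1)

## With a diagonal covariance matrix the multivariate density equals
## the product of the univariate densities (the model of section I).
p_diag <- mvn_density(x, mu, diag(sds^2))
p_prod <- prod(dnorm(x, mean = mu, sd = sds))
print(all.equal(p_diag, p_prod))

## With correlated features the two models differ: a point that follows
## the correlation is more likely than one that goes against it.
Sigma_corr <- matrix(c(1, 0.8, 0.8, 1), nrow = 2)
p_along   <- mvn_density(c(1,  1), mu, Sigma_corr)
p_against <- mvn_density(c(1, -1), mu, Sigma_corr)
print(c(along = p_along, against = p_against))
```

The second comparison is the case the per-feature model cannot capture: both points have the same per-feature densities, yet only the multivariate model separates them.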