
Association Analysis and Visualization of Market-Basket Data in R

Data format (one line per purchased item: transaction ID, item name):

1001,Choclates
1001,Pencil
1001,Marker
1002,Pencil
1002,Choclates
1003,Pencil
1003,Coke
1003,Eraser
1004,Pencil
1004,Choclates
1004,Cookies
1005,Marker
1006,Pencil
1006,Marker
1007,Pencil
1007,Choclates
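
To make the "single" layout above concrete, here is a minimal base R sketch that groups the rows by transaction ID into one basket per transaction. The names `sales`, `tid`, `item` and `baskets` are illustrative assumptions, not part of the original post, and the sample rows are pasted inline instead of being read from a file.

```r
# Sample rows from the listing above, one "transaction id,item" pair per line.
raw_lines <- c(
  "1001,Choclates", "1001,Pencil", "1001,Marker",
  "1002,Pencil", "1002,Choclates",
  "1003,Pencil", "1003,Coke", "1003,Eraser",
  "1004,Pencil", "1004,Choclates", "1004,Cookies",
  "1005,Marker",
  "1006,Pencil", "1006,Marker",
  "1007,Pencil", "1007,Choclates"
)

# Parse the two columns; the column names are hypothetical.
sales <- read.csv(text = raw_lines, header = FALSE,
                  col.names = c("tid", "item"), stringsAsFactors = FALSE)

# Group items by transaction id: one character vector per basket.
baskets <- split(sales$item, sales$tid)

print(length(baskets))    # number of distinct transactions
print(baskets[["1001"]])  # items bought in transaction 1001
```

Each element of `baskets` is what the "basket" format would store on a single line.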

R Source Code:

# Install the arules package (only needed once)
install.packages("arules")
# Load the arules package
library("arules")
# Read the transaction file as an object of class "transactions".
# file          - path to the csv/txt file
# format        - "single" or "basket". In "basket" format, each line of the file is one transaction
#                 whose items (item labels) are separated by the characters given in sep. In "single"
#                 format, each line holds a single item and contains at least the transaction id and the item id.
# rm.duplicates - TRUE/FALSE; whether duplicate items within a transaction are removed
# cols          - For "single" format, a numeric vector of length two giving the column (field) numbers
#                 of the transaction ids and the item ids, respectively. For "basket" format, cols can be a
#                 numeric scalar giving the column (field) number of the transaction ids. If cols = NULL,
#                 the data are taken to contain no transaction ids.
# sep           - "," for csv, "\t" for tab-delimited
txn <- read.transactions(file = "D:\\Transactions_sample.csv", rm.duplicates = FALSE, format = "single", sep = ",", cols = c(1, 2))
# Run the apriori algorithm (minimum support 0.5, minimum confidence 0.9)
basket_rules <- apriori(txn, parameter = list(support = 0.5, confidence = 0.9, target = "rules"))
# Check the generated rules using inspect
inspect(basket_rules)
# If a huge number of rules is generated, specific rules can be read by index
inspect(basket_rules[1])
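
To see what the support and confidence thresholds mean for the sample data, the following base R sketch computes both measures by hand for the rule {Choclates} => {Pencil}. It does not use arules; the `baskets` list and the helper `support()` are hypothetical names introduced only for this illustration.

```r
# The seven sample transactions, written out as a named list of baskets.
baskets <- list(
  "1001" = c("Choclates", "Pencil", "Marker"),
  "1002" = c("Pencil", "Choclates"),
  "1003" = c("Pencil", "Coke", "Eraser"),
  "1004" = c("Pencil", "Choclates", "Cookies"),
  "1005" = c("Marker"),
  "1006" = c("Pencil", "Marker"),
  "1007" = c("Pencil", "Choclates")
)

# Hypothetical helper: fraction of baskets that contain every item in `items`.
support <- function(items) {
  mean(vapply(baskets, function(b) all(items %in% b), logical(1)))
}

# Rule {Choclates} => {Pencil}
sup_rule  <- support(c("Choclates", "Pencil"))  # 4 of 7 baskets
conf_rule <- sup_rule / support("Choclates")    # every Choclates basket also has Pencil

# Reverse rule {Pencil} => {Choclates}
conf_reverse <- sup_rule / support("Pencil")    # only 4 of the 6 Pencil baskets

print(c(support = sup_rule, confidence = conf_rule, reverse_confidence = conf_reverse))
```

With a minimum support of 0.5 and a minimum confidence of 0.9, {Choclates} => {Pencil} clears both thresholds, while the reverse rule falls short on confidence.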

# Visualize the item frequencies in the transaction data
itemFrequencyPlot(txn)
# See how the transaction file was read into the txn variable
inspect(txn)
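
As a rough cross-check of the item frequency chart, the relative frequencies can be tabulated in base R. This is only a sketch: the `item` vector and the transaction count are copied by hand from the sample data, and the calculation assumes each item appears at most once per transaction, which holds for this sample.

```r
# Item column of the sample data, in file order.
item <- c("Choclates", "Pencil", "Marker",
          "Pencil", "Choclates",
          "Pencil", "Coke", "Eraser",
          "Pencil", "Choclates", "Cookies",
          "Marker",
          "Pencil", "Marker",
          "Pencil", "Choclates")

n_transactions <- 7  # transaction ids 1001 to 1007

# Count how many transactions contain each item, most frequent first.
counts <- sort(table(item), decreasing = TRUE)

# Relative frequency = share of transactions that contain the item.
rel_freq <- counts / n_transactions

print(round(rel_freq, 3))
```

Passing `rel_freq` to `barplot()` would draw a comparable bar chart without arules.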

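library(arulesViz)
# arulesViz offers many kinds of plots; here are a few attractive ones. All of them take a rules object as input.
plot(basket_rules, shading = "order", control = list(main = "Two-key plot"))
plot(basket_rules, method = "grouped")
plot(basket_rules, method = "graph")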
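
Plots of rules are more informative when there are several rules to show. The sketch below mines the same sample data with looser thresholds and ranks the result by lift before plotting. The thresholds (support 0.25, confidence 0.5, minlen 2) and the names `sales`, `rules` and `rules_by_lift` are assumptions chosen for illustration, not values from the original post, and the transactions are built in memory rather than read from a file.

```r
library(arules)

# Sample data in "single" layout: one row per (transaction id, item) pair.
sales <- data.frame(
  tid  = c(1001, 1001, 1001, 1002, 1002, 1003, 1003, 1003,
           1004, 1004, 1004, 1005, 1006, 1006, 1007, 1007),
  item = c("Choclates", "Pencil", "Marker", "Pencil", "Choclates",
           "Pencil", "Coke", "Eraser", "Pencil", "Choclates", "Cookies",
           "Marker", "Pencil", "Marker", "Pencil", "Choclates"),
  stringsAsFactors = FALSE
)

# Convert the per-transaction item lists into an arules transactions object.
txn <- as(split(sales$item, sales$tid), "transactions")

# Looser, illustrative thresholds; minlen = 2 skips rules with an empty left-hand side.
rules <- apriori(txn,
                 parameter = list(support = 0.25, confidence = 0.5,
                                  minlen = 2, target = "rules"),
                 control = list(verbose = FALSE))

# Rank by lift and look at the strongest rules first.
rules_by_lift <- sort(rules, by = "lift")
inspect(head(rules_by_lift, 5))
```

The sorted object can then be passed to the plotting calls, for example `plot(rules_by_lift, method = "graph")`.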

[Figures: itemFrequencyPlot, Two-key plot, Grouped Matrix, Graph Plot]
