码迷,mamicode.com
首页 > 其他好文 > 详细

R_Studio(关联)对dvdtrans.csv数据进行关联规则分析

时间:2018-11-03 01:52:27      阅读:232      评论:0      收藏:0      [点我收藏+]

标签:规则方法   form   rem   port   处理   分布   specific   style   check   

 

 


  dvdtrans.csv数据:该原始数据仅仅包含了两个字段(ID, Item) 用户ID,商品名称(共30条)

  技术分享图片

 

  

技术分享图片
#导入arules包
#install.packages("arules")
library (arules)

setwd(D:\\data) 
Gary=read.csv(file="dvdtrans.csv",header=T)

# 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions")

# 查看一下数据
#attributes(Gary)
summary(Gary)

# 使用apriori函数生成关联规则
rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5))

# 查看一下数据
inspect(rules)
Gary.R

 

 

实现过程

 

  导入arules包

  对数据进行预处理

#导入arules包
#install.packages("arules")
library (arules)

setwd(D:\\data) 
Gary=read.csv(file="dvdtrans.csv",header=T)

# 将数据转换为arules关联规则方法apriori 可以处理的数据形式.交易数据
# transactions "事务"
Gary<- as(split(Gary$Item, Gary$ID),"transactions")

 

> # 查看一下数据
> #attributes(Gary)
> summary(Gary)
transactions as itemMatrix in sparse format with
 10 rows (elements/itemsets/transactions) and            10行(元素/项集/事务)
 10 columns (items) and a density of 0.3                10列(项)和0.3的密度

most frequent items:                           最常见的项目(频率):
    Gladiator       Patriot   Sixth Sense    Green Mile Harry Potter1       (Other) 
            7             6             6             2             2             7 

element (itemset/transaction) length distribution:          元素(项集/事务)长度分布:
sizes
2 3 4 5 
3 5 1 1 

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   2.00    2.25    3.00    3.00    3.00    5.00 

includes extended item information - examples:
      labels
1 Braveheart
2  Gladiator
3 Green Mile

includes extended transaction information - examples:
  transactionID
1             1
2             2
3             3

 

  生成关联规则

 

> # 使用apriori函数生成关联规则
> rules <- apriori(Gary, parameter=list(support=0.3,confidence=0.5))
Apriori

Parameter specification:
 confidence minval smax arem  aval originalSupport maxtime support minlen maxlen target   ext
        0.5    0.1    1 none FALSE            TRUE       5     0.3      1     10  rules FALSE

Algorithmic control:
 filter tree heap memopt load sort verbose
    0.1 TRUE TRUE  FALSE TRUE    2    TRUE

Absolute minimum support count: 3 

set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[10 item(s), 10 transaction(s)] done [0.00s].
sorting and recoding items ... [3 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [12 rule(s)] done [0.00s].
creating S4 object  ... done [0.00s].
> 
> # 查看一下数据
> inspect(rules)
     lhs                        rhs           support confidence lift     count
[1]  {}                      => {Patriot}     0.6     0.6000000  1.000000 6    
[2]  {}                      => {Sixth Sense} 0.6     0.6000000  1.000000 6    
[3]  {}                      => {Gladiator}   0.7     0.7000000  1.000000 7    
[4]  {Patriot}               => {Sixth Sense} 0.4     0.6666667  1.111111 4    
[5]  {Sixth Sense}           => {Patriot}     0.4     0.6666667  1.111111 4    
[6]  {Patriot}               => {Gladiator}   0.6     1.0000000  1.428571 6    
[7]  {Gladiator}             => {Patriot}     0.6     0.8571429  1.428571 6    
[8]  {Sixth Sense}           => {Gladiator}   0.5     0.8333333  1.190476 5    
[9]  {Gladiator}             => {Sixth Sense} 0.5     0.7142857  1.190476 5    
[10] {Patriot,Sixth Sense}   => {Gladiator}   0.4     1.0000000  1.428571 4    
[11] {Gladiator,Patriot}     => {Sixth Sense} 0.4     0.6666667  1.111111 4    
[12] {Gladiator,Sixth Sense} => {Patriot}     0.4     0.8000000  1.333333 4   

 

R_Studio(关联)对dvdtrans.csv数据进行关联规则分析

标签:规则方法   form   rem   port   处理   分布   specific   style   check   

原文地址:https://www.cnblogs.com/1138720556Gary/p/9899048.html

(0)
(0)
   
举报
评论 一句话评论(0
登录后才能评论!
© 2014 mamicode.com 版权所有  联系我们:gaon5@hotmail.com
迷上了代码!