Constraint-Based Pattern Mining

Posted: 2015-02-21 21:05:46

Tags: data, constraints, pattern

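This post discusses how to mine patterns under constraints in data mining, that is, how to impose conditions on the data to be mined and filter it accordingly.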

Why use constraint-based pattern mining? Because constraints let us apply different pruning methods to restrict the pattern mining process.
More specifically:

  • Finding all the patterns in a dataset autonomously is unrealistic
    • There are too many patterns, and most of them are not necessarily of interest to the user
  • Pattern mining should be an interactive process
    • The user directs what is to be mined using a data mining query language (or a graphical user interface)
  • Constraint-based mining
    • User flexibility: the user provides constraints on what is to be mined
    • Optimization: the system exploits such constraints for efficient mining
      • Constraint pushing: similar to pushing selections first in database query processing

Constraints in General Data Mining

A data mining query can take the form of a meta-rule or use the following language primitives:

  • Knowledge type constraint
    • Ex.: classification, association, clustering, outlier finding, ...
  • Data constraint: using SQL-like queries
    • Ex.: find products sold together in NY stores this year
  • Dimension/level constraint
    • Ex.: in relevance to region, price, brand, customer category
  • Rule (or pattern) constraint
    • Ex.: small sales (price < $10) trigger big sales (sum > $200)
  • Interestingness constraint
    • Ex.: strong rules: min_sup 0.02, min_conf 0.6, min_correlation 0.7
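
As a rough illustration of the interestingness constraint, the sketch below computes support and confidence for one candidate rule over a toy transaction list and compares them with min_sup and min_conf. The transactions, the function names `support` and `is_strong`, and the threshold values in the usage lines are all made up for illustration; min_correlation is left out because the text does not say which correlation measure is meant.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def is_strong(antecedent, consequent, transactions, min_sup, min_conf):
    """Check a rule antecedent -> consequent against both thresholds."""
    sup_rule = support(set(antecedent) | set(consequent), transactions)
    sup_ante = support(antecedent, transactions)
    if sup_ante == 0:
        return False  # the antecedent never occurs, so the rule is undefined
    confidence = sup_rule / sup_ante
    return sup_rule >= min_sup and confidence >= min_conf


# Hypothetical toy data, not taken from the article.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# support = 2/4, confidence = (2/4) / (3/4) = 2/3
print(is_strong({"bread"}, {"milk"}, transactions, min_sup=0.4, min_conf=0.6))
print(is_strong({"bread"}, {"milk"}, transactions, min_sup=0.4, min_conf=0.7))
```

With these toy values the rule passes the first pair of thresholds and fails the second, because its confidence is about 0.67.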

Different Kinds of Constraints: Different Pruning Methods

  • Constraints can be categorized as
    • Pattern space pruning constraints vs. data space pruning constraints
  • Pattern space pruning constraints
    • Anti-monotonic: if an itemset S violates constraint c, so does every superset of S, so mining beyond S can be terminated
    • Monotonic: if S satisfies c, so does every superset of S, so there is no need to check c again
    • Succinct: the constraint c can be enforced by directly manipulating the data
    • Convertible: c can be converted to monotonic or anti-monotonic if the items are properly ordered during processing
  • Data space pruning constraints
    • Data succinct: the data space can be pruned at the start of the pattern mining process
    • Data anti-monotonic: if a transaction t does not satisfy c, then t can be pruned to reduce data processing effort

Pattern Anti-monotonicity

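Here, range(S.profit) denotes max(S.profit) - min(S.profit).
Because the support of an itemset S can only stay the same or decrease as more items are added, the answer to Ex. 4 is yes.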
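
To make the pruning concrete, here is a minimal sketch that enumerates itemsets depth-first under a constraint of the form range(S.profit) <= max_range. Adding an item can only widen the range, so the constraint is anti-monotonic and a violating itemset is never extended. The function name `mine_with_range_limit`, the profit table and the threshold 15 are assumptions chosen for illustration; a real miner would check support as well.

```python
def mine_with_range_limit(profit, max_range):
    """Enumerate itemsets S with max(S.profit) - min(S.profit) <= max_range."""
    items = sorted(profit)  # fixed item order so each itemset is built once
    results = []

    def extend(prefix, lo, hi, start):
        for i in range(start, len(items)):
            p = profit[items[i]]
            new_lo, new_hi = min(lo, p), max(hi, p)
            if new_hi - new_lo > max_range:
                # Violated: every superset has a range at least this wide,
                # so this candidate is dropped and never extended.
                continue
            candidate = prefix + (items[i],)
            results.append(candidate)
            extend(candidate, new_lo, new_hi, i + 1)

    extend((), float("inf"), float("-inf"), 0)
    return results


# Hypothetical profit values, not taken from the article.
profit = {"a": 10, "b": 30, "c": 18}
print(mine_with_range_limit(profit, max_range=15))
# [('a',), ('a', 'c'), ('b',), ('b', 'c'), ('c',)]
```

In this run ('a', 'b') has range 20 and is cut off, so ('a', 'b', 'c') is never generated at all.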

Pattern Monotonicity

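A small sketch of how a monotonic constraint saves work, assuming the constraint sum(S.price) >= min_total with non-negative prices: once an itemset satisfies it, every superset does too, so the check is skipped for all of its extensions. The function name `mine_with_monotone`, the price table and the threshold are hypothetical, and the counter `checks` exists only to show how many evaluations were actually performed.

```python
def mine_with_monotone(price, min_total):
    """Enumerate itemsets whose total price is >= min_total.

    Returns the satisfying itemsets and the number of times the
    constraint was actually evaluated.
    """
    items = sorted(price)
    results = []
    checks = 0

    def extend(prefix, total, satisfied, start):
        nonlocal checks
        for i in range(start, len(items)):
            new_total = total + price[items[i]]
            ok = satisfied  # inherited from the prefix: no re-check needed
            if not ok:
                checks += 1
                ok = new_total >= min_total
            candidate = prefix + (items[i],)
            if ok:
                results.append(candidate)
            # Unlike the anti-monotonic case, failing itemsets are still
            # extended, because a superset may satisfy the constraint.
            extend(candidate, new_total, ok, i + 1)

    extend((), 0, False, 0)
    return results, checks


# Hypothetical prices, not taken from the article.
price = {"a": 4, "b": 7, "c": 3}
itemsets, checks = mine_with_monotone(price, min_total=10)
print(itemsets)  # [('a', 'b'), ('a', 'b', 'c'), ('b', 'c')]
print(checks)    # 6
```

Seven itemsets are visited but the constraint is evaluated only six times: ('a', 'b', 'c') inherits the result from ('a', 'b').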

Data Anti-monotonicity

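The following sketch shows data space pruning under an assumed constraint sum(S.price) >= min_total with non-negative prices: if even the full set of items in a transaction t cannot reach min_total, no pattern contained in t can, so t is dropped before mining. The function name `prune_transactions` and the sample data are hypothetical.

```python
def prune_transactions(transactions, price, min_total):
    """Keep only transactions whose items could still satisfy the constraint."""
    kept = []
    for t in transactions:
        best_possible = sum(price[item] for item in t)  # upper bound for any S in t
        if best_possible >= min_total:
            kept.append(t)
    return kept


# Hypothetical prices and transactions, not taken from the article.
price = {"a": 4, "b": 7, "c": 3, "d": 1}
transactions = [
    {"a", "b"},       # total 11: kept
    {"c", "d"},       # total 4: pruned
    {"a", "c", "d"},  # total 8: pruned
    {"b", "c"},       # total 10: kept
]
print(prune_transactions(transactions, price, min_total=10))
```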

Succinct Constraints

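As one possible reading of "enforced by directly manipulating the data", the sketch below handles an assumed constraint max(S.price) <= max_price by removing every item that is too expensive before mining starts. After that, any itemset mined from the projected data satisfies the constraint without further checks. The function name `enforce_max_price` and the sample data are hypothetical.

```python
def enforce_max_price(transactions, price, max_price):
    """Project transactions onto the items that are allowed by the constraint."""
    allowed = {item for item, p in price.items() if p <= max_price}
    projected = [t & allowed for t in transactions]
    return [t for t in projected if t]  # drop transactions that became empty


# Hypothetical prices and transactions, not taken from the article.
price = {"a": 4, "b": 7, "c": 3, "d": 12}
transactions = [
    {"a", "b"},
    {"b", "d"},
    {"d"},
    {"a", "c", "d"},
]
print(enforce_max_price(transactions, price, max_price=5))
```

Here only "a" and "c" survive, so the miner works on [{'a'}, {'a', 'c'}] and never sees the other items.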

Convertible Constraints

Here, we sort the items in each transaction in descending or ascending order; the constraint can then be converted into a monotone or anti-monotone one.

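Following the explanation above, we can draw the conclusion: here we select one or several items from T, and at this point the items are ordered.
Note that patterns are always generated following the right order.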
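
A sketch of the conversion, assuming the constraint avg(S.price) >= min_avg as the example. On its own this constraint is neither monotonic nor anti-monotonic, but if items are sorted by descending price and patterns are only grown by appending later (cheaper) items, the average can never go up, so the constraint behaves anti-monotonically along that order. The function name `mine_with_avg_constraint`, the prices and the threshold are hypothetical.

```python
def mine_with_avg_constraint(price, min_avg):
    """Enumerate itemsets with average price >= min_avg, growing patterns
    in descending price order so that violations can be pruned."""
    items = sorted(price, key=price.get, reverse=True)  # most expensive first
    results = []

    def extend(prefix, total, start):
        for i in range(start, len(items)):
            new_total = total + price[items[i]]
            if new_total / (len(prefix) + 1) < min_avg:
                # Remaining items are no more expensive than this one, and
                # appending cheaper items cannot raise the average, so the
                # rest of this branch is skipped.
                break
            candidate = prefix + (items[i],)
            results.append(candidate)
            extend(candidate, new_total, i + 1)

    extend((), 0, 0)
    return results


# Hypothetical prices, not taken from the article.
price = {"a": 40, "b": 10, "c": 20, "d": 30}
found = mine_with_avg_constraint(price, min_avg=25)
print(len(found))  # 9
print(found)
```

With these prices the processing order is a, d, c, b. The branch ('a', 'c') stops as soon as adding 'b' drops the average to about 23.3, and no pattern starting with 'c' or 'b' is ever generated.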

Original article: http://blog.csdn.net/rk2900/article/details/43899177