Computer Science (计算机科学), 2015, Vol. 42, Issue (5): 260-264. doi: 10.11896/j.issn.1002-137X.2015.05.052

• Artificial Intelligence •

Logic-based Frequent Sequential Pattern Mining Algorithm

LIU Duan-yang, FENG Jian and LI Xiao-fen

  1. College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310023
  • Online: 2018-11-14  Published: 2018-11-14
  • Supported by:
    This work was supported by the Natural Science Foundation of Zhejiang Province (LY14F020018) and the National Natural Science Foundation of China (61202204)

Abstract: Traditional Apriori-like frequent sequential pattern mining algorithms are built on the support framework and require a support threshold to be set in advance. Choosing that threshold usually demands in-depth domain knowledge or extensive trial and error, and there is still no good method for setting it. Meanwhile, sequential pattern mining often yields a very large number of results that are hard to understand, which limits their usability. To address these problems, this paper proposes a logic-based frequent sequential pattern mining algorithm, LFSPM, which introduces the idea of logic into frequent sequential pattern mining for the first time and greatly optimizes the result set by filtering it with logical rules. Experiments show that the algorithm handles both the support-setting problem and the low understandability of mining results well.

Key words: Frequent sequential pattern, Data mining, Logic, Support threshold
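
The threshold sensitivity described in the abstract can be illustrated with a small sketch. This is not the LFSPM algorithm, whose details are not given here. It is only a brute-force support count over a toy database, assuming each sequence is a tuple of single items (a simplification of the usual itemset-based definition). The names `is_subsequence`, `support`, `frequent_patterns`, and the toy `database` are hypothetical and chosen for illustration.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so order is enforced.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, s) for s in database) / len(database)


def frequent_patterns(database, min_support, max_len=3):
    """Brute-force enumeration of patterns whose support meets `min_support`.

    Illustrative only: a real Apriori-like miner prunes candidates level by
    level instead of enumerating every subsequence.
    """
    candidates = set()
    for seq in database:
        for k in range(1, max_len + 1):
            # combinations() preserves the original order of items.
            candidates.update(combinations(seq, k))
    result = {}
    for p in candidates:
        s = support(p, database)
        if s >= min_support:
            result[p] = s
    return result


# Hypothetical toy database of four sequences.
database = [
    ("a", "b", "c"),
    ("a", "c"),
    ("a", "b", "d"),
    ("b", "c"),
]

if __name__ == "__main__":
    for threshold in (0.25, 0.5, 0.75):
        found = frequent_patterns(database, threshold)
        print(threshold, len(found))
```

Even on four short sequences, the number of patterns reported changes several-fold with the threshold, which is the practical difficulty that a threshold-free, logic-based filter is meant to avoid.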
