Computer Science ›› 2010, Vol. 37 ›› Issue (6): 186-190.

• Databases and Data Mining •

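Mining Closed Composite Sequential Patterns Efficiently

YAN Lei-ming, SUN Zhi-hui, ZHANG Bai-li, YANG Ming, YAO Pei

  1. (School of Computer Science and Engineering, Southeast University, Nanjing 210096); (School of Computer Science and Technology, Nanjing Normal University, Nanjing 210097); (Nanjing Qingtian Technology Co., Ltd. (南京擎天科技有限公司), Nanjing 210002)
  • Online: 2018-12-01 Published: 2018-12-01
  • Funding:
    This work was supported by the National Natural Science Foundation of China (60873176, 60803061) and the Natural Science Foundation of Jiangsu Province (Bk2008293).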

Abstract: Sequential pattern mining has been an active research area in recent years. Most existing work, however, targets closed frequent itemsets or simple closed sequential patterns and rarely addresses composite sequential patterns, a more complex and practically important class of patterns consisting of several short segments separated by gaps. This paper proposes CloCSP, an algorithm that mines all closed frequent composite sequences with any number of segments of arbitrary lengths. Because checking closure over an exponential number of candidate sequences is difficult, CloCSP adopts a mixed extension strategy, called Mixed Composite, that generates frequent composite sequences, prunes the search space, and checks closure at the same time, without maintaining a candidate set, which reduces both runtime and space usage. Experiments on synthetic and real data show that CloCSP effectively discovers the closed composite sequential patterns hidden in sequence data, especially in dense datasets, and helps reveal more complex sequential patterns.

Key words: Frequent sequences, Closed composite sequences, Composite motif, Data mining
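
The two notions the abstract relies on, the support of a composite pattern and the closure of a pattern, can be illustrated with a small Python sketch. This is not the CloCSP algorithm or its Mixed Composite strategy, which the abstract does not specify. It is a brute-force illustration under stated assumptions: sequences are plain strings, a composite pattern is a tuple of contiguous segments, consecutive segments must be separated by at least `min_gap` symbols, and the names `matches`, `support`, and `is_closed` are hypothetical.

```python
from typing import Iterable, Sequence, Tuple

# Assumption: a composite pattern is a tuple of contiguous segments.
Pattern = Tuple[str, ...]


def matches(pattern: Pattern, sequence: str, min_gap: int = 1) -> bool:
    """Return True if every segment occurs in order in `sequence`,
    with at least `min_gap` symbols between consecutive segments.

    Taking the earliest occurrence of each segment is sufficient here
    because no maximum gap is imposed.
    """
    pos = 0
    for segment in pattern:
        start = sequence.find(segment, pos)
        if start < 0:
            return False
        # The next segment may only begin after the required gap.
        pos = start + len(segment) + min_gap
    return True


def support(pattern: Pattern, database: Sequence[str], min_gap: int = 1) -> int:
    """Number of sequences in the database that contain the pattern."""
    return sum(matches(pattern, seq, min_gap) for seq in database)


def is_closed(pattern: Pattern,
              super_patterns: Iterable[Pattern],
              database: Sequence[str],
              min_gap: int = 1) -> bool:
    """A pattern is closed if no super-pattern has the same support.

    Assumption: the caller supplies the super-patterns to compare against;
    this sketch does not enumerate them.
    """
    own = support(pattern, database, min_gap)
    return all(support(sp, database, min_gap) != own for sp in super_patterns)


# Toy database, invented for illustration only.
db = ["ABxCD", "AByyCD", "ABzD", "CDAB"]

print(support(("AB", "D"), db))                    # 3
print(support(("AB", "CD"), db))                   # 2
print(is_closed(("AB", "C"), [("AB", "CD")], db))  # False: same support as its super-pattern
print(is_closed(("AB", "D"), [("AB", "CD")], db))  # True with respect to this super-pattern
```

In this toy database, `("AB", "C")` and its super-pattern `("AB", "CD")` occur in exactly the same two sequences, so reporting only the closed pattern `("AB", "CD")` loses no support information. Avoiding the explicit comparison against candidate super-patterns shown here is the cost that, according to the abstract, the mixed extension strategy is designed to reduce.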
