Computer Science, 2013, Vol. 40, Issue (12): 41-44.

• Survey •

On Sequential Pattern Mining Algorithm for Web Access

LI Tao-shen, WANG Wei-na and CHEN Qing-feng

  1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China
  • Online: 2018-11-16  Published: 2018-11-16
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (60973074)

Abstract: To address the problems of existing sequential pattern mining algorithms for Web access and of the PrefixSpan algorithm, a projection-position-based sequential pattern mining algorithm for Web access (PWSPM) is proposed. The algorithm uses sequential pattern analysis to discover users' behavior patterns and to predict their access patterns to Web pages. The analytical results can then be used to improve site performance and organizational structure, to raise the quality and efficiency with which users find information, and to deliver personalized information services. Experimental and application results show that the proposed algorithm has better runtime performance and extensibility. It is suitable for Web log mining and can be used to build intelligent Web sites and to provide personalized information services.

Key words: Web access, Sequential pattern, Data mining, PrefixSpan algorithm, Web log mining
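
The abstract does not spell out how PWSPM works, so the following is only a minimal sketch of the general idea it builds on: PrefixSpan-style pattern growth in which each projected database is represented by projection positions (session index, offset) rather than by copied suffixes. The function name `mine_sequences`, the session format (one list of page identifiers per user session), and the toy data are assumptions made for illustration; this is not the authors' algorithm.

```python
from collections import defaultdict


def mine_sequences(sessions, min_support):
    """Return {pattern tuple: support} for every frequent page subsequence.

    sessions: list of sessions, each an ordered list of page identifiers.
    min_support: minimum number of sessions that must contain a pattern.
    """
    results = {}

    def grow(prefix, projections):
        # Each projection is a (session index, start offset) pair. The
        # projected sequence is sessions[sid][start:], which is never copied.
        occurrences = defaultdict(list)
        for sid, start in projections:
            session = sessions[sid]
            seen = set()
            for pos in range(start, len(session)):
                page = session[pos]
                if page not in seen:
                    # Count each session once per page, and project just
                    # past the first occurrence (the longest suffix).
                    seen.add(page)
                    occurrences[page].append((sid, pos + 1))
        for page, next_projections in occurrences.items():
            if len(next_projections) >= min_support:
                pattern = prefix + (page,)
                results[pattern] = len(next_projections)
                grow(pattern, next_projections)

    grow((), [(sid, 0) for sid in range(len(sessions))])
    return results


if __name__ == "__main__":
    # Hypothetical sessions extracted from a Web log (illustrative data).
    toy_sessions = [
        ["a", "b", "c"],
        ["a", "c"],
        ["b", "a", "c"],
        ["a", "b"],
    ]
    for pattern, support in mine_sequences(toy_sessions, 3).items():
        print(pattern, support)
```

With the toy sessions above and a minimum support of 3, the only frequent pattern longer than one page is ("a", "c"), since page "c" is visited after page "a" in three of the four sessions.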

