Computer Science ›› 2016, Vol. 43 ›› Issue (1): 25-29.doi: 10.11896/j.issn.1002-137X.2016.01.006

Previous Articles     Next Articles

Research on Improved PrefixSpan Algorithm and its Application in Web User Behavior Patterns Mining

JI Hao-bo and WANG Jun-hong   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Sequential pattern mining is mining relative time or other mode of high frequency from sequence databases.Based on the PrefixSpan algorithm,the paper proposed an improved adaptive algorithm to improve the problem of expensive construction projection database and low scanning efficiency,through the methods such as using sequence expanding instead of item expanding,abandoning the project databases that the number of sequence is less than min_support and so on.Then the new method was used for Web user behavior pattern mining to analyze and research log records law.Experimental results show that,compared to the PrefixSpan algorithm,the improved algorithm has been improved in the algorithm efficiency.

Key words: Sequence pattern mining,Web log mining,PrefixSpan algorithm

[1] Agrawal R,Srikant R.Mining sequential patterns[C]∥Procee-dings of the Eleventh International Conference on Data Engineering(ICDE’95).Washington,DC:IEEE Computer Society,1995:3-14
[2] Srikant R,Agrawal R.Mining sequential patterns:generaliza-tions and performance improvements[C]∥Proceedings of the 5th International Conference on Extending Database Technology:Advances in Database Technology(EDBT’96).Berlin:Springer-Verlag,1996:3-17
[3] Han J,Pei J,Mortazviasl B,et al.FreeSpan:Frequent pattern projected sequential pattern mining[C]∥Proceedings of the 6th ACM-SIGKDD International Conference on Knowledge Disco-very and Data Mining.New York:ACM Press,2000:355-359
[4] Lu J P,Liu Y B,Ni W W,et al.Incremental updating algorithm for mining sequential patterns based on the projection database[J].Journal of Southeast University,2007,6(3):457-462(in Chinese) 陆介平,刘月波,倪巍伟,等.基于投影数据库的序列模式挖掘增量式更新算法[J].东南大学学报,2007,6(3):457-462
[5] Zhang K,Zhu Y Y.Algorithm of sequence pattern mining without repeated database scan projection[J].Journal of Computer Research and Development,2007,4(1):126-132(in Chinese) 张坤,朱扬勇.无重复投影数据库扫描的序列模式挖掘算法[J].计算机研究与发展,2007,44(1):126-132
[6] Han G W.Research on sequential pattern algorithm for data streams based on prefix sequence tree[D].Qinhuangdao:Yanshan University,2013(in Chinese) 韩高伟.基于前缀序列树的数据流序列模式算法研究[D].秦皇岛:燕山大学,2013
[7] Saputra D,Rambli D R A,Foong O M.Sequential Pattern Mi-ning using PrefixSpan with Pseudoprojection and Separator Database[C]∥International Symposium on Infermation Technology,2008.Kuala Lumpur,Malaysia,2008:1-7
[8] Dai Y X,Lian Y F,Wang H.System security and intrusion detection[M].Beijing:Tsinghua University Press,2002(in Chinese)戴英霞,连一峰,王航.系统安全与入侵检测[M].北京:清华大学出版社,2002
[9] Guo L,Xiang X,Shi Y C.Use Web usage mining to assist online E-Learning assessment[C]∥IEEE International Conference on Advanced Learning Technologies.2004:912-913
[10] Xing G,Shen J.Efficient data mining for web navigation patterns [J].Information and Software Technology,2004,46(1):55-63
[11] Xu H Q,Wang Y C.The path Webpage pre fetching model based on analyzing user access[J].Journal of Software,2003,4(6):1142-1147(in Chinese)许欢庆,王永成.基于用户访问路径分析的网页预取模型[J].软件学报,2003,14(6):1142-1147
[12] Yang Z L,Wang Y T,Kitsuregawa M M.An Effective system for mining web log[C]∥ Proceedings of the 8th Asia-Pacific Web Conference on Frontiers of WWW Research ad Development(APWeb’06).Harbin,China,2006:40-52
[13] Mabroukeh N R,Ezeife C I.A Taxonomy of Sequential Pattern Mining Algorithms[J].Journal of ACM Computing Surveys,2010,43(1):1-41
[14] Hu Y H,Wu F,Liao Y J.An efficient tree-based algorithm for mining sequential patterns with multiple minimum supports[J].Journal of Systems and Software,2013,6(5):1224-1238
[15] Hu Y H,Huang C K T,Kao Y H.Knowledge discovery ofweighted RFM sequential patterns from customer sequence databases[J].Journal of Systems and Software,2013,6(3):779-788
[16] Salehi M,IKamalabadi N.Hybrid recommendation approach for learning material based on sequential pattern of the accessed material and the learner’s preference tree[J].Knowledge-Based Systems,2013,8:57-69
[17] Hu Y H,Tsai C F,Tai C T,et al.A novel approach for mining cyclically repeated patterns with multiple minimum supports[J].Applied Soft Computing,2015,8:90-99
[18] Zhang B B,Lin C W,Gan W S,et al.Maintaining the discovered sequential patterns for sequence insertion in dynamic databases[J].Engineering Applications of Artificial Intelligence,2014,5(10):131-142

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!