Computer Science ›› 2016, Vol. 43 ›› Issue (2): 86-88. doi: 10.11896/j.issn.1002-137X.2016.02.019

• 2015 China Computer Federation (CCF) Conference on Artificial Intelligence •

Improved WPR Algorithm Based on Referenced Frequency in Recent Search Cycle

WANG Xu-yang and REN Guo-sheng

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730000, China
  • Online: 2018-12-01 Published: 2018-12-01

Abstract: The WPR (Weighted PageRank) algorithm suffers from topic drift and a bias towards old pages in Web search. To address both problems, we combine two factors, the topic features of a Web page and its referenced frequency in the most recent search cycle, and propose an improved algorithm, WTFPR (Weighted Topic Frequency PageRank). Through content analysis, the algorithm uses an improved TF-IDF algorithm to measure page relevance, which reduces topic drift. It uses each page's referenced frequency in the most recent search cycle to raise the PR values of pages that are newer and of higher value, which reduces the bias towards old pages. Simulation results show that the improved algorithm obtains better results than WPR.

Key words: Topic features, Referenced frequency, Bias towards old pages, Search cycle, Topic drift
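
The abstract names the two signals that WTFPR adds to WPR but does not state the formula. The Python sketch below is therefore only an illustration of how a topic-relevance score and a recent-cycle referenced frequency could be folded into the standard WPR update; it is not the authors' method. The function name `wtfpr_sketch`, the mixing weights `alpha` and `beta`, the linear `factor` formula, and the uniform fallback for pages whose targets have no out-links are all assumptions made for this example. The `relevance` input is assumed to be a value in [0, 1], for instance a TF-IDF cosine similarity computed elsewhere.

```python
def wtfpr_sketch(links, relevance, recent_refs, d=0.85, alpha=0.3, beta=0.3, iters=50):
    """Illustrative WPR variant with topic and recency factors (hypothetical).

    links:       dict mapping a page to the list of pages it links to
    relevance:   dict mapping a page to a topic relevance in [0, 1]
    recent_refs: dict mapping a page to its reference count in the latest search cycle
    """
    pages = sorted(set(links) | {t for outs in links.values() for t in outs})
    # Drop duplicate links while keeping order.
    out_links = {p: list(dict.fromkeys(links.get(p, []))) for p in pages}

    in_deg = {p: 0 for p in pages}
    for src in pages:
        for dst in out_links[src]:
            in_deg[dst] += 1
    out_deg = {p: len(out_links[p]) for p in pages}

    max_refs = max((recent_refs.get(p, 0) for p in pages), default=0)

    def factor(p):
        # Assumed combination: a value in [0, 1] that grows with topic
        # relevance and with the normalized recent referenced frequency.
        rel = relevance.get(p, 0.0)
        freq = recent_refs.get(p, 0) / max_refs if max_refs else 0.0
        return (1.0 - alpha - beta) + alpha * rel + beta * freq

    score = {p: 1.0 for p in pages}
    for _ in range(iters):
        new = {p: 1.0 - d for p in pages}
        for src in pages:
            targets = out_links[src]
            if not targets:
                continue
            sum_in = sum(in_deg[t] for t in targets)
            sum_out = sum(out_deg[t] for t in targets)
            for dst in targets:
                # Standard WPR link weights: share of in-links and out-links
                # that dst holds among all pages src points to.
                w_in = in_deg[dst] / sum_in
                # Assumption: uniform weight if no target has out-links.
                w_out = out_deg[dst] / sum_out if sum_out else 1.0 / len(targets)
                new[dst] += d * score[src] * w_in * w_out * factor(dst)
        score = new
    return score


# B and C have identical link structure; only C was referenced recently.
links = {"A": ["B", "C"], "B": ["A"], "C": ["A"]}
topic = {"A": 1.0, "B": 1.0, "C": 1.0}
scores = wtfpr_sketch(links, topic, recent_refs={"C": 10})
print(scores["C"] > scores["B"])
```

Because `factor` never exceeds 1, each iteration shrinks differences between successive score vectors by at least the damping factor `d`, so the fixed iteration count is enough for a small graph like this one. A different combination rule would need its own convergence check.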

