Computer Science ›› 2016, Vol. 43 ›› Issue (3): 275-278. doi: 10.11896/j.issn.1002-137X.2016.03.051

• Artificial Intelligence •

Improved PageRank Algorithm Based on User Interest and Topic

WANG Chong, JI Xian-hui

  1. School of Computer Science and Engineering, Guilin University of Electronic Technology, Guilin 541004
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Guangxi Key Laboratory of Trusted Software project (PF14071X), the Guilin University of Electronic Technology key teaching reform project (ZJW07303), and the Guangxi teaching reform engineering project (2015JGA207).

Abstract: The traditional PageRank algorithm suffers from topic drift and ignores user interest. To address these shortcomings, an improved PageRank algorithm based on user interest and topic relevance (ITPR) is proposed. To improve search quality, a user interest factor is built from page browsing time together with page length, and the rise or fall of user interest is predicted by a linear fit of monthly click counts. A topic relevance factor derived from page content is also introduced, and the two factors jointly adjust each page's PR value so that it is distributed more reasonably. Simulation results show that, under the same experimental environment, the improved algorithm achieves better page ranking quality, precision, and user search satisfaction.

Key words: PageRank, User interest, Linear fitting, Interest prediction, Topic relevance
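
The abstract names three ingredients: a linear fit of monthly click counts, a baseline PR value, and correction factors for user interest and topic relevance. The sketch below is only an illustration of how such pieces could fit together, not the ITPR formulas from the paper, which this page does not give. The function names, the damping factor 0.85, the uniform handling of dangling pages, and in particular the multiplicative rule in `adjusted_scores` are all assumptions made for the example.

```python
from typing import Dict, List


def click_trend_slope(monthly_clicks: List[float]) -> float:
    """Least-squares slope of clicks against month index 0, 1, 2, ...

    A positive slope suggests rising interest, a negative one falling interest.
    """
    n = len(monthly_clicks)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(monthly_clicks) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(monthly_clicks))
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    return sxy / sxx


def pagerank(links: Dict[str, List[str]], d: float = 0.85, iters: int = 50) -> Dict[str, float]:
    """Standard PageRank by power iteration.

    Every link target must also appear as a key of `links`.
    Pages with no out-links spread their rank uniformly (an assumption).
    """
    pages = list(links)
    n = len(pages)
    pr = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        dangling = sum(pr[p] for p in pages if not links[p])
        new = {p: (1 - d) / n + d * dangling / n for p in pages}
        for p in pages:
            for q in links[p]:
                new[q] += d * pr[p] / len(links[p])
        pr = new
    return pr


def adjusted_scores(pr: Dict[str, float],
                    interest: Dict[str, float],
                    relevance: Dict[str, float]) -> Dict[str, float]:
    """Hypothetical correction: scale PR by (1 + interest) * (1 + relevance), then renormalise."""
    raw = {p: pr[p] * (1 + interest.get(p, 0.0)) * (1 + relevance.get(p, 0.0)) for p in pr}
    total = sum(raw.values())
    return {p: v / total for p, v in raw.items()}


if __name__ == "__main__":
    graph = {"a": ["b"], "b": ["a"], "c": ["a"]}
    base = pagerank(graph)
    # Interest and relevance values here are made up for illustration.
    print(adjusted_scores(base, {"c": 0.5}, {"c": 1.0}))
    print(click_trend_slope([10, 20, 30, 40]))
```

Renormalising after the correction keeps the scores comparable to the baseline PR values. An additive correction would be an equally plausible choice under these assumptions.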

