Graph-based Web Entity Ranking Method (一种基于图结构的Web实体排序方法)

Computer Science (计算机科学), 2014, Vol. 41, Issue (5): 219-222. doi: 10.11896/j.issn.1002-137X.2014.05.045

XU Yao (徐曜), ZHAO Zheng-wen (赵政文), CHEN Qun (陈群), LIU Hai-long (刘海龙), DU Jing (杜晶), HU Jia-qi (胡嘉琪), LI Zhan-huai (李战怀)

School of Computer Science, Northwestern Polytechnical University, Xi'an 710129, China (all authors)

Online: 2018-11-14 Published: 2018-11-14

Funding:
This work was supported by the National 973 Program of China (2012CB316203), the Key Program of the National Natural Science Foundation of China (61033007), the National 863 Program of China (2012AA011004), and the Northwestern Polytechnical University Graduate Seed Fund (Z2013125, Z2013126).

Abstract

Abstract: Users today often want a search engine to return the entity they are looking for directly. Traditional search engines, however, return only a set of documents containing the query keywords rather than the answer itself. Moreover, most existing entity ranking techniques combine weighted factors related to the entities, and training those weights requires considerable prior knowledge. This paper extracts multiple candidate entities from the documents (snippets) returned by a search engine and proposes a graph-based algorithm, PERA (Probabilistic Entity Ranking Algorithm), which applies the idea of random walks to rank the candidates without requiring such prior knowledge. Experiments show that the correct entities of every type tested receive high ranking scores.

Key words: Web, Entity ranking, Search engine, Graph
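
The abstract does not spell out how PERA builds its graph or defines its transition probabilities, so the following is only a minimal sketch of the general idea of ranking candidates by a random walk, not the authors' algorithm. It assumes a hypothetical co-occurrence graph (two candidates are linked each time they appear in the same snippet) and a PageRank-style walk with a damping factor of 0.85. The function names, the edge weighting, the damping value, and the sample entities are all illustrative assumptions.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence_graph(snippets):
    """Build an undirected weighted graph over candidate entities.

    Assumption (not from the paper): an edge's weight is the number of
    snippets in which the two candidates appear together.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for candidates in snippets:
        unique = sorted(set(candidates))
        for name in unique:
            graph[name]  # register the node even if it has no neighbours
        for a, b in combinations(unique, 2):
            graph[a][b] += 1.0
            graph[b][a] += 1.0
    return graph


def random_walk_scores(graph, damping=0.85, iterations=100, tol=1e-10):
    """Approximate the stationary distribution of a damped random walk."""
    nodes = sorted(graph)
    n = len(nodes)
    if n == 0:
        return {}
    scores = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Probability mass sitting on nodes without edges is spread uniformly.
        dangling = sum(scores[v] for v in nodes if not graph[v])
        base = (1.0 - damping) / n + damping * dangling / n
        updated = {v: base for v in nodes}
        for u in nodes:
            total = sum(graph[u].values())
            if total == 0:
                continue
            # Move to a neighbour with probability proportional to edge weight.
            for v, weight in graph[u].items():
                updated[v] += damping * scores[u] * weight / total
        delta = sum(abs(updated[v] - scores[v]) for v in nodes)
        scores = updated
        if delta < tol:
            break
    return scores


# Hypothetical candidate entities extracted from four snippets.
snippets = [["Paris", "Lyon"], ["Paris", "Marseille"], ["Paris", "Lyon"], ["Paris"]]
scores = random_walk_scores(build_cooccurrence_graph(snippets))
for entity in sorted(scores, key=scores.get, reverse=True):
    print(entity, round(scores[entity], 4))
```

In this sketch the walk needs no trained weights: the ranking follows from the graph structure alone, which is the property the abstract emphasizes. A real system would also need an entity extraction step to produce the candidate lists.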

XU Yao, ZHAO Zheng-wen, CHEN Qun, LIU Hai-long, DU Jing, HU Jia-qi and LI Zhan-huai. Graph-based Web Entity Ranking Method[J]. Computer Science, 2014, 41(5): 219-222. https://doi.org/10.11896/j.issn.1002-137X.2014.05.045
