Computer Science, 2014, Vol. 41, Issue (Z11): 411-418.


Entities Expansion and Attribute Values Discovery Method Based on Web

LI Gui, CHEN Shao-gang, HAN Zi-yang, LI Zheng-yu, SUN Ping and SUN Huan-liang

Online: 2018-11-14 Published: 2018-11-14

Abstract: Entity expansion and attribute value discovery are important research topics in the field of Web data extraction and integration. In this paper, Web tables and domain entities are modeled as a bipartite graph. Guided by a quality score, the expanded entity set is updated iteratively until its quality score reaches a local maximum and the set no longer changes. To collect structured numerical or discrete attributes of the entities, a method based on integer linear programming (ILP) is presented to discover their attribute values. Experimental results show that the proposed approach outperforms previous techniques in terms of both precision and recall.

Key words: Entity expansion, Attribute value filling, Integer linear programming
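
The iterative expansion step described in the abstract can be illustrated with a minimal sketch. The abstract does not define the quality score, so everything below is an assumption made for illustration: the names `quality` and `expand`, the threshold `tau`, the scoring rule (an entity's support is the average fraction of already-selected entities in the tables that contain it), and the toy tables. It is a generic greedy hill climb over a table-entity bipartite graph, not the authors' algorithm.

```python
from fractions import Fraction


def quality(entity_set, tables, tau=Fraction(1, 2)):
    """Hypothetical quality score of an entity set.

    Each entity contributes (support - tau), where support is the mean,
    over the tables containing the entity, of the fraction of that
    table's members that are in entity_set.
    """
    total = Fraction(0)
    for entity in entity_set:
        weights = [
            Fraction(len(members & entity_set), len(members))
            for members in tables.values()
            if entity in members
        ]
        if weights:
            total += sum(weights, Fraction(0)) / len(weights) - tau
        else:
            total -= tau  # entity appears in no table: no support at all
    return total


def expand(seeds, tables):
    """Greedy hill climbing: apply the best single add/remove of a
    non-seed entity until no single change raises the quality score."""
    current = set(seeds)
    candidates = set().union(*tables.values()) - set(seeds)
    best_score = quality(current, tables)
    while True:
        best_move = None
        for entity in sorted(candidates):
            trial = current ^ {entity}  # toggle membership of one entity
            score = quality(trial, tables)
            if score > best_score:
                best_score, best_move = score, trial
        if best_move is None:  # local maximum: the set stops changing
            return current, best_score
        current = best_move


# Toy bipartite graph: table id -> entities appearing in that table.
tables = {
    "t1": {"Beijing", "Shanghai", "Guangzhou"},
    "t2": {"Shanghai", "Guangzhou", "Shenzhen"},
    "t3": {"Shenzhen", "Apple", "Banana", "Cherry", "Mango", "Peach"},
}
seeds = {"Beijing", "Shanghai"}
expanded, score = expand(seeds, tables)
print(sorted(expanded), score)
```

Exact fractions are used so that ties are never mistaken for improvements. On this toy input the climb adds the two cities that share tables with the seeds and stops before any entity from the mostly unrelated table `t3` is pulled in, since no single addition or removal raises the score further.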

