Computer Science, 2018, Vol. 45, Issue (1): 67-72. doi: 10.11896/j.issn.1002-137X.2018.01.010


Entity Hyponymy Acquisition and Organization Combining Word Embedding and Bootstrapping in Special Domain

MA Xiao-jun, GUO Jian-yi, XIAN Yan-tuan, MAO Cun-li, YAN Xin and YU Zheng-tao   

  • Online: 2018-01-15 Published: 2018-11-13

Abstract: Entity hyponymy is an important semantic relation for building domain knowledge graphs, yet traditional hyponymy extraction methods do not consider how the extracted hierarchical relations are organized. This paper proposes a method for extracting and organizing entity hyponymy in a specific domain that combines word embedding with bootstrapping. First, a tourism corpus is selected as the seed corpus, and the hyponymy patterns it contains are clustered by word embedding similarity. The high-confidence patterns are then selected and used to identify hyponymy in the unlabeled corpus. Next, the high-confidence relation instances obtained in this way are added to the seed set, and the iteration repeats until all relation instances have been obtained. Finally, a mapping learning method is applied to construct the hierarchical relations of domain entities, based on the characteristics of those hierarchical relations and the vector deviation of the entity hyponymy pairs. The experimental results show that the proposed method improves the F-value by 10% compared with the traditional method.

Key words: Hyponymy relation, Relation extraction, Bootstrapping method, Word embedding, Projection learning, Hierarchical relation organization
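
The bootstrapping loop summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `min_conf` threshold, and the toy sentences are all hypothetical, entities are assumed to be single tokens, and pattern confidence is approximated as the share of a pattern's matches that are already known pairs. The word-embedding clustering of patterns and the projection-learning step are omitted.

```python
import re

def find_patterns(sentences, pairs):
    """Collect the text found between a known hyponym and its hypernym."""
    patterns = set()
    for s in sentences:
        for hypo, hyper in pairs:
            m = re.search(re.escape(hypo) + r"\s+(.+?)\s+" + re.escape(hyper), s)
            if m:
                patterns.add(m.group(1))
    return patterns

def apply_pattern(sentences, pattern):
    """Return every (hyponym, hypernym) pair of single tokens joined by the pattern."""
    regex = re.compile(r"(\w+)\s+" + re.escape(pattern) + r"\s+(\w+)")
    found = set()
    for s in sentences:
        found.update(regex.findall(s))
    return found

def bootstrap(sentences, seeds, min_conf=0.5, max_iter=10):
    """Grow the seed set until no high-confidence pattern yields a new pair."""
    known = set(seeds)
    for _ in range(max_iter):
        new_pairs = set()
        for pattern in find_patterns(sentences, known):
            matches = apply_pattern(sentences, pattern)
            # Assumed confidence measure: fraction of matches already known.
            conf = len(matches & known) / len(matches) if matches else 0.0
            if conf >= min_conf:
                new_pairs |= matches - known
        if not new_pairs:
            break  # converged: nothing new was extracted
        known |= new_pairs
    return known

# Toy, made-up corpus and seeds for illustration only.
sentences = [
    "Lijiang is a city in Yunnan",
    "Dali is a city too",
    "Kunming is a city as well",
]
seeds = {("Lijiang", "city"), ("Dali", "city")}
print(sorted(bootstrap(sentences, seeds)))
# [('Dali', 'city'), ('Kunming', 'city'), ('Lijiang', 'city')]
```

In this toy run the pattern "is a" is learned from the two seed pairs, passes the confidence threshold, and extracts one additional pair before the loop stops.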

