Computer Science, 2017, Vol. 44, Issue (1): 95-99, 127. doi: 10.11896/j.issn.1002-137X.2017.01.018


Research on Sense Guessing of Chinese Unknown Words Based on Knowledge Graph

ZHU Feng, GU Min, ZHENG Hao, GU Yan-hui, ZHOU Jun-sheng and QU Wei-guang   

  • Online: 2018-11-13 Published: 2018-11-13

Abstract: Semantic research based on traditional corpora has many limitations, such as infrequent updates and language dependence. To tackle these issues, this paper proposes sense guessing of Chinese unknown words based on a knowledge graph (KG). A KG is a semantic network containing entities, concepts and semantic relations. It holds a huge number of entities and relations, and new ones are easy to add, which makes it possible to address the infrequent-update problem. After introducing the structure of the knowledge graph, how the data are obtained and how they are processed, the paper explores KG-based sense guessing of Chinese unknown words. Finally, Baidu Baike, which has the most abundant Chinese-related data, is used as the corpus alongside traditional ones in experiments designed around one specific sense guessing model. The paper also compares the experimental results obtained with different knowledge bases and proposes some directions for improvement.

Key words: Sense guessing of Chinese unknown words, Semantic annotation, Knowledge graph

