Computer Science ›› 2017, Vol. 44 ›› Issue (10): 228-233. doi: 10.11896/j.issn.1002-137X.2017.10.041

• Artificial Intelligence •

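Anaphoricity Determination of Uyghur Personal Pronouns Based on Deep Belief Network

QIN Yue (秦越), YU Long (禹龙), TIAN Sheng-wei (田生伟), ZHAO Jian-guo (赵建国), FENG Guan-jun (冯冠军)

  1. College of Information Science and Engineering, Xinjiang University, Urumqi 830046; Network Center, Xinjiang University, Urumqi 830046; School of Software, Xinjiang University, Urumqi 830046; College of Humanities, Xinjiang University, Urumqi 830046; College of Humanities, Xinjiang University, Urumqi 830046
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61563051, 4, 61262064), the Key Program of the National Natural Science Foundation of China (61331011), and the Science and Technology Talent Training Project of Xinjiang Uygur Autonomous Region (QN2016YX0051).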

Abstract: Existing work on coreference resolution for Uyghur personal pronouns skips anaphoricity determination and therefore introduces noise. To address this, we propose a method for anaphoricity determination of Uyghur personal pronouns based on deep belief networks (DBN). Based on an analysis of the grammatical features and linguistic rules of Uyghur personal pronouns, we compile a feature set of ten features for anaphoricity determination. The method first trains each restricted Boltzmann machine (RBM) layer greedily, layer by layer, so that the feature vector is mapped into different feature spaces while retaining as much feature information as possible. A BP network is then placed at the last layer to classify the feature vectors output by the RBMs, and the whole network is trained and fine-tuned in a supervised manner. Experimental results show that the method identifies anaphoric Uyghur personal pronouns with an accuracy of 95.17%, 9% higher than the SVM algorithm, which verifies its effectiveness and feasibility.

Key words: Deep belief network (DBN), Anaphoricity determination, Uyghur language, Feature extraction
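
The two training stages named in the abstract (greedy layer-wise RBM pre-training, then supervised fine-tuning with a BP output layer) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The hidden-layer sizes, learning rates, epoch counts, the use of one-step contrastive divergence (CD-1), the binary encoding of the ten features, and the synthetic data are all assumptions made only for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli RBM trained with one-step contrastive divergence (CD-1)."""
    def __init__(self, n_visible, n_hidden, rng):
        self.W = rng.normal(0.0, 0.1, size=(n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)
        self.b_h = np.zeros(n_hidden)
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def train(self, data, epochs=20, lr=0.1):
        n = data.shape[0]
        for _ in range(epochs):
            h0 = self.hidden_probs(data)                    # positive phase
            h_sample = (self.rng.random(h0.shape) < h0).astype(float)
            v1 = sigmoid(h_sample @ self.W.T + self.b_v)    # reconstruction
            h1 = self.hidden_probs(v1)                      # negative phase
            self.W += lr * (data.T @ h0 - v1.T @ h1) / n
            self.b_v += lr * (data - v1).mean(axis=0)
            self.b_h += lr * (h0 - h1).mean(axis=0)

def pretrain(X, layer_sizes, rng):
    """Greedy layer-wise pre-training: each RBM learns on the previous output."""
    rbms, inp = [], X
    for n_hidden in layer_sizes:
        rbm = RBM(inp.shape[1], n_hidden, rng)
        rbm.train(inp)
        rbms.append(rbm)
        inp = rbm.hidden_probs(inp)
    return rbms

def fine_tune(X, y, rbms, rng, epochs=200, lr=0.5):
    """Add a sigmoid output unit and fine-tune all layers by backpropagation."""
    Ws = [r.W.copy() for r in rbms]
    bs = [r.b_h.copy() for r in rbms]
    Ws.append(rng.normal(0.0, 0.1, size=(Ws[-1].shape[1], 1)))
    bs.append(np.zeros(1))
    t = y.reshape(-1, 1).astype(float)
    for _ in range(epochs):
        acts = [X]
        for W, b in zip(Ws, bs):
            acts.append(sigmoid(acts[-1] @ W + b))
        # Gradient of mean cross-entropy w.r.t. the output pre-activation.
        delta = (acts[-1] - t) / len(X)
        for i in reversed(range(len(Ws))):
            grad_W = acts[i].T @ delta
            grad_b = delta.sum(axis=0)
            if i > 0:  # propagate the error before updating this layer
                delta = (delta @ Ws[i].T) * acts[i] * (1.0 - acts[i])
            Ws[i] -= lr * grad_W
            bs[i] -= lr * grad_b
    return Ws, bs

def predict(X, Ws, bs):
    a = X
    for W, b in zip(Ws, bs):
        a = sigmoid(a @ W + b)
    return (a[:, 0] > 0.5).astype(int)

# Synthetic stand-in for a 10-feature binary data set (not the paper's corpus).
demo_rng = np.random.default_rng(0)
X_demo = (demo_rng.random((200, 10)) < 0.5).astype(float)
y_demo = X_demo[:, 0].astype(int)
demo_rbms = pretrain(X_demo, [8, 6], demo_rng)
demo_Ws, demo_bs = fine_tune(X_demo, y_demo, demo_rbms, demo_rng)
demo_pred = predict(X_demo, demo_Ws, demo_bs)
```

Here `pretrain` is unsupervised and never sees the labels; only `fine_tune` uses them, which mirrors the division of labor the abstract describes between the RBM layers and the BP layer.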

