Computer Science (计算机科学) ›› 2018, Vol. 45 ›› Issue (4): 66-70. doi: 10.11896/j.issn.1002-137X.2018.04.009
• 2017 National Conference on Theoretical Computer Science •
司念文,王衡军,李伟,单义栋,谢鹏程
SI Nian-wen, WANG Heng-jun, LI Wei, SHAN Yi-dong and XIE Peng-cheng
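Abstract: To address the dependence on hand-crafted features in traditional statistical part-of-speech (POS) tagging models, this paper proposes an effective Chinese POS tagging model based on an attention-augmented long short-term memory (LSTM) network. The model takes basic distributed word vectors as its input units and uses a bidirectional LSTM network to extract rich contextual feature representations of words. An attention hidden layer is added to the network: the attention mechanism assigns probability weights to the hidden states at different time steps so that the hidden layer focuses more on important features, which improves the quality of the hidden-layer vectors. During decoding, a state transition probability matrix is introduced to further improve tagging accuracy. Experimental results on the People's Daily corpus and the Chinese Penn Treebank CTB5 corpus show that the model performs Chinese POS tagging effectively: its accuracy is higher than that of traditional POS tagging methods such as conditional random fields, and it is very close to that of current strong POS tagging models.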
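The two mechanisms named in the abstract, attention weights over hidden states and decoding with a transition matrix, can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not give the attention scoring function, so dot-product scoring against a hypothetical `query` vector is assumed here, and the per-position tag scores (`emissions`) and `transitions` values are toy numbers standing in for network outputs. The decoder is standard Viterbi search over additive scores.

```python
import math


def softmax(scores):
    """Turn raw scores into probability weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(hidden_states, query):
    """Assumed dot-product attention over per-time-step hidden states.

    Returns the probability weight of each time step and the weighted
    sum of the hidden states (the context vector).
    """
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


def viterbi(emissions, transitions):
    """Find the highest-scoring tag sequence.

    emissions[t][j]   : score of tag j at position t (toy stand-in for network output)
    transitions[i][j] : score of moving from tag i to tag j
    """
    n_tags = len(transitions)
    best = list(emissions[0])  # best score of any path ending in each tag
    backptrs = []
    for t in range(1, len(emissions)):
        new_best, ptrs = [], []
        for j in range(n_tags):
            prev = max(range(n_tags), key=lambda i: best[i] + transitions[i][j])
            ptrs.append(prev)
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
        best = new_best
        backptrs.append(ptrs)
    last = max(range(n_tags), key=lambda j: best[j])
    path = [last]
    for ptrs in reversed(backptrs):  # follow back-pointers to recover the path
        path.append(ptrs[path[-1]])
    path.reverse()
    return path, best[last]


# Toy example with two tags and three positions.
emissions = [[2.0, 1.0], [1.0, 1.2], [0.5, 2.0]]
transitions = [[0.5, -1.0], [-1.0, 0.5]]
path, score = viterbi(emissions, transitions)
weights, context = attend([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 1.0])
```

In this toy example, picking the best tag independently at each position gives the sequence `[0, 1, 1]`, while the transition scores make `[1, 1, 1]` the best sequence overall. That is the kind of correction a transition matrix contributes at decoding time.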