Computer Science ›› 2019, Vol. 46 ›› Issue (9): 237-242. doi: 10.11896/j.issn.1002-137X.2019.09.035

• Artificial Intelligence •

Chinese Named Entity Recognition Method Based on BGRU-CRF

SHI Chun-dan, QIN Lin

  1. (School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China)
  • Received: 2018-08-13; Online: 2019-09-15; Published: 2019-09-02
  • Corresponding author: QIN Lin (born 1980), male, M.S., lecturer; his main research interest is machine learning for the process industry. E-mail: ql@njtech.edu.cn
  • About the author: SHI Chun-dan (born 1994), female, M.S. candidate; her main research interests are machine learning and deep learning.



Abstract: Traditional named entity recognition methods rely heavily on large numbers of hand-crafted features, on domain knowledge and on word-segmentation quality, and they do not make full use of word-order information. To address these problems, a named entity recognition model based on a bidirectional gated recurrent unit (BGRU) network structure was proposed. The model exploits external data: a word-embedding lexicon is pre-trained on large automatically segmented corpora, and the latent word information is integrated into a character-based BGRU-CRF, so that latent words are fully exploited, comprehensive contextual information is extracted, and entity ambiguity is avoided more effectively. In addition, an attention mechanism assigns weights to specific information in the BGRU network, selecting the most relevant characters and words from the sentence; this effectively captures long-distance dependencies of specific words in the text and supports the classification and recognition of named entities. The model explicitly uses the sequence information between words and is not affected by word-segmentation errors. Experimental results show that, compared with traditional sequence-labeling models and neural network models, the proposed model improves the overall F1 value of entity recognition by 3.08% on the MSRA dataset and by 0.16% on the OntoNotes dataset.
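The CRF layer described in the abstract scores whole tag sequences rather than tagging each character independently, which lets it forbid invalid transitions (e.g. an I- tag directly after O in a BIO scheme). The following is a minimal, self-contained sketch of Viterbi decoding over per-character emission scores (as a BGRU encoder would produce) and learned transition scores; the toy tag set, scores and penalty values are illustrative, not taken from the paper.

```python
def viterbi_decode(emissions, transitions, tags):
    """Return the highest-scoring tag sequence for one sentence.

    emissions:   list of {tag: score} dicts, one per character
                 (in the paper's model these would come from the BGRU).
    transitions: {(prev_tag, tag): score} learned by the CRF layer.
    """
    # scores[t] = best score of any path ending in tag t so far
    scores = {t: emissions[0][t] for t in tags}
    backptr = []
    for emit in emissions[1:]:
        new_scores, ptrs = {}, {}
        for t in tags:
            best_prev = max(tags, key=lambda p: scores[p] + transitions[(p, t)])
            new_scores[t] = scores[best_prev] + transitions[(best_prev, t)] + emit[t]
            ptrs[t] = best_prev
        backptr.append(ptrs)
        scores = new_scores
    # follow back-pointers from the best final tag
    best = max(tags, key=lambda t: scores[t])
    path = [best]
    for ptrs in reversed(backptr):
        path.append(ptrs[path[-1]])
    return list(reversed(path))


tags = ["B", "I", "O"]
transitions = {(p, t): 0.0 for p in tags for t in tags}
transitions[("O", "I")] = -10.0   # discourage an I tag directly after O
emissions = [{"B": 2.0, "I": 0.0, "O": 0.5},
             {"B": 0.0, "I": 1.5, "O": 1.0},
             {"B": 0.0, "I": 0.2, "O": 2.0}]
print(viterbi_decode(emissions, transitions, tags))  # ['B', 'I', 'O']
```

The transition table is what allows the sequence-level constraints; with all transitions at zero the decoder would degenerate to independent per-character argmax.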

Key words: Attention mechanism, Bidirectional gated recurrent unit, Named entity recognition
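The bidirectional GRU encoder underlying the model runs the standard GRU recurrence once left-to-right and once right-to-left over the character sequence and concatenates the two hidden states at each position. The sketch below uses scalar hidden states and placeholder weights (the values in `w` are arbitrary illustrations, not trained parameters) purely to show the gate equations and the forward/backward concatenation; it is a didactic sketch, not the paper's implementation.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gru_step(h_prev, x, w):
    """One GRU step with a scalar state, following the standard equations:
    z = sigmoid(Wz*x + Uz*h),  r = sigmoid(Wr*x + Ur*h),
    h~ = tanh(Wh*x + Uh*(r*h)),  h' = (1 - z)*h + z*h~
    """
    z = sigmoid(w["Wz"] * x + w["Uz"] * h_prev)      # update gate
    r = sigmoid(w["Wr"] * x + w["Ur"] * h_prev)      # reset gate
    h_tilde = math.tanh(w["Wh"] * x + w["Uh"] * (r * h_prev))
    return (1.0 - z) * h_prev + z * h_tilde


def bgru(xs, w):
    """Return (forward_state, backward_state) pairs, one per input position."""
    fwd, h = [], 0.0
    for x in xs:                      # left-to-right pass
        h = gru_step(h, x, w)
        fwd.append(h)
    bwd, h = [], 0.0
    for x in reversed(xs):            # right-to-left pass
        h = gru_step(h, x, w)
        bwd.append(h)
    bwd.reverse()
    return list(zip(fwd, bwd))
```

Because the new state is a convex combination of the previous state and a tanh candidate, every hidden value stays in (-1, 1); the concatenated pair at each position gives the CRF layer both left and right context for that character.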

CLC number: TP391