Computer Science ›› 2019, Vol. 46 ›› Issue (9): 237-242. doi: 10.11896/j.issn.1002-137X.2019.09.035

• Artificial Intelligence •

Chinese Named Entity Recognition Method Based on BGRU-CRF

SHI Chun-dan, QIN Lin

  1. (School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China)
  • Received: 2018-08-13  Online: 2019-09-15  Published: 2019-09-02
  • Corresponding author: QIN Lin (1980-), male, master, lecturer; his main research interest is machine learning for the process industry. E-mail: ql@njtech.edu.cn
  • About the first author: SHI Chun-dan (1994-), female, master's student; her main research interests include machine learning and deep learning.

Abstract: Traditional named entity recognition methods rely heavily on large amounts of hand-crafted features, domain knowledge and word segmentation quality, and they do not make full use of word order information. To address these problems, a named entity recognition model based on a bidirectional gated recurrent unit (BGRU) neural network was proposed. The model exploits external data: a word embedding lexicon is pre-trained on large automatically segmented corpora, and the resulting latent word information is integrated into a character-based BGRU-CRF, which makes full use of latent words, extracts comprehensive contextual information and avoids entity ambiguity more effectively. In addition, an attention mechanism is used to weight specific information within the BGRU network, selecting the most relevant characters and words from each sentence, so that long-distance dependencies of particular words in the text are captured and named entities are recognized and classified. The model explicitly exploits the sequence information between words and is not affected by word segmentation errors. Experimental results show that, compared with traditional sequence labeling models and neural network models, the proposed model improves the overall F1 value of entity recognition by 3.08% on the MSRA dataset and by 0.16% on the OntoNotes dataset.
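
To make the architecture described in the abstract concrete, the following is a minimal, illustrative PyTorch sketch of a character-based BiGRU tagger with a simple global attention layer and a CRF output layer. It is not the authors' implementation: the class name, layer sizes and variable names are assumptions chosen for illustration, the latent-word (lexicon) features described in the paper are omitted, the attention here is a simplified single-summary variant, and the CRF layer is taken from the third-party pytorch-crf package.

import torch
import torch.nn as nn
from torchcrf import CRF  # third-party package: pip install pytorch-crf


class BGRUCRFTagger(nn.Module):
    """Character-level BiGRU encoder + simple global attention + CRF decoder."""

    def __init__(self, char_vocab_size, num_tags, emb_dim=100, hidden_dim=128):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab_size, emb_dim, padding_idx=0)
        # Bidirectional GRU over the character embeddings (the "BGRU" part).
        self.bgru = nn.GRU(emb_dim, hidden_dim, batch_first=True, bidirectional=True)
        # One attention score per character position.
        self.attn_score = nn.Linear(2 * hidden_dim, 1)
        # Emission scores over the tag set, consumed by the CRF layer.
        self.emission = nn.Linear(4 * hidden_dim, num_tags)
        self.crf = CRF(num_tags, batch_first=True)

    def _emissions(self, char_ids, mask):
        h, _ = self.bgru(self.char_emb(char_ids))               # (B, T, 2H)
        scores = self.attn_score(h).squeeze(-1)                  # (B, T)
        scores = scores.masked_fill(~mask, float("-inf"))        # ignore padding
        alpha = torch.softmax(scores, dim=-1).unsqueeze(-1)      # attention weights
        context = (alpha * h).sum(dim=1, keepdim=True)           # sentence summary (B, 1, 2H)
        context = context.expand_as(h)                           # repeat for every position
        return self.emission(torch.cat([h, context], dim=-1))    # (B, T, num_tags)

    def neg_log_likelihood(self, char_ids, tags, mask):
        # CRF training loss: negative log-likelihood of the gold tag sequence.
        return -self.crf(self._emissions(char_ids, mask), tags, mask=mask)

    def decode(self, char_ids, mask):
        # Viterbi decoding of the most likely tag sequence per sentence.
        return self.crf.decode(self._emissions(char_ids, mask), mask=mask)


# Toy usage: a batch of 2 sentences, 5 characters each, 4 hypothetical BIO tags.
model = BGRUCRFTagger(char_vocab_size=3000, num_tags=4)
chars = torch.randint(1, 3000, (2, 5))
gold = torch.randint(0, 4, (2, 5))
mask = torch.ones(2, 5, dtype=torch.bool)
loss = model.neg_log_likelihood(chars, gold, mask)
tags = model.decode(chars, mask)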

Key words: Named entity recognition, Bidirectional gated recurrent unit, Attention mechanism
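
For readers unfamiliar with the bidirectional gated recurrent unit named in the keywords, it is conventionally defined by the following gating equations (this is the standard GRU formulation; the abstract does not indicate any deviation from it, and some presentations swap the roles of $z_t$ and $1 - z_t$):

\begin{aligned}
z_t &= \sigma(W_z x_t + U_z h_{t-1} + b_z) && \text{(update gate)}\\
r_t &= \sigma(W_r x_t + U_r h_{t-1} + b_r) && \text{(reset gate)}\\
\tilde{h}_t &= \tanh\big(W_h x_t + U_h (r_t \odot h_{t-1}) + b_h\big) && \text{(candidate state)}\\
h_t &= (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t && \text{(new hidden state)}
\end{aligned}

where $\sigma$ is the logistic sigmoid, $\odot$ denotes element-wise multiplication, $x_t$ is the character embedding at position $t$ and $h_t$ the hidden state. A bidirectional GRU runs this recurrence left-to-right and right-to-left and concatenates the two states, $h_t = [\overrightarrow{h}_t ; \overleftarrow{h}_t]$, so each character representation carries both left and right context.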

CLC number: TP391