Computer Science ›› 2021, Vol. 48 ›› Issue (10): 91-97. doi: 10.11896/jsjkx.200900015

• Artificial Intelligence

融合BERT和记忆网络的实体识别

陈德, 宋华珠, 张娟, 周泓林   

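  1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070
  • Received: 2020-09-02 Revised: 2020-12-06 Online: 2021-10-15 Published: 2021-10-18
  • Corresponding author: SONG Hua-zhu (shuaz@whut.edu.cn)
  • About author: chende@whut.edu.cn
  • Supported by:
    National Special Scientific and Technological Basic Work of the Ministry of Science and Technology (2014FY110900)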

Entity Recognition Fusing BERT and Memory Networks

CHEN De, SONG Hua-zhu, ZHANG Juan, ZHOU Hong-lin   

  1. School of Computer Science and Technology, Wuhan University of Technology, Wuhan 430070, China
  • Received: 2020-09-02 Revised: 2020-12-06 Online: 2021-10-15 Published: 2021-10-18
  • About author: CHEN De, born in 1992, postgraduate. His main research interests include natural language processing.
    SONG Hua-zhu, born in 1970, Ph.D., associate professor, master's supervisor, is a senior member of China Computer Federation. Her main research interests include artificial intelligence and data mining, and semantic and knowledge abstraction.
  • Supported by:
    National Special Scientific and Technological Basic Work of the Ministry of Science and Technology (2014FY110900).

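Abstract: Entity recognition is a subtask of information extraction. Traditional entity recognition models identify entity types such as person, organization, and location names, but real-world applications must handle many more categories and therefore require fine-grained entity recognition. In addition, traditional models such as BiGRU cannot fully exploit global features over a wider range of context. This paper proposes an entity recognition model based on a memory network and BERT: the memory network module retains features over a wider range, and the pre-trained BERT language model provides better semantic representations. Entity recognition experiments on a cement clinker production corpus show that the proposed method can recognize entities and has advantages over other traditional models. To further validate the proposed model, experiments are conducted on the CLUENER2020 dataset; the results show that optimizing the BiGRU-CRF model with BERT and a memory network module improves entity recognition performance.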

Key words: BERT, BiGRU-CRF, Memory network, Entity recognition

Abstract: Entity recognition is a subtask of information extraction. Traditional entity recognition models identify entities such as person, organization, and location names. In the real world, more types of entities must be considered, so fine-grained entity recognition is needed. At the same time, traditional entity recognition models such as BiGRU cannot make full use of global features over a wider range. This paper presents an entity recognition model based on a memory network and BERT. The pre-trained BERT language model is used for better semantic representation, and the memory network module can memorize a wider range of features. Entity recognition results on a cement clinker production corpus show that this method can recognize entities and has some advantages over other traditional models. To further verify the proposed model, experiments are carried out on the CLUENER2020 dataset. The results show that optimizing the BiGRU-CRF model with BERT and a memory network module can improve entity recognition performance.

Key words: BERT, BiGRU-CRF, Entity recognition, Memory network

CLC Number:

  • TP391
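
The abstract states that the memory network module lets the model draw on features from a wider range than a BiGRU alone. The page does not give the module's equations, so the sketch below is only a generic illustration of how a memory read is commonly done: a query vector is scored against stored keys, the scores are normalized with a softmax, and the stored values are averaged with those weights. The function names (`softmax`, `memory_read`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def memory_read(query, keys, values):
    """Attention-weighted read over memory slots.

    query:  list of floats (e.g. the hidden state of the current token)
    keys:   one key vector per memory slot, same length as query
    values: one value vector per memory slot
    Returns (weights, read_vector).
    """
    # Dot-product similarity between the query and every stored key.
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Weighted average of the stored values.
    read = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, read


# Hypothetical toy memory with three slots (illustrative numbers only).
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
query = [2.0, 0.0]

weights, read = memory_read(query, keys, values)
print(weights, read)
```

In a tagger of the kind the abstract describes, a read vector like this would typically be combined with the token's own representation before the CRF layer; how the authors actually combine them is not stated on this page.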