Computer Science ›› 2020, Vol. 47 ›› Issue (11): 212-219. doi: 10.11896/jsjkx.191000201

• Artificial Intelligence •

LAC-DGLU: Named Entity Recognition Model Based on CNN and Attention Mechanism

ZHAO Feng, HUANG Jian, ZHANG Zhong-jie   

  1. College of Artificial Intelligence, National University of Defense Technology, Changsha 410073, China
  • Received: 2019-10-31; Revised: 2020-03-29; Online: 2020-11-15; Published: 2020-11-05
  • About author: ZHAO Feng, born in 1997, postgraduate. His main research interests include natural language processing.
    HUANG Jian, born in 1971, Ph.D, professor, Ph.D supervisor. Her main research interests include complex system modeling.

Abstract: Word segmentation and word embedding are usually the first steps in Chinese named entity recognition, yet Chinese text contains no explicit delimiters between words. Out-of-vocabulary (OOV) items such as technical terms and rare words severely disturb the computation of word vectors, so the performance of models built on word-level embeddings is highly sensitive to segmentation quality. Moreover, most existing models rely on slow recurrent neural networks, which makes it difficult to meet the demands of industrial applications. To address these problems, this paper constructs a named entity recognition model based on an attention mechanism and convolutional neural networks: LAC-DGLU. To handle the segmentation problem, it proposes a character embedding algorithm based on Local Attention Convolution (LAC), which alleviates the model's dependence on word segmentation quality. To address slow computation, it employs a convolutional neural network with a gating structure, the Dilated Gated Linear Unit (DGLU), to accelerate inference. Experimental results on several datasets show that the model improves the F1 score by 0.2% to 2% over existing mainstream models and runs 1.4 to 1.9 times as fast.
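The paper itself provides no code on this page; the following is a minimal, hypothetical PyTorch sketch of the kind of windowed attention the LAC embedding describes, where each character attends only to a fixed window of neighbouring characters so that no word segmentation is needed. The class name, window size, and all signatures below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a Local Attention Convolution (LAC) layer,
# reconstructed from the abstract's description alone: every character
# attends to a fixed local window of characters, avoiding segmentation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LocalAttentionConvolution(nn.Module):
    def __init__(self, dim: int, window: int = 5):
        super().__init__()
        assert window % 2 == 1, "odd window so it centres on each character"
        self.window = window
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        self.scale = dim ** -0.5

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim) raw character embeddings
        pad = self.window // 2
        q = self.query(x).unsqueeze(2)                          # (b, n, 1, d)
        # Slide a window over keys/values: (b, n, d, w) -> (b, n, w, d)
        k = F.pad(self.key(x), (0, 0, pad, pad)).unfold(1, self.window, 1)
        v = F.pad(self.value(x), (0, 0, pad, pad)).unfold(1, self.window, 1)
        k, v = k.transpose(2, 3), v.transpose(2, 3)
        # Scaled dot-product attention restricted to the local window
        attn = ((q * k).sum(-1) * self.scale).softmax(dim=-1)  # (b, n, w)
        return (attn.unsqueeze(-1) * v).sum(dim=2)             # (b, n, d)

# Example: context-aware embeddings for 2 sentences of 20 characters each
emb = LocalAttentionConvolution(dim=128)(torch.randn(2, 20, 128))
```

Because every position only looks at a constant-size window, the layer is fully parallel across the sequence, which is consistent with the abstract's emphasis on speed over recurrent models.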

Key words: Character embedding, Dilated convolution, Gated linear unit, Local attention convolution, Residual structure
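Reading the keywords together (dilated convolution, gated linear unit, residual structure), the DGLU block plausibly combines the gated linear unit of Dauphin et al.'s gated convolutional networks with Yu and Koltun's dilated convolutions and a residual connection. Below is a minimal sketch under that assumption; layer names and hyper-parameters are illustrative, not the published architecture.

```python
# Hypothetical sketch of a Dilated Gated Linear Unit (DGLU) block:
# a dilated 1-D convolution whose output is split into a linear path
# and a sigmoid gate (GLU), wrapped in a residual connection.
import torch
import torch.nn as nn

class DGLUBlock(nn.Module):
    def __init__(self, dim: int, kernel: int = 3, dilation: int = 1):
        super().__init__()
        pad = (kernel - 1) * dilation // 2   # "same" padding for odd kernels
        # One convolution producing both the linear part and the gate
        self.conv = nn.Conv1d(dim, 2 * dim, kernel, padding=pad, dilation=dilation)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, dim); Conv1d expects (batch, dim, seq_len)
        h = self.conv(x.transpose(1, 2))
        a, g = h.chunk(2, dim=1)             # linear path and gate
        out = (a * torch.sigmoid(g)).transpose(1, 2)
        return x + out                       # residual connection

# Stacking blocks with growing dilation widens the receptive field
# exponentially while every layer stays parallel across positions:
encoder = nn.Sequential(*[DGLUBlock(128, dilation=2 ** i) for i in range(3)])
```

Unlike an LSTM, nothing here is sequential in the time dimension, which would explain the reported 1.4 to 1.9 times speed-up over recurrent baselines.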

CLC Number: TP391