Computer Science, 2020, Vol. 47, Issue 11: 212-219. doi: 10.11896/jsjkx.191000201
ZHAO Feng, HUANG Jian, ZHANG Zhong-jie
Abstract: Word segmentation and word embedding are usually the first steps in Chinese named entity recognition. However, Chinese text has no explicit delimiters between words, and out-of-vocabulary (OOV) items such as domain-specific terms and rare words severely disturb the computation of word vectors, so the performance of word-embedding-based models is highly sensitive to segmentation quality. In addition, most existing models rely on recurrent neural networks, whose slow computation makes it difficult to meet the demands of industrial applications. To address these problems, this paper builds a named entity recognition model based on an attention mechanism and convolutional neural networks, named LAC-DGLU. To reduce the dependence on word segmentation, a character embedding algorithm based on Local Attention Convolution (LAC) is proposed, which lessens the model's reliance on segmentation quality. To address the slow computation, a gated convolutional neural network, the Dilated Gated Linear Unit (DGLU), is used to accelerate the model. Experimental results on several datasets show that, compared with the best existing models, the proposed model improves F1 by 0.2%-2% and trains 1.4 to 1.9 times faster.
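The paper does not give the DGLU implementation here, but its ingredients are standard: a dilated 1-D convolution over the character sequence, with a second convolution branch acting as a sigmoid gate (as in gated linear units). The following is a minimal NumPy sketch under those assumptions; the function name, weight shapes, and zero-padding scheme are illustrative choices, not the authors' code.

```python
import numpy as np

def dilated_glu(x, W, V, dilation=1):
    """Sketch of a Dilated Gated Linear Unit (DGLU) over a character sequence.

    x: (seq_len, d_in) character embeddings.
    W: (kernel, d_in, d_out) weights of the linear branch.
    V: (kernel, d_in, d_out) weights of the gate branch.
    Returns a (seq_len, d_out) array; the sequence is zero-padded so the
    output length matches the input length.
    """
    k = W.shape[0]
    # Symmetric padding keeps each output position centered on its input.
    pad = (k - 1) * dilation // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    seq_len = x.shape[0]
    a = np.zeros((seq_len, W.shape[2]))  # linear branch
    b = np.zeros((seq_len, V.shape[2]))  # gate branch
    for t in range(seq_len):
        for i in range(k):
            # Dilation skips (dilation - 1) positions between kernel taps,
            # widening the receptive field without extra parameters.
            xt = xp[t + i * dilation]
            a[t] += xt @ W[i]
            b[t] += xt @ V[i]
    # GLU gating: element-wise product with a sigmoid gate.
    return a * (1.0 / (1.0 + np.exp(-b)))
```

Because every output position depends only on a fixed local window, all positions can be computed in parallel, which is the source of the speedup over recurrent models claimed in the abstract.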