Computer Science, 2022, Vol. 49, Issue (1): 292-297. doi: 10.11896/jsjkx.201100007

• Artificial Intelligence •

Name Entity Recognition for Military Based on Domain Adaptive Embedding

LIU Kai1, ZHANG Hong-jun2, CHEN Fei-qiong1   

  1 School of Graduate, Army Engineering University of PLA, Nanjing 210000, China
  2 College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210000, China
  • Received: 2020-11-02  Revised: 2021-03-17  Online: 2022-01-15  Published: 2022-01-18
  • About author: LIU Kai, born in 1996, postgraduate. His main research interests include natural language processing.
    ZHANG Hong-jun, born in 1963, professor, Ph.D. supervisor. His main research interests include military modeling & simulation and data engineering.

Abstract: Military corpora are scarce, so embedding spaces learned from them are of poor quality, which lowers the accuracy of deep neural network models on military named entity recognition. To address this, this paper introduces a domain adaptive method that uses distributed word representations to draw useful information from additional domains when learning military-domain embeddings. First, a domain dictionary is built and combined with the CRF algorithm to perform domain adaptive word segmentation on the collected general-domain and military-domain corpora, which serve as the training corpora for the embeddings; the resulting word vectors are used as features and concatenated with character vectors to enrich the embedding information and to validate the effect of word segmentation. Then, a domain adaptive transformation is applied to the heterogeneous embedding spaces of the general domain and the military domain, producing a domain adaptive embedding that is fed to the BiLSTM-CRF layer of the base model. Finally, recognition performance is evaluated using CoNLL-2000. The experimental results show that, under the same model, the precision (P), recall (R), and F1 score (F1) of the proposed method improve by 2.17%, 1.04%, and 1.59%, respectively, compared with military-domain embeddings trained on a corpus obtained from general word segmentation.
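The splicing of word vectors with character vectors mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each character of a segmented sentence receives its own character vector concatenated with the vector of the word that contains it; the abstract does not spell out the exact scheme, so the function name `splice_char_word`, the zero-vector fallback for unknown items, and the toy embedding values are all hypothetical.

```python
from typing import Dict, List

Vector = List[float]


def splice_char_word(
    words: List[str],
    char_vecs: Dict[str, Vector],
    word_vecs: Dict[str, Vector],
    char_dim: int,
    word_dim: int,
) -> List[Vector]:
    """Return one spliced vector per character of a segmented sentence.

    Each output vector is [character vector | vector of the enclosing word].
    """
    spliced: List[Vector] = []
    for word in words:
        # Assumption: out-of-vocabulary words fall back to a zero vector.
        w_vec = word_vecs.get(word, [0.0] * word_dim)
        for ch in word:
            # Assumption: unknown characters also fall back to a zero vector.
            c_vec = char_vecs.get(ch, [0.0] * char_dim)
            spliced.append(c_vec + w_vec)  # list concatenation, not addition
    return spliced


# Toy 2-dimensional embeddings; the values are made up for illustration.
char_vecs = {
    "导": [0.1, 0.2],
    "弹": [0.3, 0.4],
    "发": [0.5, 0.6],
    "射": [0.7, 0.8],
}
word_vecs = {
    "导弹": [1.0, 2.0],  # "missile"
    "发射": [3.0, 4.0],  # "launch"
}

# Segmented sentence: "导弹 / 发射" -> four characters, four spliced vectors.
features = splice_char_word(["导弹", "发射"], char_vecs, word_vecs, 2, 2)
for vec in features:
    print(vec)
```

In this sketch the segmentation quality directly affects the features: a different split of the same characters would attach different word vectors to them, which is one way word vectors used as features can reflect the effect of word segmentation.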

Key words: Character embedding, Chinese word segmentation, Domain adaptation, Named entity recognition, Word embedding

CLC Number: 

  • TP391.1