Technology and Terminology Detection Oriented National Defense Science

doi:10.11896/jsjkx.190300069

Abstract

With the rapid development of natural language processing, constructing a technology knowledge base oriented to national defense science (ONDS) has received increasing attention. Identifying technologies and terminology is fundamental to building such a knowledge base. To recognize technologies and terminology, this paper explores the use of subwords in the traditional Bi-LSTM+CRF sequence labeling model. In addition, it proposes linguistic features to further improve performance. Experimental results on the annotated dataset show that the proposed approach achieves an F1 score of 71.8%, an improvement of 3.04% over the baseline system, indicating its effectiveness in recognizing ONDS technologies and terminology. It also outperforms BERT-driven models on this task.

Key words: Bi-LSTM+CRF model, Linguistic features, Oriented national defense science, Subwords, Technology and terminology

CLC Number: TP391.1

FENG Luan-luan, LI Jun-hui, LI Pei-feng, ZHU Qiao-ming. Technology and Terminology Detection Oriented National Defense Science[J]. Computer Science, 2019, 46(12): 231-236.
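
The abstract mentions applying subwords within a sequence labeling model. One common way to do this, shown here only as a minimal sketch and not as the authors' actual procedure, is to project word-level BIO tags onto subword units: the first subword of a word keeps the word's tag, and every later subword continues the same entity with an `I-` tag. The function name `project_bio_to_subwords`, the label names `TECH` and `TERM`, and the sample segmentation are hypothetical; the `@@` continuation marker follows the convention used by common BPE tools.

```python
from typing import List, Tuple


def project_bio_to_subwords(
    subwords_per_word: List[List[str]], tags: List[str]
) -> List[Tuple[str, str]]:
    """Expand word-level BIO tags to subword-level BIO tags.

    subwords_per_word[i] holds the subword pieces of word i (already
    segmented by some external tool; the segmentation itself is an
    assumption here), and tags[i] is that word's BIO tag.
    """
    if len(subwords_per_word) != len(tags):
        raise ValueError("each word needs exactly one tag")

    labeled: List[Tuple[str, str]] = []
    for pieces, tag in zip(subwords_per_word, tags):
        for position, piece in enumerate(pieces):
            if position == 0 or tag == "O":
                # First piece keeps the word's tag; "O" never changes.
                labeled.append((piece, tag))
            else:
                # Later pieces continue the entity: B-X and I-X both become I-X.
                labeled.append((piece, "I-" + tag[2:]))
    return labeled


# Hypothetical example: "phased array radar" as one technology mention.
example_words = [["phased"], ["ar@@", "ray"], ["radar"], ["works"]]
example_tags = ["B-TECH", "I-TECH", "I-TECH", "O"]
example_output = project_bio_to_subwords(example_words, example_tags)
```

Under this scheme the subword sequence can be fed to any tagger that expects one label per input unit; predictions would then be merged back to word level by reading the tag of each word's first subword.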


