Computer Science (计算机科学), 2019, Vol. 46, Issue 12: 231-236. doi: 10.11896/jsjkx.190300069
冯鸾鸾, 李军辉, 李培峰, 朱巧明
FENG Luan-luan, LI Jun-hui, LI Pei-feng, ZHU Qiao-ming
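Abstract: With the development of natural language processing technology, growing attention is being paid to building knowledge graphs for the national defense science and technology domain. Recognizing the technologies and terms of this domain is the foundation for building its technology knowledge graph. Using a corpus from this domain, this paper explores the application of subword units to the conventional Bi-LSTM+CRF sequence labeling model for the task of technology and term recognition. In addition, in view of the characteristics of the task, it proposes linguistic features suited to technology and term recognition. Experimental results on the same corpus show that the F1 score for technology and term recognition reaches 71.80%, an improvement of 3.04% over the baseline system, so the approach recognizes technologies and terms in the national defense science and technology domain fairly well. The proposed method also outperforms a BERT-based technical term recognition method.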
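The abstract does not say how word-level labels are carried over once the input is split into subword units. As a minimal sketch, assuming the common BIO convention in which only the first subword of an entity-initial word keeps the `B-` tag, the projection could look like the following. The `TECH` label, the `project_bio_to_subwords` helper, and the `toy_segment` function are hypothetical names introduced for illustration; `toy_segment` is only a stand-in for a learned segmenter such as BPE and is not the authors' method.

```python
from typing import Callable, List, Tuple


def toy_segment(word: str, size: int = 3) -> List[str]:
    """Hypothetical stand-in for a learned subword segmenter (e.g. BPE).

    It simply cuts a word into fixed-size character chunks.
    """
    return [word[i:i + size] for i in range(0, len(word), size)] or [word]


def project_bio_to_subwords(
    words: List[str],
    tags: List[str],
    segment: Callable[[str], List[str]] = toy_segment,
) -> Tuple[List[str], List[str]]:
    """Spread word-level BIO tags over the subword units of each word.

    Assumed convention: the first subword keeps the word's tag; later
    subwords of a B- word become I- so the entity is opened only once.
    """
    if len(words) != len(tags):
        raise ValueError("words and tags must have the same length")

    sub_tokens: List[str] = []
    sub_tags: List[str] = []
    for word, tag in zip(words, tags):
        for i, piece in enumerate(segment(word)):
            sub_tokens.append(piece)
            if tag.startswith("B-") and i > 0:
                sub_tags.append("I-" + tag[2:])
            else:
                sub_tags.append(tag)
    return sub_tokens, sub_tags


# Illustrative input; the sentence and the TECH label are made up.
words = ["phased", "array", "radar", "works"]
tags = ["B-TECH", "I-TECH", "I-TECH", "O"]
sub_tokens, sub_tags = project_bio_to_subwords(words, tags)
```

The resulting `sub_tokens` and `sub_tags` sequences have equal length and could then be fed to any sequence labeler, such as a Bi-LSTM+CRF, in place of the word-level sequences.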