Computer Science ›› 2019, Vol. 46 ›› Issue (6A): 540-546.

• Interdiscipline & Application •

Construction of Military Corpus for Entity Annotation

ZHOU Bin-bin, ZHANG Hong-jun, ZHANG Rui, FENG Yun-tian, XU You-wei   

  1. School of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  • Online: 2019-06-14 Published: 2019-07-02

Abstract: The key to building a military corpus is identifying and annotating military corpus text. For the entities in a military corpus, this paper proposes a unified part-of-speech tagging specification for military language together with a military corpus annotation specification, and designs a framework that automatically expands military corpora using feature extraction based on a military language dictionary. With the help of a high-precision classifier, the framework selects and extracts basic features and combines them with typical features of the language set to build the feature space. Entity recognition on the military corpus is corrected against the language dictionary, and the recognized entities are then annotated according to the specified annotation standard and part-of-speech tagging specification, yielding a large-scale, high-quality military corpus. Experiments show that the framework performs corpus entity recognition and corpus annotation well, supports the construction of military corpora, and has broad application prospects in the military domain.

Key words: Feature extraction, Military corpus, Military entity’s annotation, Military speech tagging

CLC Number: 

  • TP391
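
The dictionary-based annotation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's framework: the classifier and feature space are omitted, and plain forward maximum matching against a lexicon stands in for the dictionary correction. The dictionary entries, the tag names (EQUIP, UNIT), the BIO labeling scheme, and the function name `annotate` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical lexicon: entity phrase (as a tuple of lowercase tokens) -> entity tag.
# Entries and tag names are illustrative only, not the paper's specification.
MILITARY_DICT: Dict[Tuple[str, ...], str] = {
    ("aircraft", "carrier"): "EQUIP",
    ("carrier", "strike", "group"): "UNIT",
    ("destroyer",): "EQUIP",
}


def annotate(tokens: List[str],
             lexicon: Dict[Tuple[str, ...], str]) -> List[Tuple[str, str]]:
    """Label tokens with BIO tags using forward maximum matching."""
    max_len = max((len(key) for key in lexicon), default=0)
    labeled: List[Tuple[str, str]] = []
    i = 0
    while i < len(tokens):
        match_len, tag = 0, None
        # Try the longest candidate span starting at position i first.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            key = tuple(t.lower() for t in tokens[i:i + n])
            if key in lexicon:
                match_len, tag = n, lexicon[key]
                break
        if tag is None:
            # No dictionary entry starts here: mark the token as outside.
            labeled.append((tokens[i], "O"))
            i += 1
        else:
            # First token of the match gets B-, the rest get I-.
            labeled.append((tokens[i], f"B-{tag}"))
            for tok in tokens[i + 1:i + match_len]:
                labeled.append((tok, f"I-{tag}"))
            i += match_len
    return labeled


if __name__ == "__main__":
    sentence = "The aircraft carrier escorted a destroyer".split()
    for token, label in annotate(sentence, MILITARY_DICT):
        print(f"{token}\t{label}")
```

Longest-match-first is chosen here so that a multi-token entry is preferred over any shorter entry beginning at the same position. In a fuller pipeline, labels of this kind could be compared with a classifier's predictions to correct them, which is closer to the role the abstract gives the dictionary.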