Computer Science ›› 2022, Vol. 49 ›› Issue (6A): 133-139. doi: 10.11896/jsjkx.210400132

• Intelligent Computing •

Construction of Named Entity Recognition Corpus in Field of Military Command and Control Support

DU Xiao-ming, YUAN Qing-bo, YANG Fan, YAO Yi, JIANG Xiang

  1. College of Command and Control Engineering, Army Engineering University of PLA, Nanjing 210007, China
  • Online: 2022-06-10 Published: 2022-06-08
  • Corresponding author: YUAN Qing-bo (12661967@qq.com)
  • About author: (bdar@163.com) DU Xiao-ming, born in 1970, Ph.D., professor, Ph.D. supervisor. His main research interests include NLP and knowledge graph.
    YUAN Qing-bo, born in 1989, postgraduate. His main research interests include NLP and knowledge graph.
  • Supported by:
    Military Postgraduate Funding Projects of the PLA (JY2019C078).

Abstract: Constructing a knowledge graph for the military command and control support domain is an important research direction in informatized military equipment support. Named entity recognition models used to build knowledge graphs in the support domain currently lack a basic training corpus. To address this, and based on an analysis of related research, this paper designs and implements a GUI system for constructing named entity recognition corpora, built on the basic PyQt5 application framework. First, the overall system architecture and the corpus processing workflow are briefly described. Second, the system's five functional modules are introduced in detail: data preprocessing, annotation scheme, automatic annotation, annotation analysis, and encoding conversion. Within the automatic annotation module, the automatic annotation and automatic de-duplication algorithms are the main difficulty and the core of the whole system. Finally, the graphical user interface of each functional module is implemented using the basic PyQt5 application framework and its functional components. The system can automatically process a variety of original equipment manuals on dedicated military computers and quickly generate the corpus needed to train named entity recognition models, providing effective technical support for the subsequent construction of knowledge graphs in the corresponding domain.

Key words: Automatic annotation, Corpus, Knowledge graph, Military command and control support, Named entity recognition
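
The abstract names automatic annotation and automatic de-duplication as the core of the system but does not give the algorithms. The sketch below is therefore only an illustration under stated assumptions: it assumes a dictionary-based matcher, resolves overlapping matches by keeping the longest one, and emits character-level BIO tags. The function names `auto_annotate`, `deduplicate`, and `to_bio`, the sample lexicon, and the labels `EQUIP` and `PART` are all hypothetical and do not come from the paper.

```python
from typing import Dict, List, Tuple

# A span is (start, end_exclusive, label); the labels used here are hypothetical.
Span = Tuple[int, int, str]


def auto_annotate(text: str, lexicon: Dict[str, str]) -> List[Span]:
    """Assumed approach: mark every occurrence of every lexicon term."""
    spans: List[Span] = []
    for term, label in lexicon.items():
        start = text.find(term)
        while start != -1:
            spans.append((start, start + len(term), label))
            start = text.find(term, start + 1)
    return spans


def deduplicate(spans: List[Span]) -> List[Span]:
    """Drop exact repeats and overlapping spans, preferring the longest match."""
    kept: List[Span] = []
    # Longest spans first; ties are broken by the earliest start position.
    for span in sorted(set(spans), key=lambda s: (-(s[1] - s[0]), s[0])):
        # Keep the span only if it does not overlap anything already kept.
        if all(span[1] <= k[0] or span[0] >= k[1] for k in kept):
            kept.append(span)
    return sorted(kept)


def to_bio(text: str, spans: List[Span]) -> List[Tuple[str, str]]:
    """Convert non-overlapping spans into character-level BIO tags."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return list(zip(text, tags))


# Hypothetical sample sentence and lexicon, for illustration only.
sample_text = "检查雷达天线"
sample_lexicon = {"雷达": "EQUIP", "雷达天线": "PART"}
sample_spans = deduplicate(auto_annotate(sample_text, sample_lexicon))
sample_tags = to_bio(sample_text, sample_spans)
```

In this sample the shorter match "雷达" lies inside the longer match "雷达天线", so the de-duplication step keeps only the longer span before the tags are written out. A real system would likely need a richer conflict policy, for example one driven by the annotation scheme.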

CLC Number:

  • TP391