Computer Science (计算机科学), 2024, Vol. 51, Issue 6A: 230800150-7. doi: 10.11896/jsjkx.230800150
LAI Xin (赖欣), LI Sining (李思宁), LIANG Changsheng (梁昌盛), ZHANG Hengyan (张恒嫣)
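Abstract: The Aeronautical Information Publication (AIP) is the primary vehicle recommended by the International Civil Aviation Organization (ICAO) for presenting each state's aeronautical information. It compiles a large volume of aeronautical data and operational restriction information in tabular form. To support intelligent querying of AIP material, and the mining and use of the static data it contains, the tabular information in the AIP needs feature extraction and structured processing. Taking the tabular information in the AIP as the research object, this paper proposes an ontology-driven method for the structured extraction of aeronautical information tables. First, an ontology framework for aeronautical information domain knowledge is built so that the domain knowledge is described in a unified, standardized way. Second, Document AI is used to study and preprocess the layout structure of the table documents, and a random forest algorithm and a conditional random field (CRF) model are used to verify and analyze the extraction of feature entities. Experimental results show that the proposed method can effectively extract feature entities from aeronautical information tables, providing a reference for deeper mining of static data in the aeronautical information domain.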
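To make the idea of ontology-driven structuring concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny hand-written ontology fragment in which one class (`Runway`) lists its properties along with the table-header spellings that map to each property. The class name, property names, header aliases, and the sample row are all hypothetical and chosen only for illustration; the abstract does not give the actual ontology or the layout-analysis and classification steps, which are omitted here.

```python
# Hypothetical ontology fragment: the class, its properties and the header
# aliases are illustrative assumptions, not taken from the paper.
ONTOLOGY = {
    "Runway": {
        "designator": {"rwy", "designation", "runway designator"},
        "length_m": {"length (m)", "dimensions of rwy (m)"},
        "surface": {"surface", "surface of rwy"},
    },
}


def normalize(header: str) -> str:
    """Lower-case a header and collapse runs of whitespace."""
    return " ".join(header.lower().split())


def map_headers(headers, ontology_class):
    """Map column index -> ontology property for headers that match an alias."""
    props = ONTOLOGY[ontology_class]
    mapping = {}
    for idx, header in enumerate(headers):
        key = normalize(header)
        for prop, aliases in props.items():
            if key in aliases:
                mapping[idx] = prop
                break
    return mapping


def structure_table(headers, rows, ontology_class):
    """Turn raw table rows into records keyed by ontology properties."""
    mapping = map_headers(headers, ontology_class)
    records = []
    for row in rows:
        record = {"class": ontology_class}
        for idx, prop in mapping.items():
            record[prop] = row[idx].strip()
        records.append(record)
    return records


# Made-up sample table; columns without an ontology match are dropped.
headers = ["RWY", "Length  (m)", "Surface", "Remarks"]
rows = [["18L ", "3600", "Concrete", "NIL"]]
records = structure_table(headers, rows, "Runway")
```

In this sketch the ontology acts purely as a controlled vocabulary for column headers. A fuller pipeline along the lines the abstract describes would place a layout-analysis step before it and a learned entity classifier after it.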