Computer Science ›› 2025, Vol. 52 ›› Issue (11): 22-29. doi: 10.11896/jsjkx.241000049
皮乾坤, 卢记仓, 祝涛杰, 彭悦翎
PI Qiankun, LU Jicang, ZHU Taojie, PENG Yueling
Abstract: Knowledge extraction aims to derive structured knowledge from complex information resources. However, existing knowledge extraction methods typically depend on large amounts of manually annotated data, which makes them costly. To address this, a zero-shot knowledge extraction method enhanced by large language models is proposed; it relies on no manually annotated data, instead exploiting the strong semantic reasoning ability of large models to automate knowledge extraction and reduce annotation cost. Specifically, the test data are first preprocessed into a unified format, and a cross-domain general-purpose large model is fine-tuned on them to obtain a data annotation model, which labels the relevant texts to produce the corresponding entity and attribute inference information. A new chain-of-thought prompting paradigm is then constructed around this information, and a domain-specific large model is further fine-tuned on it to obtain the knowledge extraction model; the model is iteratively retrained on progressively augmented data to improve its performance. Finally, a large model is used to enrich the attribute information of the test set, strengthening the extraction model's understanding of the text and thereby its extraction performance. Benchmark experiments on multiple large models demonstrate that the proposed zero-shot knowledge extraction framework delivers significant performance gains.
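To make the two-stage flow in the abstract concrete, below is a minimal sketch, not the authors' released code: it assumes a stage-1 data annotation model has already produced entity and attribute candidates, and shows how those candidates might be wrapped in a chain-of-thought style prompt for the stage-2 knowledge extraction model. The Annotation structure, the build_cot_prompt() wording, and the call_llm() helper are all illustrative assumptions.

```python
# Sketch of the annotate-then-extract prompting flow described in the
# abstract. All names and prompt wording here are assumptions, not the
# paper's actual implementation.

from dataclasses import dataclass, field
from typing import List


@dataclass
class Annotation:
    """Entity/attribute inference information emitted by the stage-1
    data annotation model (structure assumed for illustration)."""
    text: str
    entities: List[str] = field(default_factory=list)
    attributes: List[str] = field(default_factory=list)


def build_cot_prompt(ann: Annotation) -> str:
    """Wrap stage-1 candidates in a chain-of-thought prompt for the
    stage-2, domain-specific extraction model (wording assumed)."""
    return (
        "You are a knowledge extraction assistant.\n"
        f"Text: {ann.text}\n"
        f"Step 1 - candidate entities: {', '.join(ann.entities) or 'none'}\n"
        f"Step 2 - candidate attributes: {', '.join(ann.attributes) or 'none'}\n"
        "Step 3 - reason over the candidates and output the final "
        "(entity, attribute, value) triples as JSON."
    )


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for the fine-tuned extraction model; in
    practice this would be a real inference call to the deployed LLM."""
    raise NotImplementedError


if __name__ == "__main__":
    ann = Annotation(
        text="Marie Curie won the Nobel Prize in Physics in 1903.",
        entities=["Marie Curie", "Nobel Prize in Physics"],
        attributes=["award", "year"],
    )
    print(build_cot_prompt(ann))
```

Exposing the stage-1 candidates as explicit intermediate steps, rather than asking for triples directly, is one plausible reading of the "chain-of-thought prompting paradigm" the abstract describes; the iterative retraining and test-set attribute enrichment steps would sit around this core loop.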