Computer Science ›› 2025, Vol. 52 ›› Issue (11): 22-29. doi: 10.11896/jsjkx.241000049

• Large Language Model Technology Research and Application •


Zero-shot Knowledge Extraction Method Based on Large Language Model Enhancement

PI Qiankun, LU Jicang, ZHU Taojie, PENG Yueling   

  1. School of Data and Target Engineering,Information Engineering University,Zhengzhou 450001,China
  • Received: 2024-10-11 Revised: 2024-12-13 Online: 2025-11-15 Published: 2025-11-06
  • Corresponding author: LU Jicang (lujicang@sina.com)
  • About author: PI Qiankun, born in 2000, postgraduate (piqiankun2000@163.com). His main research interests include knowledge graph, stance detection and large language models.
    LU Jicang, born in 1985, Ph.D., associate professor. His main research interests include knowledge reasoning and social network analysis.
  • Supported by: Natural Science Foundation of Henan Province (222300420590).


Abstract: The knowledge extraction task aims to extract structured knowledge from complex information resources. However, existing research on knowledge extraction often relies on a large amount of manually annotated data, which leads to high costs. To address this challenge, this paper proposes a zero-shot knowledge extraction method enhanced by large language models, which performs knowledge extraction automatically without relying on any manually annotated data, leveraging the strong semantic reasoning capabilities of large models to reduce annotation costs. Specifically, the method first preprocesses the format of the test set data and fine-tunes a cross-domain general-purpose large model on it to obtain a data annotation model. This model is then used to annotate relevant texts and produce the corresponding entity and attribute inference information. Next, a new chain-of-thought prompting paradigm is constructed for this information, and a domain-specific large model is further fine-tuned on it to obtain a knowledge extraction model. Additionally, the training data are continuously enlarged and the model is trained iteratively to improve its performance. Finally, the large model is used to enhance the attribute information of the test set, improving the knowledge extraction model's understanding of the text and thereby its extraction performance. Benchmark experiments on multiple large models demonstrate that the proposed zero-shot knowledge extraction framework achieves significant performance improvements.
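To make the pipeline described in the abstract concrete, below is a minimal Python sketch of the three stages it outlines: automatic annotation, chain-of-thought prompt construction, and iterative fine-tuning on a growing training set. All names here (annotate, fine_tune, build_cot_prompt) and the prompt template are illustrative assumptions for exposition only; they do not reproduce the paper's actual prompts or training code.

```python
# Sketch of the zero-shot extraction workflow from the abstract.
# Every helper below is a hypothetical stand-in, not the paper's implementation.
from dataclasses import dataclass
from typing import List


@dataclass
class Annotation:
    text: str              # raw input sentence
    entities: List[str]    # entities inferred by the data annotation model
    attributes: List[str]  # attribute inference information


# Hypothetical chain-of-thought prompting paradigm: expose the intermediate
# entity/attribute reasoning before asking for the structured output.
COT_TEMPLATE = (
    "Text: {text}\n"
    "Step 1 - candidate entities: {entities}\n"
    "Step 2 - attribute clues: {attributes}\n"
    "Step 3 - output the structured (entity, attribute, value) triples:"
)


def build_cot_prompt(a: Annotation) -> str:
    """Assemble one chain-of-thought training prompt from an automatic annotation."""
    return COT_TEMPLATE.format(
        text=a.text,
        entities=", ".join(a.entities),
        attributes=", ".join(a.attributes),
    )


def annotate(texts: List[str]) -> List[Annotation]:
    """Stand-in for the fine-tuned data annotation model; returns dummy labels."""
    return [Annotation(t, ["<entity>"], ["<attribute>"]) for t in texts]


def fine_tune(model: object, prompts: List[str]) -> object:
    """Stand-in for fine-tuning; a real run would update (adapter) weights here."""
    return model


def iterative_training(model: object, batches: List[List[str]]) -> object:
    """Grow the CoT training set batch by batch and re-fine-tune each round,
    mirroring the iterative training with increasing data that the abstract describes."""
    train_prompts: List[str] = []
    for batch in batches:
        train_prompts += [build_cot_prompt(a) for a in annotate(batch)]
        model = fine_tune(model, train_prompts)
    return model


if __name__ == "__main__":
    demo = annotate(["Zhengzhou is the capital of Henan Province."])
    print(build_cot_prompt(demo[0]))
```

In a real implementation, fine_tune would typically wrap a parameter-efficient method such as LoRA, and annotate would call the cross-domain general-purpose model obtained in the first fine-tuning stage rather than return placeholder labels.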

Key words: Large language model, Zero-shot knowledge extraction, Data annotation model, Chain of thought, Knowledge extraction model

  • CLC number: TP391