Computer Science ›› 2025, Vol. 52 ›› Issue (11): 13-21. doi: 10.11896/jsjkx.241200198

• Research and Application of Large Language Model Technology •

Research on Domain Knowledge Question Answering via Large Language Models with Compositional Context Prompting

FANG Quan1, ZHANG Jinlong2, WANG Bingqian1, HU Jun3   

  1 School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China
    2 Henan Institute of Advanced Technology, Zhengzhou University, Zhengzhou 450002, China
    3 School of Computing, National University of Singapore, Singapore 117417, Singapore
  • Received: 2024-12-30 Revised: 2025-04-13 Online: 2025-11-15 Published: 2025-11-06
  • Corresponding author: FANG Quan (qfang@bupt.edu.cn)
  • About author: FANG Quan, born in 1988, professor. His main research interest is multimedia knowledge computing.
  • Supported by:
    Beijing Natural Science Foundation (JQ24019), Open Project Program of State Key Laboratory of CNS/ATM (2024B31), and National Natural Science Foundation of China (62036012).

Abstract: In recent years, the rapid development of large language models has garnered widespread attention across various sectors. While these models naturally excel at a wide range of natural language processing tasks, their performance in domain-specific question answering often falls short: lacking training on vertical domains, they generate answers that are unreliable and poorly suited to the domain. To improve the performance of domain knowledge question answering systems, this paper proposes a novel approach based on compositional context prompting for large language models. A compositional context prompt consists of a domain knowledge context and a question-answer example context. The domain knowledge context is retrieved from a domain knowledge base by a dense retriever trained with contrastive learning, which strengthens the model's command of domain expertise. The question-answer example context is obtained by semantic similarity retrieval over the training set, which improves the model's understanding of question intent. Finally, the assembled compositional context prompt is fed into a large language model fine-tuned on domain knowledge to generate the final answer. Extensive experiments and comprehensive comparisons with baseline models show that the proposed method improves precision by 15.91% and recall by 16.14% on the BERTScore metric compared with ChatGPT, with an F1 score improvement of 15.87%.
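To make the pipeline concrete, below is a minimal sketch of the two retrieval steps and the prompt assembly described in the abstract. It is an illustration under assumptions, not the authors' released code: a text2vec-style sentence encoder stands in for the paper's contrastive-learning dense retriever, `llm_generate` is a placeholder callable for the domain fine-tuned model, and the checkpoint name and prompt wording are assumptions.

```python
# Sketch of compositional context prompting: (1) retrieve domain knowledge
# context, (2) retrieve similar QA example context, (3) assemble the prompt
# and query a domain fine-tuned LLM. Encoder checkpoint and prompt wording
# are illustrative assumptions.
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("shibing624/text2vec-base-chinese")  # assumed encoder

def retrieve(query_emb, items, item_embs, k=3):
    """Return the k items whose embeddings are most similar to the query."""
    hits = util.semantic_search(query_emb, item_embs, top_k=k)[0]
    return [items[h["corpus_id"]] for h in hits]

def answer(question, kb_passages, kb_embs, train_pairs, train_q_embs, llm_generate):
    q_emb = encoder.encode(question, convert_to_tensor=True)
    # 1) Domain knowledge context: dense retrieval over the knowledge base.
    knowledge = retrieve(q_emb, kb_passages, kb_embs)
    # 2) QA example context: training-set (question, answer) pairs whose
    #    questions are semantically closest to the input question.
    examples = retrieve(q_emb, train_pairs, train_q_embs)
    # 3) Assemble the compositional context prompt and generate the answer
    #    with the domain fine-tuned LLM (placeholder callable).
    prompt = "Domain knowledge:\n" + "\n".join(knowledge) + "\n\n"
    prompt += "".join(f"Q: {q}\nA: {a}\n\n" for q, a in examples)
    prompt += f"Q: {question}\nA:"
    return llm_generate(prompt)

# Offline, the two corpora are embedded once:
#   kb_embs      = encoder.encode(kb_passages, convert_to_tensor=True)
#   train_q_embs = encoder.encode([q for q, _ in train_pairs], convert_to_tensor=True)
```

In practice the knowledge base passages and training-set questions are embedded once offline, and the dense retriever itself is trained with a contrastive objective (e.g., an InfoNCE-style loss that pulls each question toward its relevant passage and away from in-batch negatives), though the abstract does not pin down the exact loss used.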

Key words: Large language models, Domain knowledge question-answering, Compositional context prompting, Contrastive learning, Retrieval

CLC Number: TP391