Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240800141-7. doi: 10.11896/jsjkx.240800141

• Large Language Model Technology and Applications •


Study on Open-domain Question Answering Methods Based on Retrieval-augmented Generation

BAI Yuntian, HAO Wenning, JIN Dawei   

  1. College of Command & Control Engineering, Army Engineering University of PLA, Nanjing 210000, China
  • Online:2025-06-16 Published:2025-06-12
  • Corresponding author:HAO Wenning(hwnbox@aeu.edu.cn)
  • About author:BAI Yuntian(byt825990802@foxmail.com),born in 2000,postgraduate.His main research interests include natural language processing.
    HAO Wenning,born in 1971,Ph.D,professor,Ph.D supervisor.His main research interests include data mining and machine learning.
  • Supported by:
    National Defense Industrial Technology Development Program(JCKY2020601B018).


Abstract: Large language models have made significant progress in natural language processing tasks,but their reliance on knowledge encapsulated within their parameters can easily lead to hallucinations.To mitigate this issue,retrieval-augmented generation reduces the risk of errors through information retrieval.However,the documents retrieved by existing methods often contain inaccurate or misleading information,and these methods lack discriminative accuracy in evaluating document relevance.To address these challenges,this study designs a concise and efficient method that combines sparse retrieval with dense retrieval,capturing both lexical overlap and semantic relevance.Furthermore,a ranker is introduced to reorder the retrieved candidate passages,with the scores from both sparse and dense retrieval injected into the ranker's input to further improve the quality of passage ranking.To validate the effectiveness of the proposed method,experiments are conducted on the SQuAD and HotpotQA datasets and compared against existing baseline methods.The results demonstrate that the proposed method offers a significant advantage in improving question-answering performance.
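
As a rough illustration of the retrieve-then-rerank pipeline the abstract describes, the Python sketch below fuses BM25 sparse scores with dense embedding similarity after min-max normalization, then passes the fused shortlist to a cross-encoder reranker whose input is prefixed with both retrieval scores. Everything concrete here is an assumption: the model names, the normalization, the equal fusion weights, and the textual score-injection scheme are stand-ins chosen only to make the idea runnable, not the paper's actual components.

```python
# Minimal sketch of hybrid sparse+dense retrieval with score-aware reranking.
import numpy as np
from rank_bm25 import BM25Okapi                      # sparse (lexical) scorer
from sentence_transformers import SentenceTransformer, CrossEncoder

passages = [
    "The Eiffel Tower is located in Paris, France.",
    "Paris is the capital and largest city of France.",
    "The Colosseum is an ancient amphitheatre in Rome.",
]
query = "Where is the Eiffel Tower?"

# Sparse retrieval: BM25 over whitespace tokens captures lexical overlap.
bm25 = BM25Okapi([p.lower().split() for p in passages])
sparse = np.asarray(bm25.get_scores(query.lower().split()))

# Dense retrieval: embedding similarity captures semantic relevance.
encoder = SentenceTransformer("all-MiniLM-L6-v2")    # stand-in dense model
q_emb = encoder.encode(query, normalize_embeddings=True)
p_emb = encoder.encode(passages, normalize_embeddings=True)
dense = p_emb @ q_emb

def minmax(x):
    # Put both score distributions on a common [0, 1] scale before fusion.
    return (x - x.min()) / (x.max() - x.min() + 1e-9)

fused = 0.5 * minmax(sparse) + 0.5 * minmax(dense)   # assumed equal weights
candidates = np.argsort(-fused)[:3]                  # shortlist for the ranker

# Reranking: a cross-encoder scores (query, passage) pairs. To mimic the
# "score-infused" ranker input, both retrieval scores are prepended to the
# passage text; the paper's exact injection mechanism may differ.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # stand-in
pairs = [(query, f"sparse={sparse[i]:.2f} dense={dense[i]:.2f} {passages[i]}")
         for i in candidates]
order = np.argsort(-np.asarray(reranker.predict(pairs)))
ranked = [passages[candidates[i]] for i in order]    # final context for the LLM
print(ranked[0])
```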

Key words: Large language model, Retrieval-augmented generation, Information retrieval

CLC Number: TP391

References:
[1]CHEN D Q,FISCH A,WESTON J,et al.Reading Wikipedia to Answer Open-Domain Questions [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Vancouver,Canada:Association for Computational Linguistics,2017:1870-1879.
[2]GAO S,REN Z C,ZHAO Y H,et al.Product-Aware Answer Generation in E-Commerce Question-Answering [C]//Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining.Melbourne,VIC,Australia:ACM,2019:531-539.
[3]MIN S,BOYD-GRABER J,ALBERTI C,et al.NeurIPS 2020 EfficientQA Competition:Systems,Analyses and Lessons Learned [C]//Proceedings of the NeurIPS 2020 Competition and Demonstration Track.PMLR,2021:86-111.
[4]PETRONI F,PIKTUS A,FAN A,et al.KILT:a Benchmark for Knowledge Intensive Language Tasks [C]//Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies.Online:Association for Computational Linguistics,2021:2523-2544.
[5]CHEN D Q,FISCH A,WESTON J,et al.Reading Wikipedia to Answer Open-Domain Questions [C]//Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Vancouver,Canada:Association for Computational Linguistics,2017:1870-1879.
[6]KARPUKHIN V,OGUZ B,MIN S,et al.Dense Passage Retrieval for Open-Domain Question Answering [C]//Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing(EMNLP).Online:Association for Computational Linguistics,2020:6769-6781.
[7]LEWIS P,PEREZ E,PIKTUS A,et al.Retrieval-augmented generation for knowledge-intensive NLP tasks [C]//Proceedings of the 34th International Conference on Neural Information Processing Systems.Red Hook,NY,USA:Curran Associates Inc.,2020:793-809.
[8]IZACARD G,GRAVE E.Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering [C]//Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics:Main Volume.Online:Association for Computational Linguistics,2021:874-880.
[9]BROWN T,MANN B,RYDER N,et al.Language models are few-shot learners [C]//Proceedings of the 34th International Conference on Neural Information Processing Systems.Red Hook,NY,USA:Curran Associates Inc.,2020:159-183.
[10]ZHAO W X,ZHOU K,LI J,et al.A Survey of Large Language Models [J].arXiv:2303.18223,2023.
[11]MALLEN A,ASAI A,ZHONG V,et al.When Not to Trust Language Models:Investigating Effectiveness of Parametric and Non-Parametric Memories [C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Toronto,Canada:Association for Computational Linguistics,2023:9802-9822.
[12]REN R Y,WANG Y H,QU Y Q,et al.Investigating the Factual Knowledge Boundary of Large Language Models with Retrieval Augmentation [J].arXiv:2307.11019,2023.
[13]XU J,SZLAM A,WESTON J.Beyond Goldfish Memory:Long-Term Open-Domain Conversation [C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Dublin,Ireland:Association for Computational Linguistics,2022:5180-5197.
[14]KOMEILI M,SHUSTER K,WESTON J.Internet-Augmented Dialogue Generation [C]//Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Dublin,Ireland:Association for Computational Linguistics,2022:8460-8478.
[15]GUU K,LEE K,TUNG Z,et al.REALM:Retrieval-Augmented Language Model Pre-Training [C]//International Conference on Machine Learning.2020:3929-3938.
[16]LIU N F,LIN K,HEWITT J,et al.Lost in the Middle:How Language Models Use Long Contexts [J].Transactions of the Association for Computational Linguistics,2024,12:157-173.
[17]ASAI A,WU Z Q,WANG Y Z,et al.Self-RAG:Learning to Retrieve,Generate,and Critique through Self-Reflection [J].arXiv:2310.11511,2023.
[18]LUO H Y,CHUANG Y S,GONG Y,et al.Search Augmented Instruction Learning [C]//Findings of the Association for Computational Linguistics:EMNLP 2023.Singapore:Association for Computational Linguistics,2023:3717-3729.
[19]WANG Y,LI P,SUN M,et al.Self-Knowledge Guided Retrieval Augmentation for Large Language Models [C]//Findings of the Association for Computational Linguistics:EMNLP 2023.Singapore:Association for Computational Linguistics,2023:10303-10315.
[20]SHI F,CHEN X Y,MISRA K,et al.Large Language Models Can Be Easily Distracted by Irrelevant Context [C]//Proceedings of the 40th International Conference on Machine Learning.PMLR,2023:31210-31227.
[21]SCIAVOLINO C,ZHONG Z X,LEE J,et al.Simple Entity-Centric Questions Challenge Dense Retrievers [C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.Online and Punta Cana,Dominican Republic:Association for Computational Linguistics,2021:6138-6148.
[22]QI P,LEE H,SIDO T,et al.Answering Open-Domain Questions of Varying Reasoning Steps from Text [C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.Online and Punta Cana,Dominican Republic:Association for Computational Linguistics,2021:3599-3614.
[23]LIU Y,HASHIMOTO K,ZHOU Y B,et al.Dense Hierarchical Retrieval for Open-domain Question Answering [C]//Findings of the Association for Computational Linguistics:EMNLP 2021.Punta Cana,Dominican Republic:Association for Computational Linguistics,2021:188-200.
[24]ZHU Y T,YUAN H Y,WANG S T,et al.Large Language Models for Information Retrieval:A Survey [J].arXiv:2308.07107,2023.
[25]ZHUANG H L,QIN Z,HUI K,et al.Beyond Yes and No:Improving Zero-Shot LLM Rankers via Scoring Fine-Grained Relevance Labels [C]//Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies(Volume 2:Short Papers).Mexico City,Mexico:Association for Computational Linguistics,2024:358-370.
[26]MA X G,ZHANG X Y,PRADEEP R,et al.Zero-Shot Listwise Document Reranking with a Large Language Model [J].arXiv:2305.02156,2023.
[27]NOGUEIRA R,YANG W,CHO K,et al.Multi-Stage Document Ranking with BERT [J].arXiv:1910.14424,2019.
[28]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding [C]//Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies,Volume 1(Long and Short Papers).Minneapolis,Minnesota:Association for Computational Linguistics,2019:4171-4186.
[29]BOUALILI L,MORENO J G,BOUGHANEM M.Highlighting exact matching via marking strategies for ad hoc document ranking with pretrained contextualized language models [C]//Actes de la 18e Conférence en Recherche d’Information et Applications(CORIA).Paris,France:ATALA,2023:201-201.
[30]GAO L,DAI Z,CALLAN J.Rethink Training of BERT Rerankers in Multi-stage Retrieval Pipeline [C]//Advances in Information Retrieval:43rd European Conference on IR Research,ECIR 2021,Proceedings,Part II.Berlin,Heidelberg:Springer-Verlag,2021:280-286.
[31]YANG Z,QI P,ZHANG S,et al.HotpotQA:A Dataset for Diverse,Explainable Multi-hop Question Answering [C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.Brussels,Belgium:Association for Computational Linguistics,2018:2369-2380.
[32]RAJPURKAR P,ZHANG J,LOPYREV K,et al.SQuAD:100,000+ Questions for Machine Comprehension of Text [C]//Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing.Austin,Texas:Association for Computational Linguistics,2016:2383-2392.
[33]LIN J,MA X,LIN S,et al.Pyserini:A Python Toolkit for Reproducible Information Retrieval Research with Sparse and Dense Representations [C]//Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.Virtual Event,Canada:Association for Computing Machinery,2021:2356-2362.
[34]CHEN J,XIAO S,ZHANG P,et al.M3-Embedding:Multi-Linguality,Multi-Functionality,Multi-Granularity Text Embeddings Through Self-Knowledge Distillation [C]//Findings of the Association for Computational Linguistics ACL 2024.Bangkok,Thailand and virtual meeting:Association for Computational Linguistics,2024:2318-2335.
[35]MALLEN A,ASAI A,ZHONG V,et al.When Not to Trust Language Models:Investigating Effectiveness of Parametric and Non-Parametric Memories [C]//Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics(Volume 1:Long Papers).Toronto,Canada:Association for Computational Linguistics,2023:9802-9822.
[36]MA X,GONG Y,HE P,et al.Query Rewriting in Retrieval-Augmented Large Language Models [C]//Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing.Singapore:Association for Computational Linguistics,2023:5303-5315.
[37]KIM J,NAM J,MO S,et al.SuRe:Summarizing Retrievals using Answer Candidates for Open-domain QA of LLMs [C]//The Twelfth International Conference on Learning Representations.2024.
[38]SHI W,MIN S,YASUNAGA M,et al.REPLUG:Retrieval-Augmented Black-Box Language Models [C]//Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics:Human Language Technologies(Volume 1:Long Papers).Mexico City,Mexico:Association for Computational Linguistics,2024:8371-8384.