Computer Science, 2025, Vol. 52, Issue (6): 324-329. doi: 10.11896/jsjkx.240800017

• Artificial Intelligence •


Research on Hybrid Retrieval-augmented Dual-tower Model

GAO Hongkui, MA Ruixiang, BAO Qihao, XIA Shaojie, QU Chongxiao   

  1. The 52nd Research Institute of China Electronics Technology Group Corporation, Hangzhou 311100, China
  • Received: 2024-08-05  Revised: 2024-09-26  Online: 2025-06-15  Published: 2025-06-11
  • Corresponding author: GAO Hongkui (ghk_fwy@126.com)
  • About author: GAO Hongkui, born in 1988, postgraduate. His main research interests include large-scale models in decision-making fields and technologies for intelligent gaming and planning.


Abstract: At the forefront of knowledge retrieval, particularly in scenarios involving large language models (LLMs), research emphasis has shifted toward pure vector retrieval techniques for efficiently capturing pertinent information, which is then fed into an LLM for comprehensive distillation and summarization. However, this approach is limited: vector representations alone may fail to capture the full complexity of retrieval, and, coupled with the absence of an effective ranking mechanism, this often leads to an overabundance of irrelevant information, diluting the alignment between the final response and the user's actual needs. To address this problem, this paper introduces a hybrid retrieval-augmented dual-tower model. The model integrates a multi-path recall strategy whose complementary recall mechanisms ensure that retrieval results are both comprehensive and highly relevant. Architecturally, it adopts a dual-layer structure that combines bidirectional recurrent neural networks with text convolutional neural networks, allowing the model to perform multi-level ranking optimization on retrieval results and significantly improving both overall relevance and the precision of top-ranked outcomes. The efficiently ranked, high-quality information is then combined with the original query and fed into a large language model, exploiting its deep analytical capabilities to generate more accurate and credible responses. Experimental results confirm that the proposed method improves retrieval accuracy and overall system performance, markedly enhancing the precision and practicality of large language models in real-world applications.
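To make the multi-path recall idea concrete, below is a minimal sketch assuming one dense-vector path and one lexical path (a simple term-overlap stand-in for BM25-style scoring), with candidate lists merged by reciprocal rank fusion. The fusion rule, embeddings, corpus, and parameter values are illustrative assumptions for this sketch, not the paper's reported configuration.

```python
# Minimal multi-path recall sketch: a dense path and a lexical path are
# queried independently, then merged with reciprocal rank fusion (RRF).
# All data here are random/toy stand-ins (illustrative assumptions).
import numpy as np

def dense_recall(query_vec, doc_vecs, top_k=10):
    """Rank documents by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q
    return list(np.argsort(-scores)[:top_k])

def lexical_recall(query_tokens, doc_tokens, top_k=10):
    """Rank documents by term overlap (a stand-in for BM25 scoring)."""
    scores = np.array([len(set(query_tokens) & set(doc)) for doc in doc_tokens])
    return list(np.argsort(-scores)[:top_k])

def reciprocal_rank_fusion(rankings, k=60, top_k=10):
    """Merge ranked lists: score(d) = sum over lists of 1 / (k + rank)."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(fused, key=fused.get, reverse=True)[:top_k]

# Toy example: 5 documents with 4-dim embeddings and token lists.
rng = np.random.default_rng(0)
doc_vecs = rng.normal(size=(5, 4))
query_vec = rng.normal(size=4)
docs = [["hybrid", "retrieval"], ["dual", "tower"], ["bm25"], ["vector"], ["llm"]]
dense = dense_recall(query_vec, doc_vecs)
lexical = lexical_recall(["hybrid", "bm25"], docs)
print(reciprocal_rank_fusion([dense, lexical], top_k=5))
```

Because the two paths fail in different ways (dense retrieval misses rare exact terms, lexical retrieval misses paraphrases), fusing their rankings is one common way to realize the complementary recall the abstract describes.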

Key words: Knowledge search, Large language models, Vector retrieval technology, Hybrid retrieval-augmented dual-tower model, Multi-path recall strategy
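The dual-layer ranking structure outlined in the abstract, a bidirectional recurrent encoder followed by a text-CNN scoring head, might be sketched as follows. The concatenated query-passage input, vocabulary size, dimensions, and kernel sizes are all illustrative assumptions, not the authors' reported settings.

```python
# Sketch of a dual-layer reranker: a BiLSTM encodes the (query, passage)
# token sequence, and a TextCNN head (Kim, 2014 style) over the LSTM
# states produces a relevance score. Hyperparameters are assumptions.
import torch
import torch.nn as nn

class DualLayerRanker(nn.Module):
    def __init__(self, vocab_size=30000, emb_dim=128, hidden=128,
                 n_filters=64, kernel_sizes=(2, 3, 4)):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Layer 1: bidirectional recurrent encoder.
        self.birnn = nn.LSTM(emb_dim, hidden, batch_first=True,
                             bidirectional=True)
        # Layer 2: convolutions of several widths over the BiLSTM states.
        self.convs = nn.ModuleList(
            [nn.Conv1d(2 * hidden, n_filters, k) for k in kernel_sizes])
        self.score = nn.Linear(n_filters * len(kernel_sizes), 1)

    def forward(self, token_ids):
        x = self.embed(token_ids)             # (B, T, emb_dim)
        h, _ = self.birnn(x)                  # (B, T, 2*hidden)
        h = h.transpose(1, 2)                 # (B, 2*hidden, T) for Conv1d
        feats = [torch.relu(c(h)).max(dim=2).values for c in self.convs]
        return self.score(torch.cat(feats, dim=1)).squeeze(-1)  # (B,)

# Usage: score a batch of query-passage pairs encoded as token ids.
ranker = DualLayerRanker()
batch = torch.randint(0, 30000, (4, 50))      # 4 pairs, 50 tokens each
print(ranker(batch))                          # 4 relevance scores
```

In a full pipeline, the passages surviving multi-path recall would be scored this way, and the top-scoring ones concatenated with the original query into the LLM prompt, matching the generation step the abstract describes.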

CLC number: TP391