Computer Science (计算机科学), 2026, Vol. 53, Issue (3): 52-63. doi: 10.11896/jsjkx.250700096
徐嘉雯1, 郑云贵1, 周伟1, 徐尧强2, 胡卉芪1, 周烜1
XU Jiawen1, ZHENG Yungui1, ZHOU Wei1, XU Yaoqiang2, HU Huiqi1, ZHOU Xuan1
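Abstract: As large language model (LLM) technology has matured, natural-language database interaction systems such as Chat2DB and ChatExcel have come into wide use. However, existing systems generally rely on a "precise query" assumption and struggle with the fuzzy requirements that are common in practice, where users only work out what they want to query while interacting with the system. To address this challenge, this paper proposes SQL-MARS (SQL-oriented Multi-Agent Recommender System). The system is built on a multi-agent collaborative framework with a closed-loop "perceive-act-evaluate" mechanism, which dynamically identifies and adaptively handles fuzzy database query requirements. It introduces a three-layer metadata architecture that models user requirements so that fuzziness can be perceived. On this basis, the system provides a data navigation function that recommends query suggestions at different granularities according to the user's fuzzy requirements, progressively guiding the user to clarify them. The system also introduces a mechanism for fusing external materials with local data, so that valuable information in external sources is put to use. In addition, a fuzzy-requirement dataset, Bird-fuzzy, is constructed, and automated evaluation is implemented. Experimental results show that SQL-MARS can identify fuzzy requirements and effectively guide users to clarify their data needs.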