Computer Science ›› 2026, Vol. 53 ›› Issue (3): 52-63. doi: 10.11896/jsjkx.250700096

• Intelligent Information Systems Based on AGI Technology •

SQL-MARS: Text-to-SQL Structured Data Recommendation System for Ambiguous User Requirements

XU Jiawen1, ZHENG Yungui1, ZHOU Wei1, XU Yaoqiang2, HU Huiqi1, ZHOU Xuan1

  1 School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
  2 East China Branch of State Grid Corporation of China, Shanghai 200120, China
  • Received: 2025-07-15 Revised: 2025-10-28 Online: 2026-03-12
  • Corresponding author: HU Huiqi (hqhu@dase.ecnu.edu.cn)
  • About author: (51275903078@stu.ecn) XU Jiawen, born in 2001, postgraduate. Her main research interest is LLM-powered intelligent database management.
    HU Huiqi, born in 1988, Ph.D, professor, Ph.D supervisor. His main research interests include next-generation data management systems and intelligent systems.
  • Supported by:
    National Natural Science Foundation of China (92270202).

Abstract: As large language model (LLM) technology has matured, natural-language database interaction systems (e.g., Chat2DB, ChatExcel) have been widely adopted. However, existing systems generally rely on a "precise query" assumption and struggle with the ambiguous requirements that are common in real-world scenarios, where users work out what they want to query only while interacting with the system. To address this challenge, this paper proposes SQL-MARS (SQL-oriented Multi-Agent Recommender System), a multi-agent collaborative framework built on a "perception-action-evaluation" closed-loop mechanism that dynamically identifies and adaptively handles ambiguous database query requirements. The system introduces a three-layer metadata architecture to model user requirements and thereby detect ambiguity. On this basis, it implements a data navigation function that recommends query suggestions at different granularities according to the user's ambiguous requirements, progressively guiding the user to clarify the query. The system also proposes a mechanism for fusing external materials with local data, so that valuable information in external sources is fully used. In addition, the authors create Bird-fuzzy, a dataset of ambiguous requirements, and implement automated evaluation of the system. Experimental results show that SQL-MARS can identify ambiguous requirements and effectively guide users to clarify their data needs.

Key words: Ambiguous requirements, Multi-agent collaboration, Hierarchical metadata, Data navigation, External knowledge fusion

CLC Number:

  • TP311