Computer Science (计算机科学) ›› 2022, Vol. 49 ›› Issue (4): 110-115. doi: 10.11896/jsjkx.210200173
Special topic: Big Data & Data Science (virtual special issue)
CAO He-xin (曹合心)1, ZHAO Liang (赵亮)2, LI Xue-feng (李雪峰)1
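Abstract: The Text-to-SQL task in semantic parsing is central to automatic question answering over databases. Existing deep learning models, such as Seq2Seq sequence-generation models, achieve strong results on single-table SQL queries but cannot handle multi-table SQL queries. Graph neural networks can effectively extract the associations between database tables and the question, enriching the semantic information available during parsing and thereby improving the accuracy of multi-table SQL queries. This paper proposes an adaptive graph construction method and graph encoding method that introduce question information into existing Text-to-SQL models: a convolution over the concatenated word vectors of the question and the database generates the initial weights of the graph network, so that different databases of the same type can be trained in a unified way. The overall model is built on the IRNet framework with relation augmentation and is validated on Spider, a currently open Text-to-SQL dataset. The results show that the approach effectively improves the matching accuracy of generated multi-table SQL statements, and the algorithm offers a useful reference for research on graph neural networks in the Text-to-SQL field.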
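The abstract gives no details of the weight-generation step, so the following is only a minimal sketch of one way the idea could look, not the authors' implementation. It assumes a mean-pooled question vector, a single 1-D convolution kernel, max pooling with a sigmoid to score each schema node, and an edge weight formed as the product of the two endpoint scores. The function names (`conv1d_valid`, `mean_pool`, `init_edge_weights`) and all numeric values are hypothetical.

```python
import math


def conv1d_valid(x, kernel):
    """Slide `kernel` over the 1-D list `x` with no padding."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k))
            for i in range(len(x) - k + 1)]


def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def init_edge_weights(question_vecs, schema_vecs, adjacency, kernel):
    """Assumed scheme: question-aware initial weights for schema-graph edges.

    question_vecs: word vectors of the question tokens
    schema_vecs:   one vector per schema node (table or column)
    adjacency:     0/1 matrix marking which schema nodes are linked
    kernel:        1-D convolution kernel (learned in a real model)
    """
    q = mean_pool(question_vecs)
    scores = []
    for node in schema_vecs:
        joint = q + node                      # concatenate question and node vectors
        feat = conv1d_valid(joint, kernel)    # convolve over the joint vector
        scores.append(1.0 / (1.0 + math.exp(-max(feat))))  # max-pool, then sigmoid
    n = len(schema_vecs)
    # Only existing schema edges receive a nonzero weight.
    return [[adjacency[i][j] * scores[i] * scores[j] for j in range(n)]
            for i in range(n)]


# Toy inputs (made up for illustration): 2 question tokens, 3 schema nodes.
question_vecs = [[0.1, 0.3], [0.5, -0.1]]
schema_vecs = [[0.2, 0.4], [-0.3, 0.1], [0.0, 0.6]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
kernel = [0.5, -0.25, 0.5]

weights = init_edge_weights(question_vecs, schema_vecs, adjacency, kernel)
```

Because the weights depend only on the question and schema vectors rather than on a per-database parameter table, the same kernel can be shared across different databases, which is consistent with the unified training the abstract describes.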