计算机科学 ›› 2025, Vol. 52 ›› Issue (8): 109-117.doi: 10.11896/jsjkx.250100151
余阳, 彭煜玮
YU Yang, PENG Yuwei
摘要: 查询优化是关系型数据库中的关键环节。在传统的查询优化过程中,为了获得较优的执行计划,通常需要对查询中的连接和过滤操作进行基数估计。然而,基数估计存在不准确的问题,导致查询优化效果往往不尽如人意。目前,已有部分研究通过基于机器学习的方法改善基数估计问题并取得了一定进展。尽管这些方法在处理查询中数值类型的过滤谓词时表现较好,但对于其他复杂的过滤谓词效果不佳。为解决这一问题,文中提出了一种基于学习排序的查询优化算法。该算法能够为单一查询智能评估多个执行计划并排序,从而选择最佳计划执行。该查询优化算法通过迭代挖掘较优执行计划,并协同机器学习方法,最终筛选出最优计划。实验结果表明,该算法在常规数据集上的性能优于当前基于学习的查询优化算法,并且在复杂数据集中具有更加显著的优势。
中图分类号:
[1]LEIS V,GUBICHEV A,MIRCHEV A,et al.How Good Are Query Optimizers,Really?[J].Proceedings of the VLDB Endowment,2015,9(3):204-215. [2]WU P,CONG G.A Unified Deep Model of Learning from both Data and Queries for Cardinality Estimation[C]//International Conference on Management of Data,Virtual Event(SIGMOD '21).ACM,2021:2009-2022. [3]WU Z,SHAIKHHA A.BayesCard:A Unified Bayesian Framework for Cardinality Estimation[J].arXiv:2012.14743,2020. [4]YANG Z,KAMSETTY A,LUAN S,et al.NeuroCard:One Cardinality Estimator for All Tables[J].Proceedings of the VLDB Endowment,2020,14(1):61-73. [5]YANG Z,LIANG E,KAMSETTY A,et al.Deep Unsupervised Cardinality Estimation[J].Proceedigns of the VLDB Endowment,2019,13(3):279-292. [6]ZHANG J,ZHANG C,LI G,et al.AutoCE:An Accurate and Efficient Model Advisor for Learned Cardinality Estimation[C]//39th IEEE International Conference on Data Engineering(ICDE 2023).IEEE,2023:2621-2633. [7]ZHU R,WU Z,HAN Y,et al.FLAT:Fast,Lightweight andAccurate Method for Cardinality Estimation[J].Proceedings VLDB Endowment,2021,14(9):1489-1502. [8]MARCUS R,NEGI P,MAO H,et al.Bao:Making LearnedQuery Optimization Practical[C]//SIGMOD.ACM,2021:1275-1288. [9]ZHU R,CHEN W,DING B,et al.LERO:A Learning-to-Rank Query Optimizer[J].Proceedings of the VLDB Endowment,2023,16(6):1466-1479. [10]CHEN X,CHEN H,LIANG Z,et al.LEON:A new framework for ml-aided query optimization[J].Proceedings VLDB Endowment,2023,16:2261-2273. [11]LAN H,BAO Z,PENG Y.A Survey on Advancing the DBMS Query Optimizer:Cardinality Estimation,Cost Model,and Plan Enumeration[J].Data Sci.Eng.,2021,6(1):86-101. [12]OLKEN F,ROTEM D.Random Sampling from Database Files:A Survey[C]//Statistical and Scientific Database Management,5th International Conference SSDBM,Charlotte,NC(Lecture Notes in Computer Science,Vol.420).Springer,1990:92-111. [13]CORMODE G,GAROFALAKIS M N,HAAS P J,et al.Synopses for Massive Data:Samples,Histograms,Wavelets,Sketches[J].Found.Trends Databases,2012,4(1/2/33):1-294. [14]HASAN S,THIRUMURUGANATHAN S,AUGUSTINE J,et al.Deep Learning Models for Selectivity Estimation of Multi-Attribute Queries[C]//Proceedings of the 2020 International Conference on Management of Data,SIGMOD Conference 2020.ACM,2020:1035-1050. [15]HILPRECHT B,SCHMIDT A,KULESSA M,et al.DeepDB:Learn from Data,not from Queries![J].Proceedings of the VLDB Endowment,2020,13(7):992-1005. [16]KIPF A,KIPF T,RADKE B,et al.Learned Cardinalities:Estimating Correlated Joins with Deep Learning[J].arXiv:1809.00677,2018. [17]SUN J,LI G,TANG N.Learned Cardinality Estimation for Similarity Queries[C]//International Conference on Management of Data(SIGMOD '21).ACM,2021:1745-1757. [18]ILYAS I F,BESKALES G,SOLIMAN M A.Probabilistic query optimization in database systems[J].ACM SIGMOD Record,2008,37(3):4-11. [19]ROY D,PANDA P,ROY K.Tree-CNN:A Deep Convolutional Neural Network for Lifelong Learning[J].arXiv:1802.05800,2018. [20]MOU L,LI G,ZHANG L,et al.Convolutional Neural Networks over Tree Structures for Programming Language Processing[C]//AAAI.AAAI Press,2016:1287-1293. [21]MARCUS R,PAPAEMMANOUIL O.Plan-Structured DeepNeural Network Models for Query Performance Prediction[J].Proceedings VLDB Endowment,2019,12(11):1733-1746. |
|