计算机科学 ›› 2022, Vol. 49 ›› Issue (8): 49-55.doi: 10.11896/jsjkx.210700074

• 数据库&大数据&数据科学* 上一篇    下一篇

基于物理操作级模型的查询执行时间预测方法

王润安, 邹兆年   

  1. 哈尔滨工业大学计算机科学与技术学院 哈尔滨 150001
  • 收稿日期:2021-07-07 修回日期:2021-12-09 发布日期:2022-08-02
  • 通讯作者: 邹兆年(znzou@hit.edu.cn)
  • 作者简介:(20s003041@stu.hit.edu.cn)
  • 基金资助:
    国家自然科学基金(62072138);之江实验室开放课题(2021KC0AB02)

Query Performance Prediction Based on Physical Operation-level Models

WANG Run-an, ZOU Zhao-nian   

  1. School of Computer Science and Technology,Harbin Institute of Technology,Harbin 150001,China
  • Received:2021-07-07 Revised:2021-12-09 Published:2022-08-02
  • About author:WANG Run-an,born in 1998,postgra-duate,is a student member of China Computer Federation.His main research interests include database systems and so on.
    ZOU Zhao-nian,born in 1979,Ph.D,professor,Ph.D supervisor,is a member of China Computer Federation.His main research interests include database systems and big data analysis.
  • Supported by:
    National Natural Science Foundation of China(62072138) and Open Research Projects of Zhejiang Lab(2021KC0AB02).

摘要: 查询执行时间预测(Query Performance Prediction,QPP)是数据库系统中一个重要的研究问题。当数据库系统中存在并发执行的事务时,现有的QPP方法无法在不改变数据库查询性能的前提下建立准确的QPP模型。为此,提出了一种基于物理操作的查询执行时间预测新方法,该方法以查询的物理操作为单位建立单元预测模型,根据查询计划将单元预测模型组合为完整的QPP模型,把能够刻画数据库系统并发状态的统计信息纳入模型的输入特征。所提方法只须使用DBMS提供的基本手段即可获取构建模型所需的数据库统计信息,无须改变DBMS,也不会影响数据库系统上原有工作负载的执行。实验结果表明,所提方法无论在OLTP还是OLAP应用中,在不同的查询计划和并发度下的预测准确性均高于其他对比方法。

关键词: 查询计划, 查询执行时间预测(QPP), 神经网络, 数据库系统状态, 物理操作

Abstract: Query performance prediction (QPP) is an important issue in database systems.When there are concurrent transactions in a database system,the existing methods fail to establish an accurate model without changing query performance.In this paper,a new method is proposed to solve the QPP problem.The proposed method builds unit prediction models for various physical operations in the query and combines the unit models into a complete QPP model according to the query plan.It can describe the concurrency state of the database system by taking the statistical information as features.The proposed method only needs to use the basic means provided by the DBMS to obtain the database statistics required to build the model,without changing the DBMS or affecting the execution of the original workloads on the database system.We evaluate our techniques on various workloads including OLTP and OLAP.Experiments show that the proposed method outperforms the state-of-art QPP methods regardless of different query plans or different concurrency.

Key words: Database system status, Neural network, Physical operation, Query performance prediction (QPP), Query plan

中图分类号: 

  • TP392
[1]ZOLAKTAF Z,MILANI M,POTTINGER R.Facilitating SQL Query Composition and Analysis[C]//Proceedings of the ACM SIGMOD International Conference on Management of Data.Association for Computing Machinery,2020:209-224.
[2]HUAWEI.openGauss Database[EB/OL].https://opengauss.org/zh/.
[3]BABCOCK B,CHAUDHURI S.Towards a robust query optimizer:A principled and practical approach[C]//Proceedings of the ACM SIGMOD International Conference on Management of Data.2005:119-130.
[4]MARCUS R,PAPAEMMANOUIL O.Plan-structured deepneural network models for query performance prediction[C]//Proceedings of the VLDB Endowment.2019:1733-1746.
[5]DUGGAN J,PAPAEMMANOUIL O,UPFAL E.Performance Prediction for Concurrent Database Workloads[C]//Procee-dings of the ACM SIGMOD International Conference on Management of Data.2011:337-348.
[6]ZHOU X,SUN J,LI G,et al.Query performance prediction for concurrent queries using graph embedding[J].Proceedings of the VLDB Endowment,2020,13(9):1416-1428.
[7]WU W,CHI Y,ZHU S,et al.Predicting query execution time:Are optimizer cost models really unusable?[C]//International Conference on Data Engineering.2013:1081-1092.
[8]XU X,LIU C,SONG D.SQLNet:Generating Structured Queries from Natural Language Without Reinforcement Learning[J].arXiv:1711.04436,2017.
[9]AKDERE M,ÇETINTEMEL U,RIONDATO M,et al.Lear-ning-based query performance modeling and prediction[C]//International Conference on Data Engineering.2012:390-401.
[10]WU W,CHI Y,HACIGÜMÜS H,et al.Towards predictingquery execution time for concurrent and dynamic database workloads[J].Proceedings of the VLDB Endowment,2013,6(10):925-936.
[11]ABOULNAGA A,CHAUDHURI S.Self-tuning histograms[J].ACM SIGMOD Record,1999,28(2):181-192.
[12]KIPF A,KIPF T,RADKE B,et al.Learned Cardinalities:Estimating Correlated Joins with Deep Learning[C]//9th Biennial Conference on Innovative Data Systems Research.Asilomar,CA,USA,2019:13-16.
[13]TAN J,ZHANG T,LI F,et al.iBTune:Individualized Buffer Tuning for Large-scale Cloud Databases[J].Proceedings of the VLDB Endowment,2019,12(10):1221-1234.
[14]ZHANG J,LIU Y,ZHOU K,et al.An end-to-end automatic cloud database tuning system using deep reinforcement learning[C]//Proceedings of the ACM SIGMOD International Confe-rence on Management of Data.Association for Computing Machinery.2019:415-432.
[15]MARCUS R,PAPAEMMANOUIL O.Deep ReinforcementLearning for Join Order Enumeration[C]//Proceedings of the First International Workshop on Exploiting Artificial Intelligence Techniques for Data Management.New York,USA:ACM,2018:1-4.
[16]POSTGRESQ L.Postgres monitoring stats[EB/OL].https://www.postgresql.org/docs/13/monitoring-stats.html.
[17]LEIS V,GUBICHEV A,MIRCHEV A,et al.How Good Are Query Optimizers,Really?[J].Proceedings of the VLDB Endowment,2016,9(3):204-215.
[1] 周芳泉, 成卫青.
基于全局增强图神经网络的序列推荐
Sequence Recommendation Based on Global Enhanced Graph Neural Network
计算机科学, 2022, 49(9): 55-63. https://doi.org/10.11896/jsjkx.210700085
[2] 周乐员, 张剑华, 袁甜甜, 陈胜勇.
多层注意力机制融合的序列到序列中国连续手语识别和翻译
Sequence-to-Sequence Chinese Continuous Sign Language Recognition and Translation with Multi- layer Attention Mechanism Fusion
计算机科学, 2022, 49(9): 155-161. https://doi.org/10.11896/jsjkx.210800026
[3] 宁晗阳, 马苗, 杨波, 刘士昌.
密码学智能化研究进展与分析
Research Progress and Analysis on Intelligent Cryptology
计算机科学, 2022, 49(9): 288-296. https://doi.org/10.11896/jsjkx.220300053
[4] 陈泳全, 姜瑛.
基于卷积神经网络的APP用户行为分析方法
Analysis Method of APP User Behavior Based on Convolutional Neural Network
计算机科学, 2022, 49(8): 78-85. https://doi.org/10.11896/jsjkx.210700121
[5] 朱承璋, 黄嘉儿, 肖亚龙, 王晗, 邹北骥.
基于注意力机制的医学影像深度哈希检索算法
Deep Hash Retrieval Algorithm for Medical Images Based on Attention Mechanism
计算机科学, 2022, 49(8): 113-119. https://doi.org/10.11896/jsjkx.210700153
[6] 檀莹莹, 王俊丽, 张超波.
基于图卷积神经网络的文本分类方法研究综述
Review of Text Classification Methods Based on Graph Convolutional Network
计算机科学, 2022, 49(8): 205-216. https://doi.org/10.11896/jsjkx.210800064
[7] 闫佳丹, 贾彩燕.
基于双图神经网络信息融合的文本分类方法
Text Classification Method Based on Information Fusion of Dual-graph Neural Network
计算机科学, 2022, 49(8): 230-236. https://doi.org/10.11896/jsjkx.210600042
[8] 李宗民, 张玉鹏, 刘玉杰, 李华.
基于可变形图卷积的点云表征学习
Deformable Graph Convolutional Networks Based Point Cloud Representation Learning
计算机科学, 2022, 49(8): 273-278. https://doi.org/10.11896/jsjkx.210900023
[9] 郝志荣, 陈龙, 黄嘉成.
面向文本分类的类别区分式通用对抗攻击方法
Class Discriminative Universal Adversarial Attack for Text Classification
计算机科学, 2022, 49(8): 323-329. https://doi.org/10.11896/jsjkx.220200077
[10] 金方焱, 王秀利.
融合RACNN和BiLSTM的金融领域事件隐式因果关系抽取
Implicit Causality Extraction of Financial Events Integrating RACNN and BiLSTM
计算机科学, 2022, 49(7): 179-186. https://doi.org/10.11896/jsjkx.210500190
[11] 彭双, 伍江江, 陈浩, 杜春, 李军.
基于注意力神经网络的对地观测卫星星上自主任务规划方法
Satellite Onboard Observation Task Planning Based on Attention Neural Network
计算机科学, 2022, 49(7): 242-247. https://doi.org/10.11896/jsjkx.210500093
[12] 费星瑞, 谢逸.
基于HMM-NN的用户点击流识别
Click Streams Recognition for Web Users Based on HMM-NN
计算机科学, 2022, 49(7): 340-349. https://doi.org/10.11896/jsjkx.210600127
[13] 赵冬梅, 吴亚星, 张红斌.
基于IPSO-BiLSTM的网络安全态势预测
Network Security Situation Prediction Based on IPSO-BiLSTM
计算机科学, 2022, 49(7): 357-362. https://doi.org/10.11896/jsjkx.210900103
[14] 齐秀秀, 王佳昊, 李文雄, 周帆.
基于概率元学习的矩阵补全预测融合算法
Fusion Algorithm for Matrix Completion Prediction Based on Probabilistic Meta-learning
计算机科学, 2022, 49(7): 18-24. https://doi.org/10.11896/jsjkx.210600126
[15] 杨炳新, 郭艳蓉, 郝世杰, 洪日昌.
基于数据增广和模型集成策略的图神经网络在抑郁症识别上的应用
Application of Graph Neural Network Based on Data Augmentation and Model Ensemble in Depression Recognition
计算机科学, 2022, 49(7): 57-63. https://doi.org/10.11896/jsjkx.210800070
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!