Computer Science ›› 2014, Vol. 41 ›› Issue (6): 142-147. doi: 10.11896/j.issn.1002-137X.2014.06.028

• Software and Database Technology •

PBPP: Pipelined Parallel Processing Based on Passing Buffer in Column-store System

DING Xiang-wu and ZHANG Guang-hui

  1. School of Computer Science and Technology, Donghua University, Shanghai 201620
  • Online: 2018-11-14   Published: 2018-11-14
  • Supported by:
    This work was supported by the National Science and Technology Major Project for Core Electronic Devices, High-end General Chips and Basic Software ("核高基") (2010ZX01042-001-003-004), the National Natural Science Foundation of China (61070031, 61070032), and the Natural Science Foundation of Shanghai (11ZR1401200).

Abstract: Chip multiprocessors (CMPs), with their low power consumption and low cost, have rapidly come to dominate the processor market, and they provide hardware support for multithreading. Column-store technology has significant advantages in analytical applications, where query optimization remains one of the most important problems and where multi-core resources offer considerable potential for improving query processing performance. This paper applies a pipelined multithreaded design to the physical query tree generated by the query executor. Taking the characteristics of column storage into account, it establishes a passing block buffer that the main thread and a worker thread read and write respectively, so that the parent and child nodes of the physical execution tree execute in parallel. The classic producer-consumer pattern from operating systems is used to synchronize the threads. These methods are applied in DWMS, a column-store system developed in the authors' laboratory, and their effectiveness is verified with the data warehouse benchmark SSB. The experimental results show that the passing block buffer design improves query performance by nearly 50% for some typical complex queries.

Key words: Multithread, Multicore, Column-store, Passing block buffer, Parallel processing

CLC number: TP311   Document code: A
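
The producer-consumer hand-off described in the abstract can be illustrated with a minimal sketch. This is not the DWMS implementation: the names `child_scan`, `parent_sum`, `run_pipeline`, `BLOCK_SIZE` and `BUFFER_SLOTS` are hypothetical, and Python's standard bounded `queue.Queue` stands in for the passing block buffer. A worker thread plays the child node and writes blocks of column values; the main thread plays the parent node and reads them.

```python
import queue
import threading

BLOCK_SIZE = 4      # values per passing block (illustrative assumption)
BUFFER_SLOTS = 2    # blocks the buffer holds before the producer must wait
_DONE = object()    # sentinel marking the end of the child's output


def child_scan(column, predicate, buffer):
    """Worker thread (producer): filter a column and pass results on block by block."""
    block = []
    for value in column:
        if predicate(value):
            block.append(value)
            if len(block) == BLOCK_SIZE:
                buffer.put(block)   # waits while the buffer is full
                block = []
    if block:
        buffer.put(block)           # flush the final partial block
    buffer.put(_DONE)


def parent_sum(buffer):
    """Main thread (consumer): aggregate blocks as they arrive."""
    total = 0
    while True:
        block = buffer.get()        # waits while the buffer is empty
        if block is _DONE:
            return total
        total += sum(block)


def run_pipeline(column, predicate):
    """Run the child in a worker thread and the parent in the calling thread."""
    buffer = queue.Queue(maxsize=BUFFER_SLOTS)
    worker = threading.Thread(target=child_scan, args=(column, predicate, buffer))
    worker.start()
    total = parent_sum(buffer)
    worker.join()
    return total
```

For example, `run_pipeline(list(range(100)), lambda v: v % 2 == 0)` sums the even values of a column while the scan is still producing later blocks. The sketch only shows the synchronization pattern; it makes no claim about the speedup reported in the paper.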

