Computer Science ›› 2016, Vol. 43 ›› Issue (6): 184-187. doi: 10.11896/j.issn.1002-137X.2016.06.037

• Software and Database Technology •

Performance Comparison of Join Operations on SIMFS and EXT4

ZHAO Li-wei, CHEN Xian-zhang and ZHUGE Qing-feng

  1. College of Computer Science, Chongqing University, Chongqing 400044; Key Laboratory of Dependable Service Computing in Cyber Physical Society, Ministry of Education, Chongqing 400044
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    The National High Technology Research and Development Program of China ("863" Program) (2013AA013202, 5AA015304) and the National Natural Science Foundation of China (61472052, 4)

Abstract: Join is the most fundamental and most expensive operation in relational database systems, and it has a large impact on database performance. Because the tables being joined are stored in a file system, file system performance largely determines join performance. Measuring join performance across file systems therefore matters for database research, yet few such measurements exist. This paper first compares the data read/write path of SIMFS (Sustainable In-Memory File System), a new in-memory file system, with the I/O path of the disk-based file system EXT4 (Fourth Extended File System). It then presents experiments that measure the effect of the file system on join operations, with test parameters such as the data read/write block size and the I/O block size set for SIMFS and EXT4, respectively. The results show that join operations on SIMFS and EXT4 differ markedly in performance optimization, the effect of block size, the bottleneck limiting further improvement, and hardware constraints. Based on a comparative analysis of the results, the paper offers suggestions for optimizing join operations on new in-memory file systems.

Key words: Join operation, In-memory file system, Disk-based file system, Performance optimization
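
To make the block-size parameter concrete, the following is a minimal sketch of how one might time a join while varying the size of each read request. It is not the paper's benchmark: the comma-separated file format, the function names (`read_rows`, `hash_join`, `time_join`), and the choice of a simple hash join are all assumptions made for illustration. The abstract does not state which join algorithm was measured.

```python
import os
import tempfile
import time


def read_rows(path, block_size):
    """Yield (key, payload) rows, issuing read requests of block_size bytes."""
    leftover = b""
    # buffering=0 gives a raw file object, so each read() is one request.
    with open(path, "rb", buffering=0) as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            lines = (leftover + block).split(b"\n")
            leftover = lines.pop()  # the last piece may be an incomplete row
            for line in lines:
                if line:
                    key, payload = line.split(b",", 1)
                    yield int(key), payload
    if leftover:
        key, payload = leftover.split(b",", 1)
        yield int(key), payload


def hash_join(build_path, probe_path, block_size):
    """Equi-join two files on their first column (illustrative hash join)."""
    table = {}
    for key, payload in read_rows(build_path, block_size):
        table.setdefault(key, []).append(payload)
    result = []
    for key, payload in read_rows(probe_path, block_size):
        for match in table.get(key, []):
            result.append((key, match, payload))
    return result


def write_table(path, rows):
    """Write rows as 'key,payload' lines (hypothetical on-disk format)."""
    with open(path, "wb") as f:
        for key, payload in rows:
            f.write(b"%d,%s\n" % (key, payload))


def time_join(directory, block_sizes, n_rows=1000):
    """Return {block_size: (seconds, result_row_count)} for files in directory."""
    r_path = os.path.join(directory, "r.tbl")
    s_path = os.path.join(directory, "s.tbl")
    write_table(r_path, [(i, b"r%d" % i) for i in range(n_rows)])
    write_table(s_path, [(i % 100, b"s%d" % i) for i in range(n_rows)])
    timings = {}
    for block_size in block_sizes:
        start = time.perf_counter()
        rows = hash_join(r_path, s_path, block_size)
        timings[block_size] = (time.perf_counter() - start, len(rows))
    return timings


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        for size, (seconds, count) in time_join(directory, [512, 4096, 65536]).items():
            print(f"block_size={size}: {count} rows in {seconds:.4f} s")
```

To compare file systems with a sketch like this, `directory` would point at a mount of the file system under test. On a disk-based file system the operating system's page cache can mask differences between block sizes, so a real measurement would need to control for caching.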

