Computer Science, 2016, Vol. 43, Issue (7): 197-202. doi: 10.11896/j.issn.1002-137X.2016.07.036

• Software and Database Technology •

Efficient and Dynamic Data Management System for Cassandra Database

WANG Bo-qian, YU Qi, LIU Xin, SHEN Li, WANG Zhi-ying and CHEN Wei

  1. College of Computer, National University of Defense Technology, Changsha 410073
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61472431, 61202121) and the Ministry of Education's Doctoral Program Fund for New Teachers in Higher Education (20114307120013).

Abstract: Cassandra is a widely used database and is listed by Apache as a top-level project. In the Cassandra distributed database system, a large number of write requests produces many scattered SSTable structures and high data redundancy, which slows the response to user read requests. This problem can be addressed either by the partial data merge (consolidation) that the system triggers automatically or by the full data merge that is started manually. However, a badly timed automatic partial merge seriously degrades the performance of read operations that users are executing, while a lengthy manual full merge occupies system resources for a long time and severely restricts overall system performance. To solve this problem, we present a dynamic data management mechanism for the Cassandra database. First, the mechanism monitors the system environment in real time, manages data in tiers and levels according to write time and size, and defines separate execution strategies for the timing of a merge, the files that take part in it, and the merge process itself. Second, it shortens the merge time through specific optimizations, which reduces the impact of merging on system performance. Test results show that the mechanism optimizes the merge process of the Cassandra database and improves the overall read response speed of the system.

Key words: Cassandra database, Dynamic data management, Consolidation strategy, Response speed for read requests
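
The tiering and timing ideas summarized in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the `SSTable` record, the tier step sizes, the read-load threshold, and the minimum group size are all hypothetical values chosen for illustration, and nothing here calls a real Cassandra API. The sketch groups tables by size and age, waits while the monitored read load is high, and otherwise picks the ready tier with the least data so that the merge stays short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTable:
    """Hypothetical stand-in for an on-disk SSTable (not a Cassandra API)."""
    name: str
    size_mb: float
    age_s: float  # seconds since the table was written


def tier_of(table, size_step_mb=64.0, age_step_s=3600.0):
    """Assign a (size level, age level) tier; both step sizes are assumed."""
    return (int(table.size_mb // size_step_mb), int(table.age_s // age_step_s))


def pick_merge_candidates(tables, read_load, max_read_load=0.5, min_group=4):
    """Return the SSTables to merge now, or [] if merging should wait.

    read_load is a monitored value in [0, 1]; merging is deferred while it
    exceeds max_read_load so that the merge does not compete with reads.
    """
    if read_load > max_read_load:
        return []
    tiers = {}
    for table in tables:
        tiers.setdefault(tier_of(table), []).append(table)
    # Only tiers holding at least min_group tables are worth merging.
    ready = [group for group in tiers.values() if len(group) >= min_group]
    if not ready:
        return []
    # Prefer the tier with the least total data, which keeps the merge short.
    return min(ready, key=lambda group: sum(t.size_mb for t in group))


# Four small, recent tables share one tier; one large, old table sits alone.
tables = [SSTable(f"t{i}", 10.0 + i, 100.0 * i) for i in range(4)]
tables.append(SSTable("big", 500.0, 90000.0))

print([t.name for t in pick_merge_candidates(tables, read_load=0.2)])
print(pick_merge_candidates(tables, read_load=0.9))
```

With a low read load the four small tables are selected together, and with a high read load the function returns an empty list, which models postponing the merge until reads subside.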

