Computer Science ›› 2018, Vol. 45 ›› Issue (3): 171-177.doi: 10.11896/j.issn.1002-137X.2018.03.027

Previous Articles     Next Articles

In-memory B+tree Construction Methodology for Big Data Stream

YANG Liang-huai, XIANG Jun-jian, XU Wei and FAN Yu-lei   

  • Online:2018-03-15 Published:2018-11-13

Abstract: This paper investigated into the issues of indexing on data stream with time dimension in near real-time.By resorting to 2-tier B+tree index,this paper invented a highly effective in-memory B+tree construction method for scenarios with real-time query requirements,which separates as many parallelizing operations as possible.This paper parallelized the operations of sorting and data receiving by dividing the time-window into equal-duration slice,and parallelized the construction of B+tree skeleton with sorting.This paper avoided unnecessary locking and synchronizing cost by adopting sorting-based bulk loading techniques and optimized constructing sequence.The proposed in-memory B+tree construction algorithm called MBSortSBLoad can build B+tree quickly and accept higher data arriving rates.Extensive experiments demonstrate the effectiveness of the proposed methods.

Key words: B+tree,Data stream,In-memory index,Big data

[1] BABCOCK B,BABU S,DATAR M, et al.Models and issues in data streams[C]∥Proceedings of PODS Symposium.2002:1-16.
[2] ARASU A,BABCOCK B,BABU S,et al.Stream:The Stanford data stream management system[M].Springer:Data Stream Management,2016:317-336.
[3] CHANDRASEKARAN S,COOPER O,DESHPANDE A,et al.TelegraphCQ:Continuous dataflow processing[C]∥ Acm Sigmod International Conference on Management of Data.2003:668.
[4] CARNEY D,CETINTERNEL U,CHERNIACK M,et al.Monitoring streams-a new class of DBMS applications[C]∥Proceedings of VLDB Conference.2002:215-226.
[5] ZHANG D D,LI J Z,WANG W P,et al.Algorithms for Storing and Aggregating Historical Streaming Data[J].Journal of Software,2005,16(12):2089-2098.(in Chinese) 张冬冬,李建中,王伟平,等.数据流历史数据的存储与聚集查询处理算法[J].软件学报,2005,16(12):2089-2098.
[6] FUSCO F,VLACHOS M,STOECKLIN M P.Real-time crea-tion of bitmap indexes on streaming network data[J].The VLDB Journal,2012,21(3):287-307.
[7] PU K Q,ZHU Y.Efficient indexing of heterogeneous datastreams with automatic performance configurations[C]∥Proceedings of Scientific and Statistical Database Management Conference.2007:34-34.
[8] JIN C Q,QIAN W N,ZHOU A Y.Analysis and Management of Streaming Data:A Survey[J].Journal of Software,2004,15(8):1172-1179.(in Chinese) 金澈清,钱卫宁,周傲英.流数据分析与管理综述[J].软件学报,2004,15(8):1172-1179.
[9] LU H,NG Y Y,TIAN Z.T-tree or B-tree:Main memory database index structure revisited[C]∥Proceedings of Australian Database Conference.2000:65-73.
[10] BERCKEN J,SEEGER B.An evaluation of generic bulk loading techniques[C]∥Proceedings of VLDB Conference.2001:461-470.
[11] CIACCIA P,PATELLA M.Bulk loading the M-tree[C]∥Proceedings of Australasian Database Conference.1998:15-26.
[12] LO M L,RAVISHANKAR C V.The design and implementation of seeded trees:An efficient method for spatial joins[J].IEEE TKDE,1998,10(1):136-152.
[13] ARGE L,HINRICHS K H,et al.Efficient bulk operations on dynamic R-trees[J].Algorithmica,2002,33(1):104-128.
[14] JERMAINE C,DATTA A,O MIECINSKI E.A novel index supporting high volume data warehouse insertion[C]∥Proceedings of VLDB Conference.1999:235-246.
[15] KHOLGHI M,KEYVANPOUR M R.Comparative evaluation of data stream indexing models[J].International Journal of Machine Learning and Computing,2012,2(3):257-260.
[16] SHIVAKUMAR N,GARCIA-MOLINA H.Wave-indices:indexing evolving databases[C]∥Proceedings of SIGMOD Confe-rence.1997:381-392.

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!