Computer Science ›› 2015, Vol. 42 ›› Issue (12): 130-135.

• 13th National Conference on Software and Applications •

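Real-time Data Warehouse Pre-processing Based on Dynamic Mirror Replication

MAO Ying-chi, MIN Wei, JIE Qing and ZHU Li-li

  1. College of Computer and Information, Hohai University, Nanjing 211100; Huai'an Research Institute, Hohai University, Huai'an 223001, College of Computer and Information, Hohai University, Nanjing 211100, College of Computer and Information, Hohai University, Nanjing 211100, College of Computer and Information, Hohai University, Nanjing 211100
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61272543, U1301252), the National Science and Technology Support Program (2013BAB06B04), the Science and Technology Project of China Huaneng Group Headquarters (HNKJ13-H17-04), the Yunnan Science and Technology Program (2014GA007), and the Fundamental Research Funds for the Central Universities (2015B22214).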

Abstract: Real-time data warehousing is an important branch of data warehouse technology, and the query contention caused by concurrent real-time data queries and real-time data imports has long been a focus of research in this area. Query contention seriously affects the accuracy and efficiency of query analysis and degrades the performance of the data warehouse. This paper proposes building a dynamic storage area outside the data warehouse and applies dynamic mirror replication to alleviate the query contention problem. To improve the performance of query and analysis operations in real-time OLAP, it also proposes a fly-weight materialization method and a table join algorithm under fly-weight materialization, FWMJoin (Fly-Weight Materialization Join). Using a real-time data warehouse test system based on the TPC-H benchmark, the OLAP performance of the dynamic storage area under dynamic mirror replication is analyzed and evaluated. The experimental results demonstrate that the proposed solution achieves better performance in terms of effectiveness.

Key words: Query contention, Dynamic mirror replication, Real-time data warehouse, OLAP

