计算机科学 ›› 2015, Vol. 42 ›› Issue (8): 225-230.

• 软件与数据库技术 • 上一篇    下一篇

REPS:一种高效的容错并行概率流Skyline查询方法

张卫华,李小勇,马俊,余杰   

  1. 国防科技大学计算机学院 长沙410073,国防科技大学计算机学院 长沙410073,国防科技大学计算机学院 长沙410073,国防科技大学计算机学院 长沙410073
  • 出版日期:2018-11-14 发布日期:2018-11-14
  • 基金资助:
    本文受国家自然科学基金项目(61303191,5),国家重点基础研究发展规划(973)项目(2011CB302601)资助

REPS:An Efficient Fault-tolerant Approach for Parallel Skyline Queries over Probabilistic Data Streams

ZHANG Wei-hua, LI Xiao-yong, MA Jun and YU Jie   

  • Online:2018-11-14 Published:2018-11-14

摘要: 概率数据流的并行Skyline查询作为当前大数据分析的一个重要方面,在诸多实际应用中发挥着重要作用。针对并行概率流Skyline查询过程中因发生故障而导致查询结果不准确和查询中断等问题,提出了一种基于复制的容错并行Skyline查询方法REPS。该方法选择参与并行处理的计算节点作为副本节点,并采用层次-循环式数据副本放置策略,选择优先级高的副本恢复数据来保证数据恢复的高效性;同时将故障检测、丢失数据恢复和查询过程恢复贯穿于整个查询更新过程中,以减少容错处理的额外通信和计算开销,并实现快速的容错并行查询。实验结果表明,REPS方法不仅在无故障发生和单个节点失效时具有较高的查询处理效率,而且对于多节点失效情形,仍然能够保持较高的查询处理速率且满足查询需求。

关键词: 概率Skyline,容错查询,数据流,并行查询,大数据

Abstract: The parallel Skyline query over probabilistic data streams,as an important aspect of big data analysis,plays an important role in various applications.To deal with the problem that the query results may be incorrect and interrupted,due to various faults occurred in the process of parallel Skyline queries over probabilistic streams,a replication-based fault-tolerant distributed parallel Skyline query scheme named REPS was proposed.Specifically,the compute nodes are also regarded as the replication nodes,and a layer-alternation strategy for replica placement is proposed in REPS.Thus,the lost data can be recovered efficiently by selecting the replicas with high priority.Moreover,the processes of fault detection,data recovery and query recovery are throughout the whole query updating process,in order to reduce the communication and computation overhead and achieve rapid fault-tolerant parallel query processing.Extensive experimental results demonstrate that REPS method not only has high efficiency when no failure occurs or a single node fails,but also can maintain a high processing rate and meet the query requirement even when multiple failures occur.

Key words: Probabilistic Skyline,Fault-tolerant queries,Data streams,Parallel queries,Big data

[1] Li X,Wang Y,Li X,et al.Parallelizing Skyline Queries over Uncertain Data Streams with Sliding Window Partitioning and Grid Index [J].Knowledge and Information Systems(KAIS),2014,41(2):277-309
[2] 王意洁,李小勇,杨永滔,等.不确定 Skyline 查询技术研究 [J].计算机研究与发展,2012,49(10):2045-2053 Wang Yi-jie,Li Xiao-yong,Yang Yong-tao,et al.Research on uncertain skyline query processing techniques [J].Computer Research and Development,2012,49(10):2045-2053
[3] Chomicki J,Ciaccia P,Meneghetti N.Skyline queries,front and back [J].ACM SIGMOD Record,2013,42(3):6-18
[4] Lu H,Zhou Y,Haustad J.Efficient and scalable continuous skyline monitoring in two-tier streaming settings [J].Information Systems,2013,38(1):68-81
[5] Huang Z,Sun S,Wang W.Efficient mining of skyline objects in subspaces over data streams [J].Knowledge and information systems(KAIS),2010,22(2):159-183
[6] Sun S,Huang Z,Zhong H,et al.Efficient monitoring of skyline queries over distributed data streams [J].Knowledge and Information Systems(KAIS),2010,25(3):575-606
[7] Tao Y,Papadias D.Maintaining sliding window skylines on data streams [J].IEEE Transactions on Knowledge and Data Engineering,2006,18(3):377-391
[8] Lin X,Yuan Y,Wang W,et al.Stabbing the sky:Efficient skyline computation over sliding windows [C]∥Proceedings of the IEEE International Conference on Data Engineering(ICDE).2005:502-513
[9] Morse M,Patel J,Grosky W.Efficient continuous skyline computation [J].Information Sciences,2007,177(17):3411-3437
[10] Zhang Z,Cheng R,Papadias D,et al.Minimizing the communication cost for continuous skyline maintenance [C]∥Proceedings of the ACM International Conference on Management of Data(SIGMOD).2009:495-508
[11] Xin J,Wang G,Chen L,et al.Continuously maintaining sliding window skylines in a sensor network [C]∥Proceedings of the International Conference on Database Systems for Advanced Applications(DASFAA).2007:509-521
[12] Kontaki M,Papadopoulos A N,Manolopoulos Y.ContinuousTop-k Dominating Queries [J].IEEE Transactions on Know-ledge and Data Engineering(TKDE),2012,24(5):840-853
[13] 孙圣力,戴东波,黄震华,等.概率数据流上Skyline查询处理算法 [J].电子学报,2009,37(2):285-293 Sun Sheng-li,Dai Dong-bo,Huang Zhen-hua,et al.Algorithm on computing Skyline over probabilistic data stream [J].Acta Electronica Sinica,2009,37(2):285-293
[14] Zhang W,Lin X,Zhang Y,et al.Probabilistic skylineoperator over sliding windows [C]∥Proceedings of the IEEE International Conference on Data Engineering(ICDE).2009:1060-1071
[15] Lian X,Chen L.Monochromatic and bichromatic reverse skyline search over uncertain data [C]∥Proceedings of the ACM International Conference on Management of Data(SIGMOD).2008:213-226
[16] Ding X,Lian X,Chen L,et al.Continuous monitoring of skylines over uncertain data streams [J].Information Sciences,2012,184(1):196-214
[17] Wang Y,Li X,Li X,et al.A survey of queries over uncertain data [J].Knowledge and Information Systems(KAIS),2013,37(3):485-530
[18] Li X,Wang Y,Li X,et al.Parallel skyline queries over uncertain data streams in cloud computing environments [J].International Journal of Web and Grid Services(IJWGS),2014,10(1):24-53
[19] Dean J,Ghemawat S.MapReduce:Simplified data processing on large clusters [C]∥Proceedings of the USENIX Symposium on Operating System Design and Implementation(OSDI).2004:137-150
[20] MacCormick J,Murphy N,Najork M,et al.Boxwood:Abstractions as the Foundation for Storage Infrastrucure [C]∥Procee-dings of the USENIX Symposium on Operating System Design and Implementation(OSDI).2004:105-120
[21] Wu P,Zhang C,Feng Y,et al.Parallelizing skyline queries for scalable distribution [C]∥ Advances in Database Technology(EDBT):Proceedings of the International Conference on Extending Database Technology.Springer,2006:112-130
[22] Wang S,Ooi B,Tung A,et al.Efficient skyline query processing on peer-to-peer networks [C]∥Proceedings of the IEEE International Conference on Data Engineering(ICDE).2007:1126-1135
[23] 王媛,王意洁,邓瑞鹏,等.云计算环境下的容错并行 Skyline 查询算法研究 [J].计算机科学与探索,2011,5(9):804-814 Wang Yuan,Wang Yi-jie,Deng Rui-peng,et al.Fault-tolerant parallel skyline computation in cloud computing environment [J].Journal of Frontiers of Computer Science and Technology,2011,5(9):804-814

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!