Computer Science ›› 2016, Vol. 43 ›› Issue (6): 35-38.doi: 10.11896/j.issn.1002-137X.2016.06.007

Previous Articles     Next Articles

MapReduce-based Skyline Query Processing Algorithm

CUI Wen-xiang, XIAO Ying-yuan, HAO Gang, WANG Hong-ya and DENG Hua-feng   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Skyline query is a typical multi-objective optimization problem and is widely applied in multi-objective optimization,data mining and other fields.Most of the existing Skyline query processing algorithms assume that the data set is placed in a single server,and query processing algorithm is designed as a serial algorithm for a single server.With the rapid growth of data,especially under the background of big data,the traditional serial Skyline algorithms based on a single computer are far from enough to meet the needs of users.Based on the popular distributed parallel programming framework MapReduce,this paper studied the parallel skyline query algorithm suitable for large data sets.Aiming at the factors affecting MapReduce,this paper improved the existing data partition strategy based on angles and proposed the data partition strategy based on Balanced Angular.Meanwhile,to reduce the computation of Reduce phase,this paper proposed the data filtering strategy in advance at Map.The experimental results show that the proposed Skyline query algorithm can improve system performance significantly.

Key words: MapReduce,Skyline,Data partition

[1] Balke W T,Güntzer U.Multi-objective query processing for database systems[C]∥Int’l Conference on Very Large Data Bases (VLDB).2004:936-947
[2] Wei Xiao-juan,Yang Jing,Li Cui-ping,et al.Skyline query[J].Journal of Software,2008,9(6):1386-1400(in Chinese) 魏小娟,杨婧,李翠平,等.Skyline查询处理[J].软件学报,2008,9(6):1386-1400
[3] Kung H T,Luccio F,Preparata F P.On finding the maxima of a set of vectors[J].Journal of ACM,1975,2(4):469-476
[4] Preparata F P,Shamos M I.Computational Geometry:An introduction[M].New York:Springer-Verlag,1985
[5] Brzsnyi S,Kossmann D,Stocker K.The skyline operator[C]∥IEEE International Conference on Data Engineering (ICDE).2001:421-430
[6] Chomicki J,Godfrey P,Gryz J,et al.Skyline with presorting[C]∥IEEE International Conference on Data Engineering (ICDE).2003:717-816
[7] Godfrey P,Shipley R,Gryz J,et al.Maximal vector computation in large data sets[C]∥Int’l Conference on Very Large Data Bases (VLDB).2005:229-240
[8] Bartolini I,Ciaccia P,Patella M,et al.Efficient sort-based skyline evaluation[J].ACM Transactions on Database Systems (TODS),2008,3(4):1-45
[9] Tan K L,Eng P K.Efficient progressive skyline computation[C]∥IEEE International Conference on Data Engineering (ICDE).2001:301-310
[10] Papadias D,Tao Y.An optimal and progressive algorithm forskyline queries[C]∥ACM SIGMOD Int’l.Conference on Ma-nagement of Data (SIGMOD).2003:467-478
[11] Papadias D,Tao Y.Progressive skyline computation in database systems[J].ACM Transactions on Database Systems (TODS),2005,30(1):41-82
[12] Dean J,Ghemawat S.MapReduce:Simplified Data Processing on Large Clusters[J].Communications of the ACM,2008,51(1):107-113
[13] Zhang Bo-liang,Zhou Shui-geng,Guan Ji-hong.Skyline under Map-Reduce framework[J].Computer Science and Exploration,2011,5(5):385-397(in Chinese) 张波良,周水庚,关佶红.MapReduce框架下的Skyline计算[J].计算机科学与探索,2011,5(5):385-397
[14] Chen L,Hwang K,Wu J.MapReduce Skyline Query Processing with A New Angular Partitioning Approach[C]∥Proc.of International Parallel and Distributed Processing Symposium.2012:2262-2270

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!