Computer Science ›› 2013, Vol. 40 ›› Issue (6): 164-171.

Previous Articles     Next Articles

Distributed Skyline Processing Based on Hypersphere Projection Partitioning on Cloud Environments

LEI Ting,WANG Tao,QU Wu and HAN Xiao-guang   

  • Online:2018-11-16 Published:2018-11-16

Abstract: Recently,skyline processing has been receiving considerable attention due to its potential applications in many fields,including traditional database,distributed database,data stream and even the categorical database and so on.Both the academic and the industrial have paid much attention to it.As an important data mining technique,skyline proces-sing is of great significance for multi-objective optimization,urban navigation,multi-criteria decision making and prefe-rence query,trip planning,defense and intelligence systems and geographic information systems.In addition,the amount of data collected and used by human is developing at an astonishing speed.Therefore,how to process Skyline query of massive data is an urgent problem.Aiming at cloud computing applications,this paper designed and implemented distributed Skyline processing based on hypersphere projection partitioning under the Map-Reduce framework,HSPD-Skyline.It is showed that partitioning the data according to the hyperspherical coordinates can increase the average pruning power of points within a partition,and reduce the cost of Skyline processing.The HSPD-Skyline algorithm also uses a heuristic strategy based on space partitioning tree,HA-SPT,to further improve the processing efficiency of the HSPD-Skyline algorithm.Finally,the theoretical analysis and experiment results illustrate that the HSPD-Skyline algorithm (Distributed Skyline Processing based on Hypersphere Projection Partitioning) consistently outperforms similar approaches for distributed skyline computation,regardless of data distribution,and further optimization strategies.

Key words: Distributed Skyline processing,Map-Reduce frame,Partitioning strategy,HSPD-Skyline

[1] Kung H T,Luccio F,Preparata F P.On finding the maxima of a set of vectors[J].Journal of the ACM,1975,22(4):469-476
[2] Borzsonyi S,Kossmann D,Stocker K.The skyline operator[C]∥Proc.of the 17th Int’l Conf.on Data Engineering.Heidelberg,IEEE Computer Society Press,2001:421-430
[3] Chomicki J,Godfrey P,Gryz J,et al.Skyline with presorting[C]∥Proc.of the 19th International Conference on Data Engineering(ICDE 2003).2003:717-816
[4] Tan K-L,Eng P-K,Ooi B C.Efficient progressive skyline computation[C]∥Proc.of the 27th International Conference on Very Large Data Bases(VLDB 2001).2001:301-310
[5] Godfrey P,Shipley R,Gryz J.Maximal vector computation inlarge data sets[C]∥Proc.of the 31st international conference on Very large data bases(VLDB 2005).2005:229-240
[6] Kossmann D,Ramsak F,Rost S.Shooting stars in the sky:an online algorithm for skyline queries[C]∥Proceedings of the 28th International Conference on Very Large Data Bases.2002:275-286
[7] 周红福,宫学庆,郑凯,等.基于高维空间的在线高效子空间Skyline算法-CSky[J].计算机学报,2007,30(8):1409-1417
[8] 程文聪,邹鹏,贾焰,等.基于小波概要的区间差分skyline研究[J].计算机科学,2010,7(11):160-165,2
[9] 付世昌,董一鸿,陈华辉,等.基于道路网络不确定移动对象的连续概率Skyline查询[J].计算机科学,2011,8(7):152-156
[10] Balke W-T,Güntzer U,Zheng J X.Efficient distributed skylining for web information systems[C]∥Proc.of the 9th International Conference on Extending Database Technology(EDBT 2004).2004:256-273
[11] Huang Zhi-yong,Jensen C S,Lu Hua,et al.Skyline queries a-gainst mobile lightweight devices in manets[C]∥Proc.of the 22nd International Conference on Data Engineering(ICDE 2006).2006:66-77
[12] Lo E,Yip K Y,Lin K-I,et al.Progressive skylining over web-accessible databases[J].Data & Knowledge Engineering,2006,57(2):122-147
[13] Vlachou A,Doulkeridis C,Kotidis Y,et al.Skypeer:Efficient subspace skyline computation over distributed data[C]∥Proc.of the IEEE 23rd International Conference on Data Enginering(ICDE 2007).2007:416-425
[14] Jagadish H V,Ooi B C,Vu Q H.Baton:a balanced tree structure for peer-to-peer networks[C]∥Proc.of the 31st international conference on Very large data bases(VLDB 2005).2005:661-672
[15] Wu Ping,Zhang Cai-jie,Feng Ying,et al.Parallelizing skyline queries for scalable distribution[C]∥Proc.of the 10th International Conference on Extending Database Technology(EDBT 2006).2006:112-130
[16] Gao Yun-jun,Chen Gen-cai,Chen Ling,et al.Parallelizing progressive computation for skyline queries in multi-disk environment[C]∥Proc.of the 17th International Conference on the Database and Expert Systems Applications(DEXA 2006).2006:697-706
[17] Ratnasamy S,Francis P,Handley M,et al.A scalable content addressable network[C]∥Proc.of the 2001Conference on Applications,Technologies,Architectures,and Protocols for Computer Communications(SIGCOMM 2001).2001:161-172
[18] Pei Jian,Yuan Yi-dong,Lin Xue-min,et al.Towards multidi-mensional subspace skyline analysis[J].ACM Trans.on Database Systems (TODS),2006,31(4):1335-1381
[19] DeWitt D,Gray J.Parallel database systems:the future of high performance database systems[J].Commun.ACM,1992,35(6):85-98
[20] Stonebraker M,Aoki P M,Litwin W,et al.Mariposa:a wide-areadistributed database system[J].The VLDB Journal,1996,5(1):048-063
[21] Dehne F,Fabri A,Rau-Chaplin A.Scalable parallel geometric algorithms for coarse grained multicomputers[C]∥Proc.of the Ninth Annual Symposium on Computational Geometry(SCG 1993).1993:298-307
[22] Cosgaya-Lozano A,Rau-Chaplin A,Zeh N.Parallel computation of skyline queries[C]∥Proc.of the 21st International Symposiumon High Performance Computing Systems and Applications(HPCS 2007).2007,12
[23] Kung H T T,Luccio F L,Preparata F P.On finding the maxima of a set of vectors[J].Journal of the ACM (JACM),1975,22(4):469-476
[24] Papadias D,Tao Yu-fei,Fu G,et al.An optimal and progressive algorithm for skyline queries[C]∥Proc.of the 2003ACM SIGMOD International Conference on Management of Data(SIGMOD 2003).2003:467-478
[25] Huang Zhi-yong,Jensen C S,Lu Hua,et al.Skyline queries a-gainst mobile lightweight devices in manets[C]∥Proc.of the 22nd International Conference on Data Engineering(ICDE 2006).2006,66
[26] Wang Shi-yuan,Ooi B C,Tung A K H.Efficient skyline query processing on peer-to-peer networks[C]∥Proc.of the IEEE 23rd International Conference on Data Enginering(ICDE 2007).2007:1126-1135
[27] Cosgaya-Lozano A,Rau-Chaplin A,Zeh N.Parallel computation of skyline queries[C]∥Proc.of Int.Symp.on High Performance Computing Systems and Applications.2007,12
[28] Wu P,Zhang C,Feng Y,et al.Parallelizing skyline queries for scalable distribution[C]∥Proc.of Conf.on Extending Database Technology (EDBT).2006:112-130
[29] Wang S,Ooi B C,Tung A K H,et al.Efficient skyline queryprocessing on peer-to-peer networks[C]∥Proc.of Int.Conf.on Data Engineering (ICDE).2007:1126-1135
[30] Dean J,Ghemawat S.MapReduce:simplified data processing on large clusters[J].Commun.ACM,2008,51:107-113
[31] 丁琳琳,信俊昌,王国仁,等.基于Map-Reduce的大数据高效Skyline查询处理[J].计算机学报,2011,34(10):1785-1796
[32] 张波良,周水庚,关佶红.MapReduce框架下的Skyline计算[J].计算机科学与探索,2011,5(5):385-397
[33] Vlachou A,Doulkeridis C,Kotidis Y.Angle-based space parti-tioning for efficient parallel skyline computation[C]∥SIGMOD.2008:227-238
[34] Khler H,Yang Jing,Zhou Xiao-fang.Efficient parallel skyline processing using hyperplane projections[C]∥Proceedings of the 2011International Conference on Management of Data.2011:12-16

No related articles found!
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!