Computer Science ›› 2015, Vol. 42 ›› Issue (11): 65-67. doi: 10.11896/j.issn.1002-137X.2015.11.013


MapReduce Parallel Model Based on Tree Structure

TANG Bing and HE Hai-wu   

  • Online: 2018-11-14 Published: 2018-11-14

Abstract: MapReduce is a distributed computing model introduced by Google that has been widely used for massive data processing. This paper presents a novel MapReduce parallel model. The model is suited to the analysis of massive scientific data sets using unreliable desktop PC resources in Internet or intranet environments. Computing nodes are organized in a P2P fashion: the P2P-MPI framework serves as the lower layer, and the message passing interface (MPI) model is used to implement the MapReduce application layer. In that layer, data chunks are distributed by broadcast in the Map stage, and an inverse binary tree is constructed in the Reduce stage to reduce intermediate results efficiently. The proposed MapReduce model was compared with existing popular MapReduce models. The results show that the proposed tree structure-based MapReduce parallel model performs well in terms of fault tolerance and that application development with it is simple.

Key words: MapReduce, Tree structure, Binary tree, Message passing interface (MPI)
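
The Reduce stage described in the abstract, in which intermediate results are merged along an inverse binary tree, can be illustrated with a minimal single-process sketch. It is not the paper's implementation and does not use P2P-MPI or any MPI library. The word-count workload and the names map_chunk, tree_reduce and run_job are assumptions introduced here for illustration, and the broadcast of data chunks is stood in for by a plain Python list. The sketch only shows why pairwise merging finishes in about log2(n) rounds for n nodes.

```python
from collections import Counter


def map_chunk(chunk):
    """Map step (assumed workload): count the words in one text chunk."""
    return Counter(chunk.split())


def tree_reduce(partials, combine):
    """Merge per-node results pairwise, level by level, up to a single root.

    Returns the final value and the number of rounds (tree levels) used.
    """
    if not partials:
        raise ValueError("need at least one partial result")
    level = list(partials)
    rounds = 0
    while len(level) > 1:
        next_level = []
        # Neighbouring nodes merge in pairs; each pair could run in parallel.
        for i in range(0, len(level) - 1, 2):
            next_level.append(combine(level[i], level[i + 1]))
        # An unpaired last node is carried up to the next level unchanged.
        if len(level) % 2 == 1:
            next_level.append(level[-1])
        level = next_level
        rounds += 1
    return level[0], rounds


def run_job(chunks):
    """Map every chunk, then reduce the partial counts along the tree."""
    partials = [map_chunk(c) for c in chunks]
    return tree_reduce(partials, lambda a, b: a + b)
```

Calling run_job on a list of text chunks returns the merged word counts together with the number of merge rounds. In a real message-passing setting each merge in a round would be a send from one node to its partner, so the number of rounds, not the number of nodes, bounds the length of the reduction.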

