Computer Science ›› 2015, Vol. 42 ›› Issue (11): 48-52. doi: 10.11896/j.issn.1002-137X.2015.11.008

• 2014 National Annual Conference on High Performance Computers

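Analysis of Architecture Characteristics of Big Data Workloads

LUO Jian-ping (罗建平), XIE Meng-yao (谢梦瑶) and WANG Hua-feng (王华锋)

  1. Advanced Computer Systems Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Software, Beihang University, Beijing 10091,
     Advanced Computer Systems Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190; School of Information Engineering, Zhengzhou University, Zhengzhou 450001,
     School of Software, Beihang University, Beijing 10091
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    This work was supported by the National Basic Research Program of China (2014CB340402) and the National Natural Science Foundation of China (61303054).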

Abstract: This paper studies big data workloads for offline analysis and interactive queries. We first analyze what these workloads have in common, extract their common set of operations, and group the workloads accordingly. We then run the workloads on the BigDataBench platform, measure their microarchitectural characteristics during execution, and apply PCA and the SimpleKMeans algorithm to these characteristic parameters for dimensionality reduction and clustering. The results show that the workloads share common operations, such as Join and Cross Product, and that some workloads have similar properties; for example, Difference and Projection exhibit the same microarchitectural characteristics. These findings can guide the design of processors and other hardware platforms as well as the optimization of applications, and they provide a reference for the design of big data benchmark platforms.

Key words: Big data, Big data workloads, Architecture characteristics
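
The analysis pipeline named in the abstract (standardize per-workload metrics, reduce them with PCA, then cluster with k-means) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the six workloads `w0` to `w5`, the four metric columns, and all numeric values are invented placeholders, not measurements from the paper. The clustering is a plain Lloyd-style k-means written from scratch, not the SimpleKMeans implementation the authors used, and the PCA uses simple power iteration rather than any particular library routine.

```python
import math

# Hypothetical metrics per workload (invented values, NOT from the paper):
# columns = [IPC, L1I misses/kilo-instr, L2 misses/kilo-instr, branch mispredict %]
names = ["w0", "w1", "w2", "w3", "w4", "w5"]
metrics = [
    [1.20, 5.0, 2.0, 1.0],
    [0.60, 20.0, 9.0, 4.0],
    [1.25, 5.5, 2.2, 1.1],
    [0.55, 22.0, 10.0, 4.5],
    [1.15, 4.8, 1.9, 0.9],
    [0.62, 21.0, 9.5, 4.2],
]

def standardize(rows):
    """Scale every column to zero mean and unit variance."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) or 1.0
            for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def top_components(rows, k, iters=500):
    """Leading k principal axes via power iteration with deflation."""
    n, d = len(rows), len(rows[0])
    cov = [[sum(r[a] * r[b] for r in rows) / n for b in range(d)]
           for a in range(d)]
    axes = []
    for _ in range(k):
        v = [1.0 + j for j in range(d)]  # arbitrary non-zero start vector
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue; subtract that component out.
        lam = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d))
                  for a in range(d))
        axes.append(v)
        cov = [[cov[a][b] - lam * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return axes

def project(rows, axes):
    """Coordinates of each row along the given principal axes."""
    return [[sum(x * a for x, a in zip(r, ax)) for ax in axes] for r in rows]

def kmeans(points, k, iters=100):
    """Lloyd's algorithm, seeded deterministically with the first k points."""
    centers = [list(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centers[c])))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

projected = project(standardize(metrics), top_components(standardize(metrics), 2))
labels = kmeans(projected, 2)
for name, label in zip(names, labels):
    print(name, "-> cluster", label)
```

Standardizing first matters because the metrics use different units; without it, the column with the largest raw range would dominate the principal components. Workloads that land in the same cluster are the ones whose measured profiles are close after reduction, which is the sense in which the abstract says two operations share microarchitectural characteristics.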

