Computer Science, 2014, Vol. 41, Issue (9): 104-109. doi: 10.11896/j.issn.1002-137X.2014.09.020

• Network and Communication •

Optimization of Cluster Resource Fuzzy Clustering Partition Algorithm for Cloud Computing

DONG Shi-long, CHEN Ning-jiang, TAN Ying, HE Zi-long and ZHU Li-rong

  1. School of Computer and Electronic Information, Guangxi University, Nanning 530004
  • Online: 2018-11-14 Published: 2018-11-14
  • Supported by:
    National Natural Science Foundation of China (61063012, 61363003), Guangxi Natural Science Foundation project (2012GXNSFAA053222), and Guangxi university outstanding talent funding

Abstract: Traditional serial fuzzy clustering algorithms are computationally heavy and inefficient when operating on high-dimensional matrices, so they struggle to meet the timeliness requirements of cluster resource scheduling in cloud environments. To address this, the transitive closure method is optimized on the basis of the equivalence-relation-based fuzzy clustering algorithm, and a multi-threaded concurrent fuzzy clustering partition algorithm for cloud resources is proposed and applied to improving the scheduling strategy of the Hadoop scheduler. Simulation results indicate that the optimization reduces the computation required by the squaring method to obtain the fuzzy equivalence matrix, and that the concurrent algorithm effectively relieves the computational bottleneck of resource clustering on small and medium-sized cloud clusters while achieving a good speed-up ratio. With respect to the heterogeneity problem in existing Hadoop schedulers, a theoretical analysis of the optimized concurrent algorithm shows that it helps to resolve the scheduling difficulties caused by heterogeneity.

Key words: Fuzzy clustering, Cloud computing, Resource clustering, Fuzzy equivalence matrix, Hadoop
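
The abstract refers to the squaring method for obtaining a fuzzy equivalence matrix (the transitive closure of a fuzzy similarity matrix) and to splitting that work across threads. The Python sketch below illustrates only the general textbook technique and is not the authors' optimized algorithm: it assumes a reflexive, symmetric similarity matrix, uses max-min composition, and computes each row of the squared matrix as a separate thread task. The function names, the example matrix, and the threshold values are hypothetical choices made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def compose_row(r, i):
    """Return row i of the max-min composition R o R."""
    n = len(r)
    return [max(min(r[i][k], r[k][j]) for k in range(n)) for j in range(n)]


def transitive_closure(r, workers=4):
    """Square R repeatedly (R -> R o R) until it stops changing.

    Assumes R is reflexive and symmetric; under that assumption the
    fixed point is the fuzzy equivalence matrix.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while True:
            # Rows of R o R are independent, so each row is one task.
            squared = list(pool.map(lambda i: compose_row(r, i), range(len(r))))
            if squared == r:
                return r
            r = squared


def lambda_cut_clusters(t, lam):
    """Put i and j in the same cluster when t[i][j] >= lam."""
    clusters, assigned = [], set()
    for i in range(len(t)):
        if i in assigned:
            continue
        group = [j for j in range(len(t)) if t[i][j] >= lam]
        assigned.update(group)
        clusters.append(group)
    return clusters


# Hypothetical similarity matrix for three resource nodes.
similarity = [
    [1.0, 0.8, 0.2],
    [0.8, 1.0, 0.6],
    [0.2, 0.6, 1.0],
]
closure = transitive_closure(similarity)
# closure is [[1.0, 0.8, 0.6], [0.8, 1.0, 0.6], [0.6, 0.6, 1.0]]
clusters = lambda_cut_clusters(closure, 0.7)
# clusters is [[0, 1], [2]]
```

Lowering the threshold merges clusters (0.5 yields a single cluster here), which is how one equivalence matrix supports partitions at several granularities. In CPython the global interpreter lock keeps these threads from running the arithmetic in parallel, so the sketch shows the row-wise decomposition rather than a real speed-up.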

