计算机科学 ›› 2020, Vol. 47 ›› Issue (10): 126-129.doi: 10.11896/jsjkx.190900113

• 数据库&大数据&数据科学 • 上一篇    下一篇

基于混合蛙跳算法的K-mediods聚类挖掘与并行优化

魏霖静1, 宁璐璐2, 郭斌3, 侯振兴4, 甘诗润1   

  1. 1 甘肃农业大学信息科学技术学院 兰州730070
    2 陕西科技大学轻工科学与工程学院 西安710021
    3 河海大学计算机与信息学院 南京210094
    4 南京大学信息管理学院 南京210093
  • 收稿日期:2019-09-16 修回日期:2019-10-21 出版日期:2020-10-15 发布日期:2020-10-16
  • 通讯作者: 魏霖静(wlj@gsau.edu.cn)
  • 基金资助:
    甘肃农业大学学科建设专项项目(GAU-XKJS-2018-257);甘肃农业大学青年导师扶持基金项目(GAU-QDFC-2018-13);兰州市科技计划项目(2019-1-31);甘肃省教育厅创新基金(2020B-122);国家自然科学基金(61063028,31560378)

K-mediods Cluster Mining and Parallel Optimization Based on Shuffled Frog Leaping Algorithm

WEI Lin-jing1, NING Lu-lu2, GUO Bin3, HOU Zhen-xing4, GAN Shi-run1   

  1. 1 School of Information Science & Technology,Gansu Agriculture University,Lanzhou 730070,China
    2 School of Light Industry Science & Engineering,Shaanxi University of Science and Technology,Xi’an 710021,China
    3 College of Computer and Information,Hohai University,Nanjing 210094,China
    4 School of Information Management,Nanjing University,Nanjing 210093,China
  • Received:2019-09-16 Revised:2019-10-21 Online:2020-10-15 Published:2020-10-16
  • About author:WEI Lin-jing,born in 1977,Ph.D,professor,master supervisor,is a member of China Computer Federation.Her main research interests include intelligent computing,algorithm application research and image analysis.
  • Supported by:
    Discipline Construction Project of Gansu Agricultural University (GAU-XKJS-2018-257),Youth Tutor Support Fund Project of Gansu Agricultural University (GAU-QDFC-2018-13),Lanzhou Science and Technology Plan Project (2019-1-31),Gansu Provincial Department of Education Innovation Fund (2020B-122) and National Natural Science Foundation of China (61063028,31560378)

摘要: 为了降低K-mediods聚类算法的误差并提高并行优化的性能,将混合蛙跳算法运用于聚类和并行优化过程。在K-mediods聚类过程中,将K-mediods与聚类簇思想相结合,对各个聚类簇进行混合蛙跳算法优化,从而提高了大规模数据样本聚类的效率。针对多个聚类执行节点并行完成大规模样本K-mediods聚类时,混合蛙跳算法有效提高了加速比。实验证明,相比于普通的K-mediods聚类,基于混合蛙跳算法的K-mediods聚类优势明显,且处理大规模样本的加速比性能更优。

关键词: K-mediods聚类, 并行优化, 混合蛙跳算法, 加速比, 聚类簇

Abstract: In order to reduce the error of K-mediods clustering algorithm and improve the performance of parallel optimization,the shuffled frog leaping algorithm is applied to the clustering and parallel optimization process.In the K-mediods clustering process,K-mediods is combined with the clustering cluster idea to optimize the shuffled frog leaping algorithm for each cluster cluster,which improves the efficiency of large-scale data sample clustering,especially for multiple clusters.When the class execution nodes complete the large-scale sample K-mediods clustering in parallel,the shuffled frog leaping algorithm effectively improves the speedup ratio.It has been proved by experiments that the K-mediods clustering based on the shuffled frog leaping algorithm has obvious clustering advantages compared with the common K-mediods clustering,and the acceleration ratio performance of processing large-scale samples is better.

Key words: Acceleration ratio, Clustering cluster, K-mediods clustering, Parallel optimization, Shuffled frog leaping algorithm

中图分类号: 

  • TP181
[1]QU J.Research on Intelligent Parallel Clustering Method forLarge Data in Virtual Environment[J].Computer Measurement and Control,2017,25(6):257-260.
[2]HONG Y H.Parallel computation based on MPI bee swarm K-means clustering algorithm [J].Computer Engineering and Design,2017,38(12):3339-3343.
[3]ZHAO B W,XU H.Parallel MRACO-PAM clustering algorithm based on MapReduce [J].Computer Engineering and Science,2017,39(10):1801-1806.
[4]ZHU F L,ZENG B,CAO J.Parallel optimization and implementation of SLAM algorithm based on particle filter [J].Journal of Guangdong University of Technology,2017,34(2):92-96.
[5]OUYANG C J,LIU C X,MING L,et al.An OMP Stegano-graphic Algorithm Optimized by SFLA[J].International Journal of Pattern Recognition & Artificial Intelligence,2017,31(1):496-505.
[6]TANG Z,LIU K,XIAO J,et al.A parallel k-means clustering algorithm based on redundance elimination and extreme points optimization employing MapReduce[J].Concurrency & Computation Practice & Experience,2017,29:e4109.
[7]LAI Z,FENG X,YU H,et al.A Parallel Social Spider Optimization Algorithm Based on Emotional Learning[J].IEEE Transactions on Systems,Man,and Cybernetics:Systems,2018,12(9):1-12.
[8]LIU P,TENG J Y,DING E J,et al.Large-scale text K-meansparallel clustering algorithm based on Spark [J].Chinese Journal of Information,2017,31(4):145-153.
[9]SOODI H A,VURAL A M.STATCOM Estimation UsingBack-Propagation,PSO,Shuffled Frog Leap Algorithm,and Genetic Algorithm Based Neural Networks[J].Computational Intelligence & Neuroscience,2018,2018(1):1-17.
[10]WANG N,GAO X J.A novel differential hybrid frog jump algorithm[J].Computer System Applications,2017,26(1):196-200.
[11]LU Y H,CHEN J H.Application of hybrid frog leaping algorithm in text classification feature selection optimization[J].Modern Library and Information Technology,2017,1(1):91-101.
[12]ZHAO H X,CHANG X G.An improved hybrid leaping frog algorithm[J].Journal of Lanzhou Jiaotong University,2017,36(1):51-56.
[13]TANG X Y,ZHANG X Z,ZHAO Y A.Big Data ClusteringMining Based on Swarm Intelligence Algorithm [J].Journal of Chongqing University of Technology(Natural Science),2019,33(4):128-133.
[14]ZHOU Z,SI G,CHEN J,et al.A novel method of transformer fault diagnosis based on k-mediods and decision tree algorithm[C]//International Conference on Electrical Materials & Power Equipment.2017.
[15]WANG Q,YANG J,ZHANG S S.A K-mediods clustering algorithm based on improved Drosophila optimization[J].Computer Technology and Development,2018,28(12):23-28.
[16]LIU J P,ZHANG W X,TANG Z H,et al.Adaptive Network Intrusion Detection Based on Fuzzy Rough Set Attribute Reduction and GMM-LDA Optimal Cluster Feature Learning [J].Control and Decision,2019,34(2):22-30.
[17]WANG Y,WANG L F,RAO Q F,et al.Radius-Adaptive on Selection of Initial Centers in K-Medoids Clustering [J].Journal of Chongqing University of Technology(Natural Science),2017,31(2):95-101.
[18]KHATAMI A,MIRGHASEMI S,KHOSRAVI A,et al.A new PSO-based approach to fire flame detection using K-Medoids clustering[J].Expert Systems with Applications,2017,68(C):69-80.
[19]WANG Y,WANG L F,RAO Q F,et al.K-medoids clustering algorithm for initial center point selection based on radius adaptation[J].Journal of Chongqing University of Technology(Natural Science Edition),2017,31(2):95-101.
[20]AGARWAL S,MEHTA S.Approximate Shortest DistanceComputing Using k-Medoids Clustering[J].Annals of Data Science,2017,4(4):547-564.
[1] 吕小敬, 刘钊, 褚学森, 石树鹏, 孟虹松, 黄震春.
面向超大规模并行模拟的LBM计算流体力学软件
Extreme-scale Simulation Based LBM Computing Fluid Dynamics Simulations
计算机科学, 2020, 47(4): 13-17. https://doi.org/10.11896/jsjkx.191000010
[2] 邓定胜.
一种改进的DBSCAN算法在Spark平台上的应用
Application of Improved DBSCAN Algorithm on Spark Platform
计算机科学, 2020, 47(11A): 425-429. https://doi.org/10.11896/jsjkx.190700071
[3] 杨思燕,贺国旗,刘如意.
基于SIFT算法的大场景视频拼接算法及优化
Video Stitching Algorithm Based on SIFT and Its Optimization
计算机科学, 2019, 46(7): 286-291. https://doi.org/10.11896/j.issn.1002-137X.2019.07.044
[4] 齐文, 鲍玉斌, 宋杰.
基于列存储的大数据采样查询处理
Column-oriented Store Based Sampling Query Process on Big Data
计算机科学, 2019, 46(12): 13-19. https://doi.org/10.11896/jsjkx.190500155
[5] 张新明, 程金凤, 康强, 王霞.
改进的混合蛙跳算法及其在多阈值图像分割中的应用
Improved Shuffled Frog Leaping Algorithm and Its Application in Multi-threshold Image Segmentation
计算机科学, 2018, 45(8): 54-62. https://doi.org/10.11896/j.issn.1002-137X.2018.08.010
[6] 刘玉成, 理查德·丁, 张颖超.
一种BPNNs识别算法的医学检测泛实时性问题研究
Research on Pan-real-time Problem of Medical Detection Based on BPNNs Recognition Algorithm
计算机科学, 2018, 45(6): 301-307. https://doi.org/10.11896/j.issn.1002-137X.2018.06.053
[7] 海沫,张游.
Spark平台下聚类算法的性能比较
Performance Comparison of Clustering Algorithms in Spark
计算机科学, 2017, 44(Z6): 414-418. https://doi.org/10.11896/j.issn.1002-137X.2017.6A.093
[8] 姜文超,林穗,王多强,李东明,金海.
Calculix三级并行优化及其在天河二号超级计算机中的应用
Three-level Parallel Optimization and Application of Calculix in TH-2 Super-computing Environments
计算机科学, 2017, 44(3): 32-35. https://doi.org/10.11896/j.issn.1002-137X.2017.03.008
[9] 米晓萍,李雪梅.
基于信息融合度传递的频域徙动入侵特征挖掘算法
Mining Algorithm of Frequency Domain Migration Intrusion Feature Based on Information Fusion Transfer
计算机科学, 2015, 42(3): 224-227. https://doi.org/10.11896/j.issn.1002-137X.2015.03.046
[10] 费雄伟,李肯立,阳王东,杜家宜.
基于CUDA的并行AES算法的实现和加速效率探索
Implementation and Exploring of Acceleration Efficiency of Parallel AES Algorithm on CUDA
计算机科学, 2015, 42(1): 59-62. https://doi.org/10.11896/j.issn.1002-137X.2015.01.013
[11] 马欣荣,刘三阳,段治健.
带状线性方程组的含参交替方向并行算法
Parallel Alternating Direction Algorithm with Parameters for Solving Banded Linear Systems
计算机科学, 2014, 41(2): 249-252.
[12] 陈达智,赵荣彩,姚远,韩林.
MPI自动并行化编译系统中消息传递代码生成算法
Message-passing Code Generation Algorithm in the MPI Automatic Parallelizing Compilation System
计算机科学, 2012, 39(6): 301-304.
[13] 刘晓芹,黄考利,安幼林,吕晓明.
改进的混合蛙跳算法在传感器配置优化中的应用
Application of Improved Shuffled Frog Leaping Algorithm in Optimum of Sensor Location
计算机科学, 2011, 38(2): 72-75.
[14] 祝永志,田甜.
基于高性能微机群集的可扩展性的研究与设计
Design and Implementation of Scalability Based on High Performance PCs Cluster
计算机科学, 2010, 37(12): 287-291.
[15] 杜云飞,唐玉华,杨学军.
容错并行算法的性能分析
Performance Evaluation for Fault-tolerant Parallel Algorithm
计算机科学, 2009, 36(9): 248-251.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!