Computer Science ›› 2016, Vol. 43 ›› Issue (6): 229-232. doi: 10.11896/j.issn.1002-137X.2016.06.046

• Artificial Intelligence •

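Big Data Clustering Algorithm Based on Chaotic Correlation Dimensions Feature Extraction

XIE Chuan

  1. College of Aeronautics and Astronautics Engineering, Air Force Engineering University, Xi'an 710038
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Natural Science Foundation of Shaanxi Province project "Failure Behavior and Life Prediction Method of Lead-Free Solder Joints under Multi-Field Coupling" (2015JM6345).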

Abstract: Big data clustering is a stochastic, nonlinear process with high uncertainty. Traditional methods rely on prior knowledge for learning, so they adapt poorly to real-time changes in big data and cannot cluster it effectively. This paper therefore proposes a big data clustering algorithm based on chaotic correlation dimension feature extraction. After analyzing the drawbacks of traditional methods, a multidimensional state-space vector and a chaotic trajectory are established through phase space reconstruction. The reconstruction preserves many geometric invariants of the original system and so provides an effective basis for analyzing its chaotic characteristics. The time delay at which the average mutual information reaches its first minimum is taken as the optimal delay for phase space reconstruction, and the false nearest neighbor algorithm is used to select the optimal embedding dimension. The extracted correlation dimension serves as the chaotic feature of the big data, and the data are clustered according to this feature. Simulation results show that the proposed algorithm effectively improves clustering efficiency and reduces energy consumption, making it an effective data clustering method.

Key words: Chaotic correlation dimension feature, Big data, Clustering
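
The reconstruction and feature-extraction steps named in the abstract can be sketched roughly as follows. This is a minimal pure-Python illustration, not the paper's implementation: the function names (`delay_embed`, `sorted_pair_distances`, `correlation_dimension`), the max-norm distance, the sine test signal, and the chosen radii are all assumptions made for the example. The delay `tau` and embedding dimension `dim` are supplied by hand here rather than selected by average mutual information and false nearest neighbors. The dimension estimate uses the usual Grassberger-Procaccia idea of taking the slope of log C(r) against log r.

```python
import math
from bisect import bisect_left


def delay_embed(series, dim, tau):
    """Return delay vectors (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    n = len(series) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series is too short for this dim and tau")
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n)]


def sorted_pair_distances(vectors):
    """Sorted max-norm distances between all distinct pairs of vectors."""
    dists = []
    for i in range(len(vectors) - 1):
        vi = vectors[i]
        for vj in vectors[i + 1:]:
            dists.append(max(abs(a - b) for a, b in zip(vi, vj)))
    dists.sort()
    return dists


def correlation_dimension(vectors, radii):
    """Least-squares slope of log C(r) against log r.

    C(r) is the fraction of distinct pairs whose distance is below r.
    """
    dists = sorted_pair_distances(vectors)
    if not dists:
        raise ValueError("need at least two vectors")
    xs, ys = [], []
    for r in radii:
        c = bisect_left(dists, r) / len(dists)
        if c > 0:  # log(0) is undefined, so skip radii with no pairs
            xs.append(math.log(r))
            ys.append(math.log(c))
    if len(xs) < 2:
        raise ValueError("need at least two radii with C(r) > 0")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical test signal: a sine wave, whose 2-D embedding is a closed curve.
series = [math.sin(0.1 * t) for t in range(600)]
vectors = delay_embed(series, dim=2, tau=16)  # tau near a quarter period (assumed)
estimate = correlation_dimension(vectors, radii=[0.05, 0.1, 0.2, 0.4])
print(f"estimated correlation dimension: {estimate:.2f}")
```

For a closed curve like this one the estimate should come out close to 1. An estimate of this kind, computed per data block, could then be passed to any clustering routine as a feature, which is the role the abstract assigns to the correlation dimension.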

