Computer Science (计算机科学) ›› 2020, Vol. 47 ›› Issue (5): 149-153. doi: 10.11896/jsjkx.190300125

• Computer Graphics & Multimedia •

Diffuse Interface Based Unsupervised Images Clustering Algorithm

WANG Cheng-zhang1, BAI Xiao-ming2, DU Jin-li1

  1 School of Statistics and Mathematics, Central University of Finance and Economics, Beijing 100081, China
  2 Information School, Capital University of Economics and Business, Beijing 100070, China
  • Received: 2019-03-25 Online: 2020-05-15 Published: 2020-05-19
  • Corresponding author: WANG Cheng-zhang (czwang@cufe.edu.cn)
  • About author: WANG Cheng-zhang, born in 1977, Ph.D., associate professor, master's supervisor. His main research interests include machine learning, pattern recognition, and big data analysis.
  • Supported by:
    the National Natural Science Foundation of China (71571197) and the Natural Science Foundation of Beijing, China (9152016)

Abstract: Unsupervised image clustering partitions a whole image set into several subsets using only the image data, without any prior information. Because the intrinsic dimensionality of images is very high, image processing runs into the curse of dimensionality. Based on an analysis of the image clustering problem, a novel unsupervised image clustering algorithm is proposed, built on the diffuse interface model on graphs. Images are encoded as points in a high-dimensional observation space and then projected into a low-dimensional feature space. A diffuse interface unsupervised clustering model is constructed in the feature space, and a dimension reduction operator is introduced into the model. A loop iterative algorithm is employed to optimize the energy function of the diffuse interface model, and the optimized diffuse interface is used to cluster the images into different subsets. Experimental results show that the proposed algorithm outperforms the traditional K-means, DBSCAN, and Spectral Clustering algorithms, achieving better clustering results and higher accuracy (lower error rates) under the same conditions.

Key words: Diffuse interface, Dimension reduction, Image clustering, Optimization, Unsupervised learning
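The pipeline summarized in the abstract (encode images as points, project to a low-dimensional feature space, minimize a diffuse interface energy, read clusters off the interface) can be illustrated with a minimal two-cluster sketch. This is not the authors' implementation, and everything specific in it is an assumption: PCA stands in for the projection and dimension reduction operator, the graph is a Gaussian similarity graph, the energy is a graph Ginzburg-Landau functional with a zero-mean constraint, and the optimizer is plain projected gradient descent started from the Fiedler vector. The function names and parameters (`sigma`, `eps`, `dt`, `iters`) are hypothetical.

```python
import numpy as np


def pca_project(X, k):
    """Project rows of X onto the top-k principal directions (assumed stand-in)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def graph_laplacian(F, sigma):
    """Symmetric normalized Laplacian of a Gaussian similarity graph on rows of F."""
    sq = np.sum(F**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * F @ F.T, 0.0)
    W = np.exp(-d2 / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)  # no self-loops
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    return np.eye(len(F)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]


def gl_energy(u, L, eps):
    """Graph Ginzburg-Landau energy: smoothness term plus double-well potential."""
    return 0.5 * eps * (u @ L @ u) + np.sum((u**2 - 1.0) ** 2) / (4.0 * eps)


def diffuse_interface_cluster(L, eps=1.0, dt=0.1, iters=300):
    """Two-way clustering by projected gradient descent on the GL energy."""
    _, vecs = np.linalg.eigh(L)  # eigenvalues in ascending order
    u = vecs[:, 1] / np.abs(vecs[:, 1]).max()  # start from the Fiedler vector
    u = u - u.mean()  # zero-mean constraint excludes the trivial constant state
    energies = [gl_energy(u, L, eps)]
    for _ in range(iters):
        grad = eps * (L @ u) + (u**3 - u) / eps
        u = u - dt * grad
        u = u - u.mean()  # project back onto the zero-mean subspace
        energies.append(gl_energy(u, L, eps))
    # The sign of the phase field u gives the cluster label.
    return (u > 0).astype(int), energies


# Toy data (assumed): 20 dark and 20 bright 8x8 "images" with mild noise.
rng = np.random.default_rng(0)
images = np.concatenate([
    0.2 + 0.05 * rng.standard_normal((20, 8, 8)),
    0.8 + 0.05 * rng.standard_normal((20, 8, 8)),
])
X = images.reshape(len(images), -1)   # points in the observation space
F = pca_project(X, k=2)               # low-dimensional feature space
L = graph_laplacian(F, sigma=2.0)
labels, energies = diffuse_interface_cluster(L)
print(labels)
```

Starting from the Fiedler vector ties the sketch closely to spectral clustering; the diffuse interface step then sharpens the phase field toward values near ±1. Extending this to more than two clusters or to the paper's coupled dimension reduction would need details that the abstract does not give.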

CLC Number:

  • TP391