Computer Science ›› 2025, Vol. 52 ›› Issue (2): 134-144.doi: 10.11896/jsjkx.240800040
• Database & Big Data & Data Science • Previous Articles Next Articles
ZHANG Manjing1, HE Yulin1,2, LI Xu1, HUANG Zhexue2
CLC Number:
[1]AHMED M,SERAJ R,ISLAM S M S.The k-means algorithm:A comprehensive survey and performance evaluation [J].Electronics,2020,9(8):1295. [2]IKOTUN A M,EZUGWU A E,ABUALIGAH L,et al.A comprehensive review of K-means clustering algorithms:Variants analysis and advances in big data era[J].Information Sciences,2023,622:178-210. [3]NIELSEN F,NIELSEN F.Hierarchical clustering [M]//Introduction to HPC with MPI for Data Science.Springer International Publishing,2016:195-211. [4]JACKSI K,IBRAHIM R K,ZEEBAREE S R M,et al.Clustering documents based on semantic similarity using HAC and K-means algorithms [C]//Proceedings of the 2020 International Conference on Advanced Science and Engineering(ICOASE).IEEE,2020:205-210. [5]RAMADHANI F,ZARLIS M,SUWILO S.Improvement of the BIRCH algorithm for big data clustering [C]//IOP Conference Series:Materials Science and Engineering.IOP Publishing,2020,725(1):012090. [6]ZHANG T,RAMAKRISHNAN R,LIVNY M.BIRCH:an efficient data clustering method for very large databases [J].ACM Sigmod Record,1996,25(2):103-114. [7]ESTER M,KRIEGEL H P,SANDER J,et al.A density-based algorithm for discovering clusters in large spatial databases with noise [C]//Proceedings of the Second International Conference on Knowledge Discovery and Data Mining(KDD'96).AAAI Press,1996:226-231. [8]DENG D.DBSCAN clustering algorithm based on density [C]//Proceedings of the 2020 7th International Forum on Electrical Engineering and Automation(IFEEA).IEEE,2020:949-953. [9]ANKERST M,BREUNIG M M,KRIEGEL H P,et al.OP-TICS:Ordering points to identify the clustering structure [J].ACM Sigmod Record,1999,28(2):49-60. [10]AGRAWAL R,GEHRKE J,GUNOPULOS D,et al.Automatic subspace clustering of high dimensional data for data mining applications [C]//Proceedings of the 1998 ACM SIGMOD International Conference on Management of Data.ACM,1998:94-105. [11]WANG W,YANG J,MUNTZ R.STING:A statistical information grid approach to spatial data mining [C]//Proceedings of the 23rd International Conference on Very Large Data Bases(VLDB'97).Morgan Kaufmann Publishers Inc.,1997:186-195. [12]MOON T K.The expectation-maximization algorithm [J].IEEE Signal Processing Magazine,1996,13(6):47-60. [13]DO C B,BATZOGLOU S.What is the expectation maximization algorithm? [J].Nature Biotechnology,2008,26(8):897-899. [14]KAUFFMANN J,ESDERS M,RUFF L,et al.From clustering to cluster explanations via neural networks [J].IEEE Transactions on Neural Networks and Learning Systems,2022,35(2):1926-1940. [15]REN Y,PU J,YANG Z,et al.Deep clustering:A comprehensive survey [J].arXiv:2210.04142,2022. [16]MAHDI M A,HOSNY K M,ELHENAWY I.Scalable clustering algorithms for big data:A review [J].IEEE Access,2021,9:80015-80027. [17]OTHMAN S M,BA-ALWI F M,ALSOHYBE N T,et al.Intrusion detection model using machine learning algorithm on Big Data environment [J].Journal of Big Data,2018,5(1):1-12. [18]LI J,IZAKIAN H,PEDRYCZ W,et al.Clustering-based anoma-ly detection in multivariate time series data [J].Applied Soft Computing,2021,100:106919. [19]LI G,LIU F,SHARMA A,et al.Research on the natural language recognition method based oncluster analysis using neural network [J].Mathematical Problems in Engineering,2021,2021(1):9982305. [20]CHOWDHURY G G.Natural language processing [J].Annual Review of Information Science and Technology,2003,37(1):51-89. [21]SILVERSTEIN C,MARAIS H,HENZINGER M,et al.Analysis of a very large web search engine query log [C]//Procee-dings of the ACM SIGIR Forum.ACM,1999:6-12. [22]ZHAO S,ZHU L,WANG X,et al.Centerclip:Token clustering for efficient text-video retrieval [C]//Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval.2022:970-981. [23]MENG Q,ZHAO S,HUANG Z,et al.Magface:A universalrepresentation for face recognition and quality assessment [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2021:14225-14234. [24]YANG L,CHEN D,ZHAN X,et al.Learning to cluster faces via confidence and connectivity estimation [C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition.2020:13369-13378. [25]SAMBASIVAM S,THEODOSOPOULOS N.Advanced dataclustering methods of mining Web documents [J].Issues in Informing Science & Information Technology,2006,3:563-579. [26]SUN J G,LIU J,ZHAO L Y.Research on Clustering Algorithms [J].Journal of Software,2008,19(1):48-61. [27]OYEWOLE G J,THOPIL G A.Data clustering:application and trends [J].Artificial Intelligence Review,2023,56(7):6439-6475. [28]HASHEMI S E,GHOLIAN-JOUYBARI F,HAJIAGHAEI-KESHTELI M.A fuzzy C-means algorithm for optimizing data clustering [J].Expert Systems with Applications,2023,227:120377. [29]HUANG Z.Extensions to the k-means algorithm for clustering large data sets with categorical values [J].Data Mining and Knowledge Discovery,1998,2(3):283-304. [30]BAHMANI B,MOSELEY B,VATTANI A,et al.Scalable K-Means+ [J].Proceedings of the VLDB Endowment,2012,5(7):622-633. [31]MAO Y M,GAN D J,MWAKAPESA D S,et al.A MapReduce-based K-means clustering algorithm [J].The Journal of Supercomputing,2022,78(4):5181-5202. [32]SARDAR T H,ANSARI Z.Distributed big data clustering using MapReduce-based fuzzy C-medoids [J].Journal of The Institution of Engineers(India):Series B,2022,103(1):73-82. [33]ZHANG T,RAMAKRISHNAN R,LIVNY M.BIRCH:an efficient data clustering method for very large databases [J].ACM Sigmod Record,1996,25(2):103-114. [34]RAMADHANI F,ZARLIS M,SUWILO S.Improvement of the BIRCH algorithm for big data clustering [C]//IOP Conference Series:Materials Science and Engineering.IOP Publishing,2020. [35]SCHUBERT E,SANDER J,ESTER M,et al.DBSCAN revisited,revisited:why and how you should(still) use DBSCAN [J].ACM Transactions on Database Systems(TODS),2017,42(3):1-21. [36]ANKERST M,BREUNIG M M,KRIEGEL H P,et al.OP-TICS:Ordering points to identify the clustering structure [J].ACM Sigmod Record,1999,28(2):49-60. [37]HINNEBURG A,KEIM D A.A general approach to clustering in large databases with noise [J].Knowledge and Information Systems,2003,5:387-415. [38]MU C,HOU Y,ZHAO J,et al.Stream-DBSCAN:A streaming distributed clustering model for water quality monitoring [J].Applied Sciences,2023,13(9):5408. [39]WU G,CAO L,TIAN H,et al.HY-DBSCAN:A hybrid parallel DBSCAN clustering algorithm scalable on distributed-memory computers [J].Journal of Parallel and Distributed Computing,2022,168:57-69. [40]XIE M,AREF W G.PD-DBSCAN:A density-based clusteringalgorithm for exploratory data analysis and data mining [J].Knowledge and Information Systems,2013,37(3):685-722. [41]KAUFFMANN J,ESDERS M,RUFF L,et al.From clustering to cluster explanations via neural networks [J].IEEE Transactions on Neural Networks and Learning Systems,2022,35(2):1926-1940. [42]LIU Y,TU W,ZHOU S,et al.Deep graph clustering via dual correlation reduction [C]//Proceedings of the AAAI Conference on Artificial Intelligence.2022:7603-7611. [43]COCHRAN W G.Sampling Techniques [M].John Wiley & Sons,1977. [44]LECUN Y,CORTES C,BURGES C.The MNIST database of handwritten digits [J].Neural Computation,1998,10(5):1191-1232. [45]HETTICH S,KEGELMEYER W P.The UCI KDD Archive.KDD Cup 1999 Data[EB/OL].https://archive.ics.uci.edu/ml/datasets/KDD+Cup+1999+Data. |
[1] | LIAO Qiucheng, ZHOU Yang, LIN Xinhua. Metrics and Tools for Evaluating the Deviation in Parallel Timing [J]. Computer Science, 2025, 52(5): 41-49. |
[2] | HUANG Chenxi, LI Jiahui, YAN Hui, ZHONG Ying, LU Yutong. Investigation on Load Balancing Strategies for Lattice Boltzmann Method with Local Grid Refinement [J]. Computer Science, 2025, 52(5): 101-108. |
[3] | XU He, ZHOU Tao, LI Peng, QIN Fangfang, JI Yimu. LU Parallel Decomposition Optimization Algorithm Based on Kunpeng Processor [J]. Computer Science, 2024, 51(9): 51-58. |
[4] | WANG Hancheng, DAI Haipeng, CHEN Zhipeng, CHEN Shusen, CHEN Guihai. Large-scale Network Community Detection Algorithm Based on MapReduce [J]. Computer Science, 2024, 51(4): 11-18. |
[5] | ZHONG Zhenyu, LIN Yongliang, WANG Haotian, LI Dongwen, SUN Yufei, ZHANG Yuzhi. Automatic Pipeline Parallel Training Framework for General-purpose Computing Devices [J]. Computer Science, 2024, 51(12): 129-136. |
[6] | LI Siyao, LI Shanglin, LUO Jingzhi. Parallel Computing of Reentry Vehicle Trajectory by Multiple Shooting Method Based onOPENMP [J]. Computer Science, 2024, 51(11A): 231000019-6. |
[7] | PENG Weidong, GUO Wei, WEI Lin. Reconfigurable Computing System for Parallel Implementation of SVM Training Based on FPGA [J]. Computer Science, 2024, 51(11A): 231100120-7. |
[8] | WANG Xiaozhong, ZHANG Zuyu. Multi Level Parallel Computing for SW26010 Discontinuous Galerkin Finite Element Algorithm [J]. Computer Science, 2024, 51(11A): 240700055-5. |
[9] | HE Weilong, SU Lingli, GUO Bingxuan, LI Maosen, HAO Yan. Research and Implementation of Dynamic Scene 3D Perception Technology Based on BinocularEstimation [J]. Computer Science, 2024, 51(11A): 240300045-8. |
[10] | ZHAI Xulun, ZHANG Yongguang, JIN Anzhao, QIANG Wei, LI Mengbing. Parallel DVB-RCS2 Turbo Decoding on Multi-core CPU [J]. Computer Science, 2023, 50(6): 22-28. |
[11] | HAN Qiqi, LIU Xin. Application of Air-Sea Coupled Mode in High-speed Interconnection Environment [J]. Computer Science, 2023, 50(11A): 221000136-5. |
[12] | DING Yue, XU Chuanfu, QIU Haozhong, DAI Weixi, WANG Qingsong, LIN Yongzhen, WANG Zhenghua. Study on Cross-platform Heterogeneous Parallel Computing for Lattice Boltzmann Multi-phase Flow Simulations Based on SYCL [J]. Computer Science, 2023, 50(11): 32-40. |
[13] | FENG Chen, GU Jingjing. Efficient Distributed Training Framework for Federated Learning [J]. Computer Science, 2023, 50(11): 317-326. |
[14] | CHEN Xin, LI Fang, DING Hai-xin, SUN Wei-ze, LIU Xin, CHEN De-xun, YE Yue-jin, HE Xiang. Parallel Optimization Method of Unstructured-grid Computing in CFD for DomesticHeterogeneous Many-core Architecture [J]. Computer Science, 2022, 49(6): 99-107. |
[15] | ZHU Ruo-chen, YANG Chang-chun, ZHANG Deng-hui. EGOS-DST:Efficient Schema-guided Approach to One-step Dialogue State Tracking for Diverse Expressions [J]. Computer Science, 2022, 49(11A): 210900246-7. |
|