Computer Science ›› 2024, Vol. 51 ›› Issue (11A): 240100095-7. doi: 10.11896/jsjkx.240100095

• Big Data & Data Science •

Online and Offline Multi-source Heterogeneous Data Fusion System for Recycling Information

QIU Mingxin, LEI Shuai, LIU Xianhui, ZHANG Yingyao   

  1. School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
  • Online: 2024-11-16 Published: 2024-11-13
  • About author: QIU Mingxin, born in 2000, postgraduate. His main research interests include machine learning and big data.
    ZHANG Yinyao, born in 1984, Ph.D., associate professor. Her main research interests include machine learning and big data.
  • Supported by:
    Key Research and Development Program of China(2022YFB3305802).

Abstract: In the resource recycling industry, the recycling of waste products involves multiple collaborating systems and therefore generates a large volume of multi-source heterogeneous data. To address the difficulty of fusing and effectively using online and offline recycling information for waste products, an online and offline multi-source heterogeneous data fusion system for recycling information is proposed. First, the system uses a Web API to ingest online and offline multi-source heterogeneous data and preprocesses it through data parsing, data cleaning, and data conversion. Second, because existing fusion methods based on cluster analysis usually require the number of clusters to be specified in advance, a fusion method based on multi-objective clustering is proposed that determines the number of clusters automatically during fusion. The preprocessed data undergoes feature selection, label encoding, data conversion, and normalization; the multi-objective clustering algorithm then performs feature extraction and clustering of typical data, and the full and incremental data are matched on the basis of Euclidean distance. Finally, the system uses a distributed database scheme based on MyCat middleware and MySQL master-slave replication to store, share, and exchange the fused data. Tests show that the system can fuse, share, and exchange online and offline multi-source heterogeneous recycling information for waste products. Moreover, compared with the K-Means-based method, the proposed fusion method automatically determines the optimal number of clusters on different data sets and achieves compactness and separation no worse than those of the K-Means fusion method.
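
The normalization and Euclidean-distance matching steps mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fit_bounds`, `normalize`, `match_to_centroids`), the choice of min-max scaling, and the sample values are all assumptions made for illustration, and the cluster centroids are taken as given rather than produced by the multi-objective clustering algorithm.

```python
import math

def fit_bounds(rows):
    """Record the (min, max) of each numeric column of the full data set."""
    return [(min(col), max(col)) for col in zip(*rows)]

def normalize(rows, bounds):
    """Min-max scale rows using previously fitted bounds (assumed scheme)."""
    return [
        # A constant column has zero span; divide by 1.0 instead to avoid an error.
        [(v - lo) / ((hi - lo) or 1.0) for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]

def match_to_centroids(records, centroids):
    """For each record, return the index of the nearest centroid (Euclidean)."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(r, centroids[k]))
        for r in records
    ]

# Hypothetical data: two numeric features per record.
full_data = [[0, 10], [10, 30]]
bounds = fit_bounds(full_data)

# Hypothetical centroids, assumed to come from an earlier clustering step.
centroids = [[0.0, 0.0], [1.0, 1.0]]

# Incremental records are scaled with the bounds fitted on the full data,
# then assigned to the closest existing cluster.
incremental = normalize([[2, 14], [9, 28]], bounds)
labels = match_to_centroids(incremental, centroids)
```

Reusing the bounds fitted on the full data keeps incremental records on the same scale as the centroids, which is what makes the distance comparison meaningful in this sketch.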

Key words: Clustering, Multi-objective optimization, Multi-source heterogeneous data, Data fusion

CLC Number: TP391