Computer Science ›› 2025, Vol. 52 ›› Issue (4): 161-168. doi: 10.11896/jsjkx.240600008

• Database & Big Data & Data Science •

Semi-supervised Partial Multi-label Feature Selection

WU You1,2, WANG Jing1,2, LI Peipei1,2, HU Xuegang1,2,3   

  1 School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China
  2 Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, Hefei 230601, China
  3 Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei University of Technology, Hefei 230601, China
  • Received: 2024-05-30  Revised: 2024-08-21  Online: 2025-04-15  Published: 2025-04-14
  • About author: WU You, born in 2002, master's candidate. His main research interests include feature selection and multi-label learning.
    LI Peipei, born in 1982, Ph.D, professor, Ph.D supervisor. Her main research interests include data stream mining and knowledge engineering.
  • Supported by:
    National Natural Science Foundation of China (62376085, 62076085, 62120106008) and Research Funds of Center for Big Data and Population Health of IHM (JKS2023003).

Abstract: Multi-label feature selection reduces feature dimensionality by selecting a discriminative subset of features from the original feature space. Traditional methods, however, suffer when labeling accuracy degrades. In real data, each instance is annotated with a set of candidate labels that may contain noisy labels in addition to the relevant ones, which yields partial multi-label data. Existing multi-label feature selection algorithms typically assume that training samples are labeled accurately, or consider only missing labels. Furthermore, in large-scale, high-dimensional multi-label datasets, often only a small portion of the data is labeled. This paper therefore presents a new semi-supervised partial multi-label feature selection method. First, to address the partial multi-label issue, the method learns the true relationships between labels from the samples with known labels, and maintains the structural consistency between the feature space and the label space through manifold regularization. Second, to address the missing-label issue, it takes unlabeled data into account and enriches the label information with a label propagation algorithm. Finally, to handle high-dimensional features, it applies low-rank constraints to the mapping matrix to expose implicit connections between labels, and selects highly discriminative features by introducing an l2,1-norm constraint. Experimental results show that the method performs significantly better than existing semi-supervised multi-label feature selection methods.
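
As an illustration of the last step only, the sketch below shows how an l2,1-norm over a mapping matrix is commonly used to rank features. It is a minimal sketch and not the paper's algorithm: the matrix W (one row per feature, one column per label) and the helper names l21_norm and rank_features are assumptions introduced here, and the abstract does not state how W is optimized.

```python
import numpy as np


def l21_norm(W):
    """Return the l2,1 norm of W: the sum of the Euclidean norms of its rows."""
    return float(np.sqrt((W ** 2).sum(axis=1)).sum())


def rank_features(W, k):
    """Return the indices of the k features whose rows in W have the largest l2 norm.

    Assumes (hypothetically) that row i of W holds the weights that map
    feature i to every label, so a near-zero row marks a feature that
    contributes little to any label.
    """
    row_norms = np.linalg.norm(W, axis=1)
    # Stable sort on the negated norms gives descending order with
    # ties broken by the original feature index.
    order = np.argsort(-row_norms, kind="stable")
    return order[:k].tolist()
```

Penalizing the l2,1 norm pushes whole rows of W toward zero, which is why the surviving row norms can be read as feature importance scores; a feature whose row is exactly zero is dropped for all labels at once.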

Key words: Multi-label feature selection, Partial multi-label learning, Semi-supervised learning, Feature dimension reduction, Noisy labels

CLC Number: 

  • TP181