Computer Science, 2021, Vol. 48, Issue (3): 124-129. doi: 10.11896/jsjkx.200700078

• Database & Big Data & Data Science •

Smooth Representation-based Semi-supervised Classification

WANG Xing1, KANG Zhao2

  1. 1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
    2 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China
  • Received: 2020-07-11 Revised: 2020-10-01 Online: 2021-03-15 Published: 2021-03-05
  • Corresponding author: KANG Zhao (zkang@uestc.edu.cn)
  • About author: WANG Xing (wangxing1027@gmail.com), born in 2000, undergraduate. Her main research interests include machine learning, data mining and deep learning.
    KANG Zhao, born in 1983, Ph.D, associate professor, is a member of China Computer Federation. His main research interests include machine learning and pattern recognition.
  • Supported by:
    National Natural Science Foundation of China (61806045).

Abstract: Graph-based semi-supervised classification has been an active research topic in machine learning and data mining in recent years. Such methods typically construct a graph to uncover the information hidden in the data, and then use the structure of that graph to predict labels for unlabeled samples. Their performance therefore depends heavily on the quality of the graph. In this work, we propose a semi-supervised classification algorithm based on smooth representation. Specifically, a low-pass filter is applied to the data to obtain a smooth representation, which is then used for semi-supervised classification. Furthermore, graph construction and label propagation are integrated into a unified optimization framework so that they improve each other, avoiding the sub-optimal solutions caused by a low-quality graph. Extensive experiments on face and object data sets show that the proposed SRSSC algorithm outperforms other state-of-the-art methods in most cases, which demonstrates the importance of smooth representation.

Key words: Graph-based method, Representation learning, Semi-supervised classification, Signal filtering, Similarity measure
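The abstract names two ingredients, low-pass filtering of the data and graph-based label propagation, without giving their exact form. The following is a minimal two-stage sketch of that idea and not the SRSSC algorithm itself, which optimizes graph construction and label propagation jointly. Everything specific here is an assumption made for illustration: the Gaussian-kernel graph, the filter `(I - L/2)^k` on the normalized Laplacian `L`, the closed-form propagation `F = (I - alpha*S)^{-1} Y`, and all function names and parameter values (`sigma`, `k`, `alpha`).

```python
import numpy as np


def gaussian_affinity(X, sigma=1.0):
    """Dense Gaussian-kernel affinity with a zero diagonal (assumed graph choice)."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def normalized_adjacency(W):
    """Symmetrically normalized adjacency S = D^{-1/2} W D^{-1/2}."""
    deg = W.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    return W * inv_sqrt[:, None] * inv_sqrt[None, :]


def low_pass_filter(X, W, k=2):
    """Apply (I - L/2)^k to X, where L = I - S is the normalized Laplacian."""
    n = W.shape[0]
    H = 0.5 * (np.eye(n) + normalized_adjacency(W))  # equals I - L/2
    X_smooth = X.copy()
    for _ in range(k):
        X_smooth = H @ X_smooth
    return X_smooth


def propagate_labels(W, y, n_classes, alpha=0.9):
    """Closed-form propagation F = (I - alpha*S)^{-1} Y; y == -1 marks unlabeled."""
    n = W.shape[0]
    Y = np.zeros((n, n_classes))
    labeled = y >= 0
    Y[labeled, y[labeled]] = 1.0
    F = np.linalg.solve(np.eye(n) - alpha * normalized_adjacency(W), Y)
    return F.argmax(axis=1)


def smooth_then_classify(X, y, n_classes, k=2, sigma=1.0, alpha=0.9):
    """Two-stage sketch: filter the features, rebuild the graph, propagate labels."""
    W_raw = gaussian_affinity(X, sigma)
    X_smooth = low_pass_filter(X, W_raw, k)
    W_smooth = gaussian_affinity(X_smooth, sigma)
    return propagate_labels(W_smooth, y, n_classes, alpha)


if __name__ == "__main__":
    # Toy data: two well-separated blobs with one labeled sample per class.
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(3.0, 0.3, (20, 2))])
    y = np.full(40, -1)
    y[0], y[20] = 0, 1
    print(smooth_then_classify(X, y, n_classes=2))
```

The filter averages each sample with its graph neighbors, which damps high-frequency (noisy) components before the second graph is built. A joint formulation such as the one the abstract describes would instead update the graph and the label matrix together rather than fixing the graph first.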

CLC Number: 

  • TP181