Computer Science, 2024, Vol. 51, Issue (2): 259-267. doi: 10.11896/jsjkx.221100136

• Artificial Intelligence •

  • Corresponding author: CHAI Jing (jingchai@aliyun.com)
  • About author: DAI Wei (daiwei@mail.ynu.edu.cn)

Semi-supervised Learning Algorithm Based on Maximum Margin and Manifold Hypothesis

DAI Wei, CHAI Jing, LIU Yajiao   

  1. School of Information Science and Engineering,Yunnan University,Kunming 650500,China
  • Received: 2022-11-15  Revised: 2023-04-03  Online: 2024-02-15  Published: 2024-02-22
  • About author: DAI Wei, born in 1994, postgraduate. His main research interests include machine learning and weakly supervised learning. CHAI Jing, born in 1983, Ph.D, associate professor. His main research interests include machine learning and weakly supervised learning.
  • Supported by:
    National Natural Science Foundation of China(62166046) and Open Project Program of Yunnan Key Laboratory of Intelligent Systems and Computing(ISC23Y01).



Abstract: Semi-supervised learning is a weakly supervised learning paradigm between supervised and unsupervised learning. It combines a small number of labeled instances with a large number of unlabeled instances to build a model, aiming for higher accuracy than supervised learning that uses the labeled instances alone. Under this paradigm, this paper proposes a semi-supervised learning algorithm that combines the maximum margin criterion with the manifold hypothesis on the instance space. The algorithm uses the manifold structure of the instances to estimate the labeling confidence of unlabeled instances while using the maximum margin criterion to derive the classification model, and adopts alternating optimization to solve the quadratic programming problems for the model parameters and the labeling confidence in an iterative manner. On 12 UCI datasets and 4 datasets generated from the MNIST database of handwritten digits, the proposed algorithm outperforms the comparison algorithms in 60.5% of the configurations under semi-supervised transductive learning, and in 42.6% of the configurations under semi-supervised inductive learning.
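The combination described in the abstract, a maximum-margin classifier coupled with a manifold-based labeling-confidence term, can be written schematically as follows. This is a hedged reconstruction in our own notation; the paper's exact objective, variables, and constraints may differ:

```latex
\min_{\mathbf{w},\,b,\,\mathbf{f}}\;
  \frac{1}{2}\lVert \mathbf{w} \rVert^{2}
  \;+\; C_{l} \sum_{i=1}^{l} \max\!\bigl(0,\; 1 - y_{i}\,(\mathbf{w}^{\top}\mathbf{x}_{i} + b)\bigr)
  \;+\; C_{u} \sum_{j=l+1}^{l+u} f_{j}\, \max\!\bigl(0,\; 1 - \hat{y}_{j}\,(\mathbf{w}^{\top}\mathbf{x}_{j} + b)\bigr)
  \;+\; \gamma\, \mathbf{f}^{\top} \mathbf{L}\, \mathbf{f}
```

Here the first two terms form a standard soft-margin SVM on the l labeled instances; f_j is the labeling confidence of unlabeled instance j with pseudo-label ŷ_j; and L is the graph Laplacian of a neighborhood graph over all instances, so the last term encodes the manifold assumption that confidences vary smoothly along the data manifold. Alternating optimization fixes f to solve a quadratic program for (w, b), then fixes (w, b) to update f.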

Key words: Semi-supervised learning, Maximum margin, Manifold hypothesis, Labeling confidence, Support vector machine
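The alternating procedure described in the abstract can be illustrated with a minimal, self-contained sketch. All function names, the label-propagation-style confidence estimator, and the subgradient SVM solver below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def rbf_affinity(X, sigma=1.0):
    """Pairwise RBF similarities approximating the manifold structure."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W

def propagate_confidence(W, y_l, n_labeled, alpha=0.9, n_iter=50):
    """Graph-based confidence estimate in [-1, 1]; labeled points are clamped."""
    D = W.sum(axis=1)
    S = W / np.sqrt(np.outer(D, D) + 1e-12)       # symmetric normalization
    f = np.zeros(W.shape[0])
    f[:n_labeled] = y_l
    y0 = f.copy()
    for _ in range(n_iter):
        f = alpha * (S @ f) + (1.0 - alpha) * y0
        f[:n_labeled] = y_l                       # clamp true labels
    return np.clip(f, -1.0, 1.0)

def train_linear_svm(X, y, weight, lam=1e-2, lr=0.1, epochs=300):
    """Confidence-weighted linear max-margin fit via hinge-loss subgradients."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        active = y * (X @ w + b) < 1.0            # margin violators
        gw = lam * w - (weight[active, None] * y[active, None] * X[active]).sum(axis=0) / n
        gb = -(weight[active] * y[active]).sum() / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def semi_supervised_fit(X_l, y_l, X_u, rounds=5):
    """Alternate manifold-based confidence estimation with margin refitting."""
    X = np.vstack([X_l, X_u])
    nl = len(y_l)
    f = propagate_confidence(rbf_affinity(X), y_l, nl)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(rounds):
        y_hat = np.where(f >= 0.0, 1.0, -1.0)     # current pseudo-labels
        conf = np.abs(f)
        conf[:nl] = 1.0                           # trust true labels fully
        w, b = train_linear_svm(X, y_hat, conf)
        f = 0.5 * f + 0.5 * np.tanh(X @ w + b)    # blend classifier scores back in
        f[:nl] = y_l
    return w, b
```

For example, on two well-separated Gaussian clusters with one labeled point each, the learned (w, b) typically labels the unlabeled points correctly; low-confidence pseudo-labels contribute little to the margin fit, which is the role the labeling confidence plays in the paper's formulation.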

CLC Number: TP181