Computer Science (计算机科学), 2021, Vol. 48, Issue 3: 214-219. doi: 10.11896/jsjkx.191200103
储杰, 张正军, 汤鑫瑶, 黄振生
CHU Jie, ZHANG Zheng-jun, TANG Xin-yao, HUANG Zhen-sheng
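Abstract: Label propagation is one of the most widely used semi-supervised classification methods. The Consensus Rate-based Label Propagation (CRLP) algorithm constructs a graph from the consensus rate, which is obtained by aggregating multiple clustering methods so as to incorporate various properties of the data. However, like most graph-based semi-supervised classification methods, CRLP treats every labeled sample in the graph as equally important; such methods improve performance mainly by optimizing the graph structure. In practice, samples are not necessarily uniformly distributed, and different samples differ in their importance to the algorithm. Moreover, CRLP is sensitive to the number of clusters and the choice of clustering method, and it adapts poorly to low-dimensional data. To address these problems, this paper proposes a Label Propagation Algorithm Based on Weighted Samples and Consensus-Rate (WSCRLP). WSCRLP first clusters the data set multiple times to explore the structure of the samples, and constructs a graph by combining the consensus rate with local information about the samples; it then assigns different weights to labeled samples with different distributions; finally, it performs semi-supervised classification based on the constructed graph and the weighted samples. Experiments on real data sets show that the sample weighting and graph construction used by WSCRLP significantly improve classification accuracy, outperforming the compared methods in 84% of the experiments. Compared with CRLP, WSCRLP not only performs better but is also robust to its input parameters.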
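The pipeline described in the abstract (repeated clustering, a consensus-rate graph, weighted labeled samples, then propagation) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no formulas, so the sketch assumes that the consensus rate is the fraction of clusterings in which two samples share a cluster, that propagation follows the standard local-and-global-consistency iteration F ← αSF + (1 − α)Y, and that sample weights are supplied by the caller. The function names `consensus_rate` and `propagate` and the parameters `alpha` and `iters` are hypothetical, and the local-information term of the graph is omitted.

```python
def consensus_rate(clusterings):
    """Fraction of clusterings in which each pair of samples shares a cluster.

    `clusterings` is a list of label lists, one per clustering run.
    Cluster ids need not match across runs; only co-membership matters.
    """
    n = len(clusterings[0])
    m = len(clusterings)
    W = [[0.0] * n for _ in range(n)]
    for labels in clusterings:
        for i in range(n):
            for j in range(n):
                if i != j and labels[i] == labels[j]:
                    W[i][j] += 1.0 / m
    return W


def propagate(W, labels, weights, n_classes, alpha=0.9, iters=200):
    """Weighted label propagation on graph W (assumed update rule).

    labels:  dict {sample index: class id} for the labeled samples.
    weights: dict {sample index: positive weight}; missing entries default to 1.
    Returns one predicted class id per sample.
    """
    n = len(W)
    # Symmetric normalisation S = D^(-1/2) W D^(-1/2); isolated nodes get degree 1.
    deg = [sum(row) or 1.0 for row in W]
    S = [[W[i][j] / (deg[i] * deg[j]) ** 0.5 for j in range(n)] for i in range(n)]
    # Initial label matrix: a labeled sample contributes its weight, not a plain 1.
    Y = [[0.0] * n_classes for _ in range(n)]
    for i, c in labels.items():
        Y[i][c] = weights.get(i, 1.0)
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [
            [
                alpha * sum(S[i][k] * F[k][c] for k in range(n))
                + (1 - alpha) * Y[i][c]
                for c in range(n_classes)
            ]
            for i in range(n)
        ]
    return [max(range(n_classes), key=lambda c: F[i][c]) for i in range(n)]


if __name__ == "__main__":
    # Three clustering runs over six samples; run 2 uses swapped cluster ids.
    runs = [[0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0], [0, 0, 0, 1, 1, 1]]
    W = consensus_rate(runs)
    print(propagate(W, labels={0: 0, 5: 1}, weights={}, n_classes=2))
```

In this sketch the weights only scale the initial label matrix, so a heavily weighted labeled sample pushes its class further through the graph. How WSCRLP actually derives the weights from the sample distribution is not stated in the abstract and is left to the caller here.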