Computer Science ›› 2022, Vol. 49 ›› Issue (3): 288-293.doi: 10.11896/jsjkx.210100156

• Artificial Intelligence •

Semi-supervised Learning Method Based on Automated Mixed Sample Data Augmentation Techniques

XU Hua-jie1,2, CHEN Yu1, YANG Yang1, QIN Yuan-zhuo3   

  1. 1 College of Computer and Electronic Information,Guangxi University,Nanning 530004,China
    2 Guangxi Key Laboratory of Multimedia Communications and Network Technology,Nanning 530004,China
    3 College of Civil Engineering and Architecture,Guangxi University,Nanning 530004,China
  • Received: 2021-01-20 Revised: 2021-07-07 Online: 2022-03-15 Published: 2022-03-15
  • About author:XU Hua-jie,born in 1974,Ph.D,associate professor,is a member of China Computer Federation.His main research interests include artificial intelligence,acoustic signal recognition and computer vision.
    QIN Yuan-zhuo,born in 1996,doctoral candidate.His main research interests include artificial intelligence and computer vision and their applications in engineering.
  • Supported by:
    Science and Technology Plan Project of Guangxi Zhuang Autonomous Region(2017AB15008) and Science and Technology Plan Project of Chongzuo(FB2018001).

Abstract: Consistency-based semi-supervised learning methods typically use simple data augmentation to obtain consistent predictions for an original input and its perturbed versions. The effectiveness of this approach is difficult to guarantee when the proportion of labeled data is low. Extending advanced data augmentation methods from supervised learning to the semi-supervised setting is one way to address this problem. Building on the consistency-based semi-supervised learning method MixMatch, this paper proposes AutoMixMatch, a semi-supervised learning method based on automated mixed-sample data augmentation: a modified automated data augmentation technique is applied in the data augmentation phase, and a mixed-sample algorithm is proposed for the sample-mixing phase to improve the utilization of unlabeled samples. The performance of the proposed method is evaluated through image classification experiments. On image classification benchmark datasets, it outperforms several mainstream semi-supervised classification methods under three labeled-sample proportions, which validates its effectiveness. The method is especially effective when labeled data make up a very small fraction of the training data (only 0.05%): its classification error rate on the SVHN dataset is 30.17% lower than that of MixMatch.
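The paper's own mixing algorithm is not reproduced on this page. As an illustrative sketch only, the following shows the mixup operation that MixMatch-style methods apply in the sample-mixing phase, including MixMatch's clamping of the mixing coefficient so that the mixed sample stays dominated by its first input. Function and variable names here are ours, not from the paper.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.75, rng=None):
    """Mix two samples with a Beta(alpha, alpha) coefficient.

    Following MixMatch, lam is replaced by max(lam, 1 - lam) so the
    mixed example remains closer to (x1, y1) than to (x2, y2).
    """
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)  # keep (x1, y1) as the dominant component
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y

# Toy example: mix a labeled sample with an unlabeled sample whose
# soft pseudo-label was produced earlier in the pipeline (illustrative values).
x1 = np.ones((2, 2))          # stand-in for a labeled image
y1 = np.array([1.0, 0.0])     # its one-hot label
x2 = np.zeros((2, 2))         # stand-in for an augmented unlabeled image
y2 = np.array([0.4, 0.6])     # its sharpened pseudo-label
x_mixed, y_mixed = mixup(x1, y1, x2, y2)
```

Because the coefficient is clamped to at least 0.5, each mixed pixel of `x_mixed` lies in [0.5, 1.0] in this toy setup, and the mixed label `y_mixed` still sums to 1.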

Key words: Automated data augmentation, Consistency, Image classification, Mixed sample, Semi-supervised learning

CLC Number: TP391
[1]CHAPELLE O,SCHOLKOPF B,ZIEN A.Semi-Supervised Learning (Chapelle,O.et al.,Eds.;2006)[Book Reviews][J].IEEE Transactions on Neural Networks,2009,20(3):542.
[2]LAINE S,AILA T.Temporal Ensembling for Semi-Supervised Learning[C]//Proceedings of the International Conference on Learning Representations (ICLR).2017.
[3]TARVAINEN A,VALPOLA H.Mean teachers are better role models:Weight-averaged consistency targets improve semi-supervised deep learning results[C]//Advances in Neural Information Processing Systems.2017:1195-1204.
[4]VERMA V,LAMB A,KANNALA J,et al.Interpolation Consistency Training for Semi-supervised Learning[C]//Proceedings of the 28th International Joint Conference on Artificial Intelligence.AAAI Press,2019:3635-3641.
[5]ZHANG H,CISSE M,DAUPHIN Y N,et al.mixup:Beyond Empirical Risk Minimization[J].arXiv:1710.09412,2017.
[6]XIE Q,DAI Z,HOVY E,et al.Unsupervised Data Augmentation for Consistency Training[J].arXiv:1904.12848,2019.
[7]CUBUK E D,ZOPH B,SHLENS J,et al.RandAugment:Practical data augmentation with no separate search[J].arXiv:1909.13719,2019.
[8]BERTHELOT D,CARLINI N,GOODFELLOW I,et al.MixMatch:A Holistic Approach to Semi-Supervised Learning[J].arXiv:1905.02249,2019.
[9]CUBUK E D,ZOPH B,MANE D,et al.AutoAugment:Learning Augmentation Strategies From Data[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.2019:113-123.
[10]YUN S,HAN D,OH S J,et al.CutMix:Regularization Strategy to Train Strong Classifiers with Localizable Features[C]//International Conference on Computer Vision (ICCV).2019.
[11]QIN Y,DING S F.Survey of Semi-supervised Clustering[J].Computer Science,2019,46(9):15-21.
[12]ATHIWARATKUN B,FINZI M,IZMAILOV P,et al.There Are Many Consistent Explanations of Unlabeled Data:Why You Should Average[C]//Proceedings of the International Conference on Learning Representations (ICLR).2019.
[13]IZMAILOV P,PODOPRIKHIN D,GARIPOV T,et al.Averaging weights leads to wider optima and better generalization[J].arXiv:1803.05407,2018.
[14]MIYATO T,MAEDA S,KOYAMA M,et al.Virtual adversarial training:a regularization method for supervised and semi-supervised learning[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2018,41(8):1979-1993.
[15]FRENCH G,AILA T,LAINE S,et al.Semi-supervised semantic segmentation needs strong,high-dimensional perturbations[J].arXiv:1906.01916,2019.