Computer Science ›› 2025, Vol. 52 ›› Issue (6): 256-263. doi: 10.11896/jsjkx.240600123

• Computer Graphics & Multimedia •

Saliency Mask Mixup for Few-shot Image Classification

CHEN Yadang¹, GAO Yuxuan¹, LU Chuhan¹, CHE Xun²

  ¹ School of Computer Science, Nanjing University of Information Science and Technology, Nanjing 210044, China
  ² School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2024-06-20 Revised: 2024-11-21 Online: 2025-06-15 Published: 2025-06-11
  • Corresponding author: CHE Xun (chexun@njust.edu.cn)
  • Author contact: adamchen@nuist.edu.cn
  • About author: CHEN Yadang, born in 1985, Ph.D., associate professor. His main research interests include video segmentation, video enhancement, video editing and augmented reality.
    CHE Xun, born in 1985, doctoral student, professor, is a member of CCF (No. J8696S). His main research interest is model robustness and security.
  • Supported by:
    National Natural Science Foundation of China (62473201, 62477026), Jiangsu Province Key R&D Program Industry Outlook and Key Core Technology Projects (BE2022161), and Wuxi Industrial Innovation Research Institute Pioneering Technology Pre-Research Project.

Abstract: Few-shot image classification addresses the poor performance of conventional image classification when data are scarce. The challenge lies in using the few labeled samples available to estimate the true feature distribution. Some recent methods adopt data augmentation techniques such as random masking or mixed interpolation to improve the diversity and generalization of the labeled samples. However, two problems remain: 1) random masking is uncontrolled, so the foreground may end up either completely masked or completely exposed, which can discard crucial information in a sample; 2) the data distribution after mixed interpolation tends to be overly uniform, so models struggle to distinguish between classes and to delineate category boundaries. To address these problems, this paper proposes a data augmentation method based on Saliency Mask Mixup. First, Mask Mix (M-Mix) and a Confident Clip Selector (CCS) adaptively select and retain the key feature information in images. Second, Saliency Fuse (SF) computes the importance of each image region to guide image fusion, which makes the resulting images more diverse and the category boundaries clearer. On several standard few-shot image classification datasets (miniImageNet, tieredImageNet, Few-shot CIFAR100, and Caltech-UCSD Birds-200), the proposed method outperforms state-of-the-art methods by approximately 0.2% to 1%, indicating its potential for few-shot image classification.

Key words: Few-shot learning, Image classification, Contrastive learning, Data mixing, Data augmentation, Saliency map
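
The general idea of letting region importance guide image mixing, as described in the abstract, can be illustrated with a minimal NumPy sketch. This is not the authors' M-Mix, CCS, or SF implementation, whose details are not given on this page. It assumes gradient magnitude as a stand-in saliency measure, a fixed patch grid, and an area-weighted soft label; the function names `gradient_saliency`, `patch_scores`, and `saliency_mask_mix` and the parameters `patch` and `keep_ratio` are hypothetical.

```python
import numpy as np


def gradient_saliency(img: np.ndarray) -> np.ndarray:
    """Stand-in saliency (assumption): gradient magnitude of the grayscale image."""
    gray = img.mean(axis=-1)
    gy, gx = np.gradient(gray)  # derivatives along rows, then columns
    return np.hypot(gx, gy)


def patch_scores(saliency: np.ndarray, patch: int) -> np.ndarray:
    """Mean saliency inside each non-overlapping patch x patch cell."""
    h, w = saliency.shape
    cells = saliency.reshape(h // patch, patch, w // patch, patch)
    return cells.mean(axis=(1, 3))


def saliency_mask_mix(img_a, img_b, label_a, label_b, patch=8, keep_ratio=0.5):
    """Keep the most salient patches of img_a and fill the rest from img_b.

    Returns the mixed image, the area-weighted soft label, and the pixel mask.
    Images are (H, W, C) arrays of equal shape; labels are one-hot vectors.
    """
    h, w = img_a.shape[:2]
    assert img_a.shape == img_b.shape, "images must have the same shape"
    assert h % patch == 0 and w % patch == 0, "H and W must be multiples of patch"

    scores = patch_scores(gradient_saliency(img_a), patch)
    k = max(1, int(round(keep_ratio * scores.size)))

    # Mark the k highest-scoring patches of img_a on the patch grid.
    top = np.argsort(scores.ravel())[::-1][:k]
    grid = np.zeros(scores.size, dtype=bool)
    grid[top] = True
    grid = grid.reshape(scores.shape)

    # Upsample the patch grid to a per-pixel binary mask.
    mask = np.repeat(np.repeat(grid, patch, axis=0), patch, axis=1)

    mixed = np.where(mask[..., None], img_a, img_b)
    lam = mask.mean()  # fraction of pixels taken from img_a
    label = lam * np.asarray(label_a, float) + (1.0 - lam) * np.asarray(label_b, float)
    return mixed, label, mask


if __name__ == "__main__":
    # Toy example: a bright square on a dark background mixed with a flat image.
    a = np.zeros((32, 32, 3))
    a[8:24, 8:24] = 1.0
    b = np.full((32, 32, 3), 0.5)
    mixed, label, mask = saliency_mask_mix(a, b, [1, 0], [0, 1], patch=8, keep_ratio=0.25)
    print(label, int(mask.sum()))
```

Unlike a randomly placed mask, the kept region here follows the image content, so the foreground of the first image is less likely to be discarded. Weighting the label by kept area mirrors CutMix-style label mixing; a saliency-weighted label would be another reasonable choice.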

CLC Number: TP391