计算机科学 ›› 2025, Vol. 52 ›› Issue (6A): 240500129-7. DOI: 10.11896/jsjkx.240500129
李明杰, 胡羿, 易正明
LI Mingjie, HU Yi, YI Zhengming
Abstract: Few-shot image generation can produce realistic and diverse images from only a scarce, limited set of target samples, which makes it possible to build reliable datasets for downstream object recognition tasks. This work proposes a few-shot generation model based on weight modulation that, given only three target images as input, produces images sharing the content of the target samples while exhibiting diverse features. Specifically, the encoder and decoder of the generator are carefully designed: a coarse-to-fine (C2F) structure with better gradient flow is used to build a pyramid-shaped network architecture that restores the original image features at different levels as faithfully as possible. An attention-based feature fusion method is adopted, and a feature style latent code is introduced to control the quality of feature fusion. The style latent code applies a weight scaling strategy, which effectively removes generation artifacts and makes the generated images more realistic. In addition, an optimized feature-length probing algorithm is used to measure the proximity of important information between the source and target domains, allowing the prior knowledge obtained by pre-training in the source domain to transfer better to the target domain. Qualitative and quantitative comparisons are reported on the task of generating flame image samples; the proposed model measurably improves flame detection performance under the YOLOv8 algorithm and substantially enhances the effect of data augmentation.
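The abstract does not spell out how the style latent code modulates the generator weights. The following minimal PyTorch sketch illustrates the general idea of weight modulation with weight re-scaling (demodulation) in the spirit of StyleGAN2's modulated convolution; it is an illustrative assumption, not the authors' implementation, and names such as ModulatedConv2d and style_dim are hypothetical.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ModulatedConv2d(nn.Module):
    """Convolution whose weights are scaled per-sample by a style latent code (sketch)."""
    def __init__(self, in_ch, out_ch, kernel_size, style_dim, demodulate=True):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_ch, in_ch, kernel_size, kernel_size))
        self.affine = nn.Linear(style_dim, in_ch)   # style code -> per-channel scales
        self.demodulate = demodulate
        self.padding = kernel_size // 2

    def forward(self, x, style):
        b, c, h, w = x.shape
        scale = self.affine(style).view(b, 1, c, 1, 1)        # per-sample input-channel scales
        weight = self.weight.unsqueeze(0) * scale             # modulate weights with the style code
        if self.demodulate:
            # re-scale (normalize) the modulated weights; this is the step that
            # suppresses the characteristic generation artifacts
            d = torch.rsqrt(weight.pow(2).sum(dim=(2, 3, 4)) + 1e-8)
            weight = weight * d.view(b, -1, 1, 1, 1)
        # grouped-conv trick: fold the batch into groups so each sample uses its own weights
        weight = weight.view(-1, c, *self.weight.shape[2:])
        x = x.view(1, b * c, h, w)
        out = F.conv2d(x, weight, padding=self.padding, groups=b)
        return out.view(b, -1, h, w)

# Hypothetical usage:
# conv = ModulatedConv2d(in_ch=256, out_ch=256, kernel_size=3, style_dim=512)
# y = conv(torch.randn(4, 256, 32, 32), torch.randn(4, 512))   # -> (4, 256, 32, 32)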