Computer Science, 2025, Vol. 52, Issue (6A): 240500129-7. doi: 10.11896/jsjkx.240500129

• Image Processing & Multimedia Technology •

Flame Image Enhancement with Few Samples Based on Style Weight Modulation Technique

LI Mingjie, HU Yi, YI Zhengming

  1. State Key Laboratory of Refractories and Metallurgy, Wuhan University of Science and Technology, Wuhan 430081, China
  • Online: 2025-06-16 Published: 2025-06-12
  • Corresponding author: LI Mingjie (mingjie_li@wust.edu.cn)
  • About author: LI Mingjie, born in 1984, Ph.D. His main research interests include image processing, target recognition, image reconstruction algorithms, non-contact flame temperature measurement and algorithm research.
  • Supported by:
    2023 Innovation and Entrepreneurship Training Program for University Students of Hubei Province (S202310488054).

Abstract: Few-shot image generation can produce realistic and diverse images from only a scarce set of target samples, which makes it possible to build reliable datasets for downstream target recognition tasks. This paper proposes a few-shot generation model based on weight modulation that, given only three target images as input, produces images with the same content as the target samples but with diverse feature representations. Specifically, the encoder and decoder of the generator are carefully designed: a C2F structure with better gradient flow is used to build a pyramid network architecture, restoring as far as possible the original features of the image at different levels. A feature fusion method based on the attention mechanism is adopted, and feature style latent codes are introduced to control the quality of feature fusion. The style latent codes use a weight scaling strategy, which effectively eliminates generation artifacts and makes the generated images more realistic. In addition, an optimized feature length detection algorithm is used to measure the proximity of important information between the source and target domains, so that the prior knowledge obtained through pre-training in the source domain transfers better to the target domain. For the task of generating flame image samples, qualitative and quantitative comparison results are provided. The proposed model effectively improves flame target recognition performance with the YOLOv8 algorithm and substantially enhances the effect of data augmentation.

Key words: Few samples generation, Flame dataset, Feature fusion module, Transfer learning, Pre-training, Weight modulation, Target recognition, Style latent code
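
The weight scaling applied by the style latent code can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the code below assumes a StyleGAN2-style modulation and demodulation of convolution weights; the function name `modulate_weights`, the tensor shapes, and the example style values are all hypothetical and are not taken from the paper.

```python
import numpy as np

def modulate_weights(weight, style, demodulate=True, eps=1e-8):
    """Scale conv weights by a style vector (assumed StyleGAN2-style scheme).

    weight: array of shape (out_ch, in_ch, kh, kw)
    style:  array of shape (in_ch,), one scale per input channel
    """
    # Modulation: scale every input channel of the kernel by its style value.
    w = weight * style[None, :, None, None]
    if demodulate:
        # Demodulation: rescale each output filter to (approximately) unit
        # L2 norm, so the style changes relative channel importance without
        # inflating the overall activation magnitude.
        norm = np.sqrt(np.sum(w ** 2, axis=(1, 2, 3), keepdims=True) + eps)
        w = w / norm
    return w

# Hypothetical example: 4 output filters, 3 input channels, 3x3 kernels.
rng = np.random.default_rng(0)
weight = rng.standard_normal((4, 3, 3, 3))
style = np.array([0.5, 2.0, 10.0])  # illustrative style scales only

w_mod = modulate_weights(weight, style)
filter_norms = np.sqrt(np.sum(w_mod ** 2, axis=(1, 2, 3)))
```

In this sketch, normalizing the weights rather than the feature maps keeps the statistics of the output bounded regardless of how large the style scales are, which is the usual motivation given for this kind of scheme when suppressing generation artifacts.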

CLC Number: 

  • TP391.9