Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240500129-7. doi: 10.11896/jsjkx.240500129

• Image Processing & Multimedia Technology •

Flame Image Enhancement with Few Samples Based on Style Weight Modulation Technique

LI Mingjie, HU Yi, YI Zhengming   

  1. State Key Laboratory of Refractories and Metallurgy, Wuhan University of Science and Technology, Wuhan 430081, China
  • Online: 2025-06-16 Published: 2025-06-12
  • About author: LI Mingjie, born in 1984, Ph.D. His main research interests include image processing, target recognition, image reconstruction algorithms, non-contact flame temperature measurement and algorithm research.
  • Supported by:
    2023 Innovation and Entrepreneurship Training Program for University Students of Hubei Province (S202310488054).

Abstract: Few-sample generation relies on only a handful of target samples to generate synthetic images that are both realistic and diverse, which can be used to build reliable datasets for downstream target recognition tasks. In this paper, we propose a few-sample generation model based on weight modulation that, given only three target images as input, produces images with the same content as the target samples but diverse feature representations. Specifically, we design the encoder and decoder of the generator around a C2F structure, which offers better gradient flow, to build a pyramid network architecture that restores as much of the image's original features as possible at different levels. We adopt an attention-based feature fusion method and introduce feature style latent codes to control the quality of feature fusion. The style latent code uses a weight scaling strategy, which effectively eliminates generation artifacts and makes the generated images more realistic. We also use an optimized feature length detection algorithm to measure the proximity of important information in the source and target domains, which enables the model to better transfer the prior information obtained through pre-training in the source domain to the target domain. For the task of generating flame image samples, we provide qualitative and quantitative comparison results. The proposed model effectively improves flame target recognition performance under the YOLOv8 algorithm and substantially enhances the effect of data augmentation.
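
The abstract does not give the formula behind the weight scaling strategy. As a rough illustration only, the sketch below assumes the modulation/demodulation scheme popularized by StyleGAN2: each input channel of a convolution filter is scaled by a style value, and the filter is then renormalized to unit L2 norm so that large style values cannot inflate activations into artifacts. The function names, the toy weights, and the `eps` value are hypothetical and are not taken from the paper.

```python
import math


def modulate_demodulate(weight, style, eps=1e-8):
    """Scale filter weights by a per-input-channel style, then renormalize.

    weight: nested list; weight[o][i] holds the kernel taps that connect
            input channel i to output channel o (assumed layout).
    style:  one scale per input channel, assumed to be derived from a
            style latent code.
    """
    result = []
    for filt in weight:
        # Modulation: multiply every tap of input channel i by style[i].
        mod = [[style[i] * w for w in taps] for i, taps in enumerate(filt)]
        # Demodulation: divide the whole filter by its L2 norm.
        norm = math.sqrt(sum(w * w for taps in mod for w in taps) + eps)
        result.append([[w / norm for w in taps] for taps in mod])
    return result


def filter_norm(filt):
    """L2 norm over all taps of one output filter."""
    return math.sqrt(sum(w * w for taps in filt for w in taps))


# Hypothetical toy layer: 2 output channels, 2 input channels, 2 taps each.
weight = [
    [[1.0, 2.0], [0.5, -1.0]],
    [[0.0, 3.0], [4.0, 0.0]],
]
style = [10.0, 0.1]  # deliberately unbalanced style scales

scaled = modulate_demodulate(weight, style)
norms = [filter_norm(f) for f in scaled]
rescaled = modulate_demodulate(weight, [2.0 * s for s in style])
```

In this toy case the two style scales differ by a factor of 100, yet every output filter ends up with approximately unit norm, and doubling all style values leaves the result essentially unchanged. Under the stated assumption, the style therefore controls the relative emphasis of input channels without changing the overall activation magnitude.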

Key words: Few samples generation, Flame dataset, Feature fusion module, Transfer learning, Pre-training, Weight modulation, Target recognition, Style latent code

CLC Number: 

  • TP391.9