Computer Science ›› 2026, Vol. 53 ›› Issue (4): 260-268.doi: 10.11896/jsjkx.250700172

• Computer Graphics & Multimedia •

Unsupervised Infrared Image Generation Method Based on Dual Semantic Contrastive Learning

CHENG Zimeng1,2, YANG Xinyue1,2, AI Haojun1,2, WANG Zhongyuan3   

  1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
  2. Key Laboratory of Aerospace Information Security and Trusted Computing, Ministry of Education, Wuhan University, Wuhan 430072, China
  3. School of Computer Science, Wuhan University, Wuhan 430079, China
  • Received: 2025-07-25  Revised: 2025-09-22  Online: 2026-04-15  Published: 2026-04-08
  • About author: CHENG Zimeng, born in 2002, postgraduate, is a member of CCF (No.02769G). Her main research interests include computer vision and image-to-image translation.
    AI Haojun, born in 1972, Ph.D., associate professor, is a senior member of CCF (No.06059S). His main research interests include computer vision, artificial intelligence and deepfake detection.
  • Supported by:
    Hubei Province International Science and Technology Collaboration Program (2025EHA043).

Abstract: Infrared images are widely used in computer vision, but high-quality infrared image datasets remain limited in scale because acquisition conditions are restrictive. Converting visible-light datasets into infrared datasets has therefore become an effective remedy. Existing generation methods generally rely on supervised learning, which requires large amounts of paired data that are extremely difficult to obtain in practice. This paper proposes an unsupervised infrared image generation method named DSCGAN. The method adopts a bidirectional translation architecture and introduces dual semantic contrastive learning to strengthen both the preservation of image content and the learning of discriminative infrared features. A geometric consistency loss is introduced to effectively preserve the original structure and details of visible images, and a multi-scale PatchGAN discriminator is constructed to improve discriminative capability and enhance the realism of the generated images. Experimental results on the AVIID-1, AVIID-2, and Day-DroneVehicle datasets show that DSCGAN outperforms the comparison methods on several metrics, and the generated infrared images exhibit a more plausible thermal radiation distribution and better visual quality. On AVIID-1, the SSIM value increases to 0.8144 and the FID score decreases to 0.1456; on Day-DroneVehicle, the PSNR value improves to 18.14 while the LPIPS value drops to 0.2949. This study offers a new approach to unsupervised infrared image generation, with potential applications in infrared target detection, infrared scene segmentation, and other downstream tasks.
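The abstract names two of the method's training signals: a geometric consistency loss that keeps the visible image's structure, and patch-wise semantic contrastive learning. The sketch below is illustrative only, not the authors' implementation; the function names, the 90-degree rotation as the geometric transform, and the NumPy toy setting are all assumptions, but they show the general shape of a GcGAN-style consistency term and an InfoNCE-style patch contrastive term.

```python
# Illustrative sketch (not the DSCGAN code): the two training signals the
# abstract describes, written with NumPy on toy arrays.
import numpy as np

def geometric_consistency_loss(G, x):
    """GcGAN-style term: generating from a rotated input, then undoing the
    rotation, should match generating from the original input; the L1 gap
    penalizes structural drift in the translation G."""
    rot = lambda img: np.rot90(img, k=1, axes=(0, 1))    # chosen transform
    inv = lambda img: np.rot90(img, k=-1, axes=(0, 1))   # its inverse
    return np.abs(G(x) - inv(G(rot(x)))).mean()

def patch_nce_loss(query, positive, negatives, tau=0.07):
    """Patch-wise contrastive (InfoNCE) term: pull a generated patch's
    feature toward the feature of the corresponding visible patch, push it
    away from features of other patches."""
    q = query / np.linalg.norm(query)
    keys = np.stack([positive] + list(negatives))
    keys = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = keys @ q / tau            # similarity to positive + negatives
    logits -= logits.max()             # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[0])           # positive key sits at index 0

# Toy check: a pixel-wise generator commutes with rotation, so the
# geometric term vanishes exactly.
x = np.random.rand(8, 8, 3)
print(geometric_consistency_loss(lambda img: 0.5 * img, x))  # 0.0
```

In the real model the "features" would come from encoder layers of the generator and the geometric transform would be applied to images during training; the contrastive term is minimized when each translated patch is most similar to its own source patch.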

Key words: Image-to-image translation, Semantic contrastive learning, Infrared image generation, Multi-scale discriminator, Geometric consistency constraint

CLC Number: TP391.41