Computer Science, 2026, Vol. 53, Issue (4): 260-268. doi: 10.11896/jsjkx.250700172
程梓萌1,2, 杨馨悦1,2, 艾浩军1,2, 王中元3
CHENG Zimeng1,2, YANG Xinyue1,2, AI Haojun1,2, WANG Zhongyuan3
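Abstract: Infrared images are widely used in computer vision. Constrained by acquisition conditions, high-quality infrared image datasets remain small. Translating visible-light images into infrared images is an effective way to expand infrared datasets. Most existing generation methods rely on supervised learning and require large amounts of paired data. To address this, an unsupervised infrared image generation method based on dual semantic contrastive learning, DSCGAN, is proposed. The method adopts a bidirectional translation architecture and uses semantic contrastive learning to strengthen both content preservation and infrared feature learning. A geometric consistency loss is added to the loss function to help preserve the original structure and details of the visible-light image. In addition, a multi-scale PatchGAN discriminator is constructed to strengthen discrimination and improve the realism of the generated images. Experiments on the AVIID-1, AVIID-2, and Day-DroneVehicle datasets show that DSCGAN outperforms the compared methods on multiple metrics, and that the generated infrared images have a more plausible thermal radiation distribution and better visual quality. On the AVIID-1 dataset, DSCGAN raises SSIM to 0.8144 and lowers FID to 0.1456. On the Day-DroneVehicle dataset, DSCGAN raises PSNR to 18.14 and lowers LPIPS to 0.2949. The proposed method offers a new approach to unsupervised infrared image generation and can be further applied to downstream tasks such as infrared object detection and scene segmentation.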
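The abstract names two loss ingredients, semantic contrastive learning and a geometric consistency loss, without giving their formulas. The following is a minimal sketch of the general idea, not the DSCGAN implementation. It assumes the contrastive term takes the standard InfoNCE form and that the geometric term compares "translate then rotate" with "rotate then translate" under an L1 distance. The names `info_nce`, `rot90`, `geometric_consistency_l1`, the temperature `tau=0.07`, and the toy generators are all hypothetical and chosen for illustration only.

```python
import math


def info_nce(query, positive, negatives, tau=0.07):
    """InfoNCE loss for one query feature (assumed form, not taken from the paper).

    The loss is small when the query is closer (cosine similarity) to its
    positive than to every negative. tau is an assumed temperature value.
    """
    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        return [x / norm for x in v]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    q = normalise(query)
    # Index 0 holds the positive logit; the rest are negatives.
    logits = [dot(q, normalise(positive)) / tau]
    logits += [dot(q, normalise(n)) / tau for n in negatives]
    # Numerically stable cross-entropy with the positive as the target class.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum_exp - logits[0]


def rot90(img):
    """Rotate a 2-D list-of-lists image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def geometric_consistency_l1(generator, img, transform=rot90):
    """Mean absolute difference between G(T(x)) and T(G(x)).

    A generator that preserves spatial structure should give a value near
    zero, because translating and rotating then commute.
    """
    a = generator(transform(img))
    b = transform(generator(img))
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Toy "generators" standing in for a visible-to-infrared network.
    def pointwise(im):
        return [[1.0 - p for p in row] for row in im]

    def column_biased(im):
        return [[p + j for j, p in enumerate(row)] for row in im]

    image = [[0.0, 1.0], [2.0, 3.0]]
    print("contrastive:", info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]]))
    print("geometry (pointwise):", geometric_consistency_l1(pointwise, image))
    print("geometry (column-biased):", geometric_consistency_l1(column_biased, image))
```

In this toy setting the pointwise generator commutes with rotation, so its geometric term vanishes, while the column-biased generator is penalised because its output depends on pixel position. A real model would compute both terms on network feature maps and images rather than on small Python lists.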