Computer Science (计算机科学), 2025, Vol. 52, Issue (8): 188-194. doi: 10.11896/jsjkx.240600106
丁政泽, 聂仁灿, 李锦涛, 苏华平, 徐航
DING Zhengze, NIE Rencan, LI Jintao, SU Huaping, XU Hang
Abstract: Infrared and visible image fusion aims to preserve the thermal radiation information of infrared images and the texture details of visible images, so as to represent the imaging scene and comprehensively support downstream vision tasks. Fusion models based on convolutional neural networks focus on local convolution operations and are therefore limited in capturing global image features. Transformer-based models excel at global feature modeling but face the computational burden of quadratic complexity. The selective structured state space model (Mamba) has shown great potential for long-range dependency modeling with linear complexity, offering a promising path to address these problems. To efficiently model long-range dependencies in images, a residual selective structured state space module (RMB) is designed to extract global features. Meanwhile, to model the relationships between multimodal images, a cross-modal query fusion attention module (CQAM) is designed for adaptive feature fusion. In addition, a two-term loss function, consisting of a gradient loss and a brightness loss, is designed to train the proposed model in an unsupervised manner. Comparative experiments on fusion quality against numerous state-of-the-art methods, together with ablation studies, demonstrate the effectiveness of the proposed method.
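The abstract does not give the exact definitions of the two loss terms, but a common formulation of such an unsupervised fusion loss combines an L1 gradient term (the fused image should keep the sharpest edges of either source) with an L1 intensity term (the fused image should keep the brighter, often thermally salient, pixel of either source). The sketch below illustrates this idea with NumPy; the function names, Sobel-based gradient, max-aggregation targets, and weights `w_grad`/`w_int` are assumptions for illustration, not the paper's exact loss.

```python
import numpy as np

def sobel_grad(img):
    """Approximate per-pixel gradient magnitude with 3x3 Sobel kernels."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.empty((h, w))
    gy = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return np.abs(gx) + np.abs(gy)

def fusion_loss(fused, ir, vis, w_grad=1.0, w_int=1.0):
    """Two-term unsupervised fusion loss (illustrative formulation).

    Gradient term: fused gradients should match the element-wise max
    of the source gradients (keep the sharper edge at each pixel).
    Intensity term: fused pixels should match the element-wise max
    of the source intensities (keep the brighter pixel).
    """
    grad_target = np.maximum(sobel_grad(ir), sobel_grad(vis))
    l_grad = np.abs(sobel_grad(fused) - grad_target).mean()
    int_target = np.maximum(ir, vis)
    l_int = np.abs(fused - int_target).mean()
    return w_grad * l_grad + w_int * l_int
```

In a real training loop both terms would be computed on network outputs with a differentiable framework; the max-aggregation targets are one popular choice and are what makes the loss fully unsupervised, since no ground-truth fused image is required.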
CLC Number: