Computer Science ›› 2025, Vol. 52 ›› Issue (11): 123-130. doi: 10.11896/jsjkx.240800110

• Computer Graphics & Multimedia •


Infrared and Visible Image Fusion Cross-modality Contrastive Representation Network Based on Rolling MLP Feature Extraction

YAN Zhilin, NIE Rencan   

  1. School of Information Science and Engineering,Yunnan University,Kunming 650500,China
  • Received:2024-08-21 Revised:2024-12-30 Online:2025-11-15 Published:2025-11-06
  • About author:YAN Zhilin,born in 1998,postgraduate(yanzhilin@stu.ynu.edu.cn).His main research interests include deep learning and image fusion.
    NIE Rencan,born in 1982,Ph.D,professor,doctoral supervisor.His main research interests include neural networks,image processing and machine learning.
  • Corresponding author:NIE Rencan(rcnie@ynu.edu.cn)
  • Supported by:
    Key Project of Yunnan Basic Research Program(202301AS070025),National Key Research and Development Program of China(2020YFA0714301),Science and Technology Department of Yunnan Province Project(202105AF150011) and Yunnan Provincial Department of Education Science Foundation(2024Y031,2023J0017).

Abstract: In infrared and visible image fusion, existing datasets lack ground-truth fused images that could guide the important difference information of the two modalities required by the final fused image. Most existing fusion methods consider only the trade-off and interaction between the source images, ignoring the role the fused image itself plays during fusion. Since the important information in the fused image can constrain the difference information of the source images, this paper proposes a contrastive representation network (CRN) to better guide the extraction of the important source-image information required by the fused image. At the same time, improving the quality of fused-image reconstruction further strengthens this guidance of important source features. Reconstruction quality depends on the extracted features, and among existing feature extraction methods, CNNs perform poorly at capturing global features, while Transformers suffer from high computational complexity and weak learning of local features. On this basis, a CNN module combined with MLP, termed D2 Block, is introduced; by rolling the feature maps in different directions, it effectively extracts and fuses local features and long-range dependencies. Extensive qualitative and quantitative experiments on several public datasets show that the proposed method achieves better results than other state-of-the-art methods.
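
The following PyTorch sketch gives a rough illustration of the contrastive-guidance idea only; it is not the authors' CRN, and the function name contrastive_guidance and its argument layout are hypothetical. The fused image's projected features act as the anchor, the important source features as the positive, and redundant source features as negatives, so minimizing the loss pulls source-feature extraction toward what the fused image needs.

    import torch
    import torch.nn.functional as F

    def contrastive_guidance(f_fused, f_pos, f_neg, temperature=0.1):
        """InfoNCE-style guidance loss (illustrative sketch, not the paper's CRN).

        f_fused : (B, D) projected features of the fused image (anchor).
        f_pos   : (B, D) important source features (positives).
        f_neg   : (B, D) redundant source features (negatives).
        """
        anchor = F.normalize(f_fused, dim=1)
        pos = F.normalize(f_pos, dim=1)
        neg = F.normalize(f_neg, dim=1)

        # Cosine similarities scaled by temperature.
        sim_pos = (anchor * pos).sum(dim=1, keepdim=True) / temperature  # (B, 1)
        sim_neg = anchor @ neg.t() / temperature                         # (B, B)

        # The positive sits at column 0 of the logits for every sample.
        logits = torch.cat([sim_pos, sim_neg], dim=1)
        labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
        return F.cross_entropy(logits, labels)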
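Likewise, the rolling operation the abstract describes can be sketched as below. This is an assumption-laden illustration of the general rolling-MLP mechanism, not the paper's exact D2 Block (the class name RollingMLP and the shift schedule are hypothetical): torch.roll shifts the feature map along the height and width axes, so a cheap per-pixel MLP over the concatenated shifted copies mixes information from distant positions, capturing long-range dependencies alongside local features without attention.

    import torch
    import torch.nn as nn

    class RollingMLP(nn.Module):
        """Illustrative rolling-MLP mixer (hypothetical class, not the D2 Block)."""

        def __init__(self, channels: int, shifts=(1, 2, 4)):
            super().__init__()
            self.shifts = shifts
            # Four rolled copies per shift (up/down/left/right) plus the original map.
            in_ch = channels * (4 * len(shifts) + 1)
            self.mlp = nn.Sequential(
                nn.Conv2d(in_ch, channels, kernel_size=1),  # 1x1 conv == per-pixel MLP
                nn.GELU(),
                nn.Conv2d(channels, channels, kernel_size=1),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            feats = [x]
            for s in self.shifts:
                feats.append(torch.roll(x, shifts=s, dims=2))   # roll down along H
                feats.append(torch.roll(x, shifts=-s, dims=2))  # roll up along H
                feats.append(torch.roll(x, shifts=s, dims=3))   # roll right along W
                feats.append(torch.roll(x, shifts=-s, dims=3))  # roll left along W
            return self.mlp(torch.cat(feats, dim=1)) + x        # residual connection

    # e.g. RollingMLP(64)(torch.randn(1, 64, 32, 32)) -> shape (1, 64, 32, 32)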

Key words: Image fusion, Contrastive representation learning, Feature extraction, Deep learning, Autoencoder

CLC Number: TP391