Computer Science ›› 2025, Vol. 52 ›› Issue (11): 123-130. doi: 10.11896/jsjkx.240800110

• Computer Graphics & Multimedia •

Infrared and Visible Image Fusion Cross-modality Contrastive Representation Network Based on Rolling MLP Feature Extraction

YAN Zhilin, NIE Rencan   

  1. School of Information Science and Engineering,Yunnan University,Kunming 650500,China
  • Received:2024-08-21 Revised:2024-12-30 Online:2025-11-15 Published:2025-11-06
  • About author: YAN Zhilin, born in 1998, postgraduate. His main research interests include deep learning and image fusion.
    NIE Rencan, born in 1982, Ph.D, professor, doctoral supervisor. His main research interests include neural networks, image processing and machine learning.
  • Supported by:
    Key Project of Yunnan Basic Research Program(202301AS070025), National Key Research and Development Program of China(2020YFA0714301),Science and Technology Department of Yunnan Province Project(202105AF150011) and Yunnan Provincial Department of Education Science Foundation(2024Y031,2023J0017).

Abstract: Datasets for infrared and visible image fusion contain no ground-truth fused images that could indicate which difference information from the two modalities the final fused image should retain. Most existing fusion methods consider only the trade-off and interaction between the source images and ignore the role of the fused image in the fusion process, even though the important information in the fused image can constrain the difference information of the source images. This paper therefore proposes a cross-modality contrastive representation network (CRN) that better guides the extraction of the important source-image information required by the fused image. Improving the reconstruction quality of the fused image further strengthens this guidance, and reconstruction quality depends on the extracted features. Among existing feature extraction methods, CNNs capture global features poorly, while Transformers have high computational complexity and learn local features poorly. For this reason, a D2 Block, a CNN module combined with an MLP, is introduced; it effectively extracts and fuses local features and long-range dependencies by rolling feature maps in different directions. Extensive qualitative and quantitative experiments on several public datasets show that the proposed method achieves better results than other advanced methods.
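The abstract does not spell out the D2 Block, so the following is only a minimal NumPy sketch of the general idea of rolling feature maps in different directions. It assumes a feature map of shape (C, H, W), shifts channel c by c * step positions along one spatial axis with wrap-around, and then applies a shared per-pixel linear map across channels. The names `roll_channels` and `rolling_mlp`, the step size, and the summation of the two directions are hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def roll_channels(x, axis, step=1):
    """Shift channel c of a (C, H, W) map by c * step along `axis`.

    axis=1 rolls along height, axis=2 rolls along width (wrap-around).
    """
    out = np.empty_like(x)
    for c in range(x.shape[0]):
        # x[c] has shape (H, W), so the spatial axis index drops by one.
        out[c] = np.roll(x[c], shift=c * step, axis=axis - 1)
    return out


def rolling_mlp(x, weight, axis, step=1):
    """Roll channels along one direction, then mix channels at every pixel.

    weight has shape (C_out, C_in). After rolling, the channel vector at a
    pixel holds values taken from different positions along `axis`, so a
    plain per-pixel linear layer mixes information across that direction.
    """
    rolled = roll_channels(x, axis, step)
    return np.einsum("oc,chw->ohw", weight, rolled)


# Hypothetical usage: combine a height-direction and a width-direction pass.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w_h = rng.standard_normal((4, 4))
w_w = rng.standard_normal((4, 4))
y = rolling_mlp(x, w_h, axis=1) + rolling_mlp(x, w_w, axis=2)
```

In this sketch the per-pixel linear map costs the same as a 1x1 convolution, while the per-channel offsets let each output pixel draw on inputs up to (C - 1) * step positions away along the rolled axis.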

Key words: Image fusion, Contrastive representation learning, Feature extraction, Deep learning, Autoencoder

CLC Number: TP391