Computer Science ›› 2025, Vol. 52 ›› Issue (5): 212-219. doi: 10.11896/jsjkx.240300137

• Computer Graphics & Multimedia •

Improved U-Net Multi-scale Feature Fusion Semantic Segmentation Network for Remote Sensing Images

JIANG Wenwen, XIA Ying   

  1. College of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
    Key Laboratory of Tourism Multisource Data Perception and Decision Technology, Ministry of Culture and Tourism, Chongqing 400065, China
  • Received: 2024-03-20 Revised: 2024-07-22 Online: 2025-05-15 Published: 2025-05-12
  • About author: JIANG Wenwen, born in 2000, postgraduate. Her main research interests include intelligent analysis of remote sensing images.
    XIA Ying, born in 1972, Ph.D., professor, Ph.D. supervisor, is a senior member of CCF (No. 10248S). Her main research interests include spatiotemporal big data and cross-media retrieval.
  • Supported by:
    Chongqing Municipal Education Commission Cooperation Projects (HZ2021008) and Key Laboratory Funding Project of Cultural and Tourism Department (E020H2023005).

Abstract: Accurate semantic segmentation of remote sensing images faces three main challenges: high spatial resolution, large scale differences between object categories, and class imbalance. To improve segmentation accuracy, this paper proposes the Multi-scale Feature Fusion Network (MFFNet), an improved U-Net multi-scale feature fusion semantic segmentation network for remote sensing images. Built on U-Net, the network adds a dynamic feature fusion module and a gated attention convolution mix module. The dynamic feature fusion module replaces the skip connection and improves how features from the downsampling and upsampling layers are fused, avoiding the information loss caused by feature fusion while improving the fusion of shallow and deep features. The gated attention convolution mix module integrates self-attention, convolution, and gating mechanisms to better capture both local and global information. Comparative and ablation experiments are carried out on the Potsdam and Vaihingen datasets. The results show that MFFNet reaches an mIoU of 76.95% and 72.93% on the two datasets, respectively, effectively improving the semantic segmentation accuracy of remote sensing images.
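
The mIoU values reported in the abstract are mean intersection-over-union scores. The following is a minimal sketch of how such a score is commonly computed from per-pixel labels. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou` and the toy labels are illustrative assumptions, and the abstract does not state which classes are included in the average.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels by (true class, predicted class). Hypothetical helper."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        cm[t][p] += 1
    return cm


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the labels nor the predictions are
    skipped (an assumption; benchmarks differ on this convention).
    """
    cm = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = cm[c][c]
        fp = sum(cm[r][c] for r in range(num_classes)) - tp  # predicted c, wrong
        fn = sum(cm[c]) - tp                                  # true c, missed
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six "pixels" (flattened label maps).
labels = [0, 0, 1, 1, 2, 2]
preds = [0, 1, 1, 1, 2, 0]
print(f"mIoU = {mean_iou(labels, preds, 3):.4f}")
```

In the toy example the per-class IoU values are 1/3, 2/3, and 1/2, so the mean is 0.5. Real evaluations apply the same counting over every pixel of the test tiles.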

Key words: Semantic segmentation, Remote sensing images, Attention mechanism, Feature fusion, Gating mechanism

CLC Number: 

  • TP391