Computer Science, 2021, Vol. 48, Issue (8): 162-168. doi: 10.11896/jsjkx.200700182

• Computer Graphics & Multimedia •

Remote Sensing Image Semantic Segmentation Method Based on U-Net Feature Fusion Optimization Strategy

WANG Shi-yun, YANG Fan

  1. School of Electronic and Information Engineering, Hebei University of Technology, Tianjin 300401, China
  • Received: 2020-07-28 Revised: 2020-09-19 Published: 2021-08-10
  • Corresponding author: YANG Fan (yangfan@hebut.edu.cn)
  • About author: WANG Shi-yun, born in 1994, postgraduate. Her main research interests include intelligent information processing. (18222953150@163.com) YANG Fan, born in 1966, Ph.D., professor, Ph.D. supervisor. His main research interests include computer vision inspection technology, image processing and pattern recognition.
  • Supported by:
    National Key R&D Program Intelligent Robot Special Project (2019YFB1312102) and Natural Science Foundation of Hebei Province (F2019202364).

Abstract: High-resolution remote sensing images have high spatial resolution, rich ground-object information, high complexity, an uneven distribution of target categories, and ground objects of widely varying sizes, all of which make it difficult to improve segmentation accuracy. To improve the semantic segmentation accuracy of remote sensing images and to address the limitation of the U-Net model in combining deep semantic information with shallow position information, this paper proposes a semantic segmentation method for remote sensing images based on a U-Net feature fusion optimization strategy. The method adopts an encoder-decoder structure based on U-Net. In the feature extraction part, the encoder structure of U-Net is reused to extract feature information at multiple levels. In the feature fusion part, the skip connection structure of U-Net is retained and the proposed feature fusion optimization strategy is applied, realizing a fusion, optimization, and re-fusion of high-level semantic features and low-level position features. In addition, the feature fusion optimization strategy uses dilated convolution to obtain more global features, and uses a Sub-Pixel convolutional layer in place of the traditional transposed convolution to achieve adaptive upsampling. The method is validated on the ISPRS Potsdam and Vaihingen datasets. Its three evaluation indexes, overall segmentation accuracy, Kappa coefficient, and mean intersection over union (mIoU), are 86.2%, 0.82, and 0.77 on the Potsdam dataset and 84.5%, 0.79, and 0.69 on the Vaihingen dataset. Compared with the traditional U-Net model, the three indexes are improved by 5.8%, 8%, and 8% on the Potsdam dataset and by 3.5%, 4%, and 11% on the Vaihingen dataset, respectively. The experimental results show that the proposed method achieves good semantic segmentation results on both datasets and improves the semantic segmentation accuracy of remote sensing images.
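
The Sub-Pixel convolutional layer mentioned in the abstract ends with a rearrangement step that turns extra channels into extra spatial resolution. The following is only a minimal sketch of that rearrangement, not the authors' implementation: it omits the learned convolution that would produce the input channels, works on plain nested lists, and assumes one common channel-ordering convention. The function name `pixel_shuffle` and the toy input are illustrative assumptions.

```python
def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) nested list into a (C, H*r, W*r) nested list.

    Assumed convention: out[c][h*r + i][w*r + j] = x[c*r*r + i*r + j][h][w].
    """
    in_channels = len(x)
    if in_channels % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    out_channels = in_channels // (r * r)
    height, width = len(x[0]), len(x[0][0])
    # Allocate the upsampled output, filled with zeros.
    out = [[[0] * (width * r) for _ in range(height * r)]
           for _ in range(out_channels)]
    for c in range(out_channels):
        for i in range(r):          # row offset inside each r x r output block
            for j in range(r):      # column offset inside each r x r output block
                src = x[c * r * r + i * r + j]
                for h in range(height):
                    for w in range(width):
                        out[c][h * r + i][w * r + j] = src[h][w]
    return out


# Toy input: four 1x2 feature maps become one 2x4 map for r = 2.
features = [[[1, 2]], [[3, 4]], [[5, 6]], [[7, 8]]]
upsampled = pixel_shuffle(features, 2)
print(upsampled)  # [[[1, 3, 2, 4], [5, 7, 6, 8]]]
```

Because the values written into each output block come from channels produced by a trainable convolution, the upsampling weights are learned from data, which is one way to read the "adaptive upsampling" described above.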

Key words: Deep learning, Dilated convolution, Feature fusion, Remote sensing image, Semantic segmentation
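
The three evaluation indexes reported in the abstract (overall accuracy, Kappa coefficient, and mIoU) can all be derived from a single confusion matrix. The sketch below uses the standard textbook definitions; the paper's own evaluation code is not shown in this page, so the function name `segmentation_metrics`, the row/column convention, and the toy two-class matrix are assumptions made for illustration.

```python
def segmentation_metrics(conf):
    """Return (overall accuracy, Cohen's kappa, mean IoU).

    Assumed convention: conf[i][j] is the number of pixels whose true class
    is i and whose predicted class is j. Every class is assumed to occur at
    least once, so no denominator is zero.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    diag = [conf[k][k] for k in range(n)]
    row_sums = [sum(conf[k]) for k in range(n)]                      # true counts
    col_sums = [sum(conf[i][k] for i in range(n)) for k in range(n)]  # predicted counts

    # Overall accuracy: fraction of correctly labelled pixels.
    oa = sum(diag) / total
    # Expected agreement by chance, used by the Kappa coefficient.
    pe = sum(row_sums[k] * col_sums[k] for k in range(n)) / (total * total)
    kappa = (oa - pe) / (1 - pe)
    # Per-class IoU = TP / (TP + FP + FN), then averaged over classes.
    ious = [diag[k] / (row_sums[k] + col_sums[k] - diag[k]) for k in range(n)]
    miou = sum(ious) / n
    return oa, kappa, miou


# Toy two-class confusion matrix (made-up counts, not data from the paper).
oa, kappa, miou = segmentation_metrics([[40, 10], [5, 45]])
print(round(oa, 4), round(kappa, 4), round(miou, 4))
```

Note that overall accuracy is usually quoted as a percentage while Kappa and mIoU lie on a 0 to 1 scale, which matches how the abstract reports 86.2%, 0.82, and 0.77.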

CLC Number: TP391