Computer Science ›› 2024, Vol. 51 ›› Issue (5): 134-142. doi: 10.11896/jsjkx.230200134

• Computer Graphics & Multimedia •

Study on Building Extraction from Remote Sensing Image Based on Multi-scale Attention

HE Xiaohui¹, ZHOU Tao², LI Panle², CHANG Jing², LI Jiamian²

  1. School of Earth Science and Technology, Zhengzhou University, Zhengzhou 450052, China
  2. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  • Received: 2023-02-19 Revised: 2023-08-17 Online: 2024-05-15 Published: 2024-05-08
  • Corresponding author: HE Xiaohui (13137052075@163.com)
  • About author: HE Xiaohui, born in 1978, professor, Ph.D. supervisor. Her main research interests include artificial intelligence, computer vision, remote sensing image processing, and data mining.
  • Supported by:
    Henan Province Major Science and Technology Special Project: Research on Key Technologies for Constructing and Servicing the Yellow River Simulator for Supercomputing (201400210900).

Abstract: Deep-learning-based building extraction from remote sensing images offers wide coverage and high computational efficiency, and is of practical importance in urban construction, disaster prevention, and related applications. Most mainstream methods use multi-scale feature fusion so that the neural network can learn richer semantic information. However, because multi-scale features are complex and other ground-object classes interfere, such methods often suffer from missed targets and dense noise. To address this, the paper designs and implements MGA-ResNet50 (MGAR), a feature interpretation model that incorporates an attention mechanism. The core of the method is to apply multi-head attention to weight high-level semantic information hierarchically, extracting the feature combination with the best representational quality. A gating structure then fuses each feature map with the low-level semantic information from the corresponding encoder stage, compensating for the loss of local building detail. Experiments on public datasets such as Massachusetts Building and WHU Building show that the proposed algorithm achieves higher F1 and IoU scores than relatively advanced multi-scale feature fusion methods such as RAPNet, GAMNet, and GSM.
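The abstract does not give the form of the gating structure, so the following is only a minimal sketch of one common way to gate two feature maps, not the MGAR implementation. It assumes a per-pixel sigmoid gate that blends a high-level (decoder) map with a low-level (encoder) map of the same size; the function name `gated_fuse` and the scalar parameters `w_high`, `w_low`, and `bias` are hypothetical stand-ins for learned weights.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fuse(high, low, w_high=1.0, w_low=1.0, bias=0.0):
    """Blend two same-sized 2-D feature maps with a per-pixel gate.

    Assumed (hypothetical) gate form, not taken from the paper:
        g   = sigmoid(w_high * h + w_low * l + bias)
        out = g * h + (1 - g) * l
    A gate near 1 keeps the high-level value; near 0 keeps the low-level one.
    """
    if len(high) != len(low) or any(len(a) != len(b) for a, b in zip(high, low)):
        raise ValueError("feature maps must have the same shape")

    fused = []
    for row_h, row_l in zip(high, low):
        fused_row = []
        for h, l in zip(row_h, row_l):
            g = sigmoid(w_high * h + w_low * l + bias)
            fused_row.append(g * h + (1.0 - g) * l)
        fused.append(fused_row)
    return fused


# Toy 2x2 maps standing in for one decoder channel and one encoder channel.
high_level = [[1.0, 0.0], [2.0, -1.0]]
low_level = [[0.0, 1.0], [2.0, 3.0]]
fused_map = gated_fuse(high_level, low_level)
```

Because each output is a convex combination of the two inputs, every fused value lies between the corresponding high-level and low-level values. In a trained network the gate weights would be learned, typically by a convolution, rather than fixed scalars.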

Key words: Deep learning, Building extraction, Multi-scale feature, Multi-head attention, Gating mechanism
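The abstract reports results in terms of F1 and IoU. As a rough illustration, the sketch below computes both for binary building masks using the standard pixel-level definitions. It assumes the paper uses these conventional definitions; the function name `f1_and_iou` and the zero-denominator convention are choices made for this example only.

```python
def f1_and_iou(pred, truth):
    """Compute pixel-level F1 and IoU for two binary masks.

    pred, truth: 2-D lists of 0/1 values with the same shape,
    where 1 marks a building pixel.
    Returns (f1, iou). Ratios with a zero denominator are reported
    as 0.0 (a convention assumed here, not taken from the paper).
    """
    tp = fp = fn = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                tp += 1      # building predicted and present
            elif p and not t:
                fp += 1      # building predicted but absent
            elif t and not p:
                fn += 1      # building present but missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    iou = tp / (tp + fp + fn) if (tp + fp + fn) else 0.0
    return f1, iou


# Toy masks: 2 true positives, 1 false positive, 1 false negative.
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
f1, iou = f1_and_iou(pred_mask, true_mask)
print(f1, iou)  # F1 = 2/3, IoU = 0.5
```

For a single binary class the two metrics are linked by IoU = F1 / (2 - F1), so they always rank a pair of masks in the same order; papers commonly report both because IoU penalizes errors more heavily.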

CLC Number:

  • TP391.4