Computer Science ›› 2023, Vol. 50 ›› Issue (9): 202-209. doi: 10.11896/jsjkx.220800086

• Database & Big Data & Data Science •

Study on Building Extraction Algorithm of Remote Sensing Image Based on Multi-scale Feature Fusion

CHEN Guojun, YUE Xueyan, ZHU Yanning, FU Yunpeng

  1. College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, Shandong 266580, China
  • Received: 2022-08-09 Revised: 2022-12-10 Online: 2023-09-15 Published: 2023-09-01
  • Corresponding author: YUE Xueyan (yue@s.upc.edu.cn)
  • About author (88326309@qq.com): CHEN Guojun, born in 1968, associate professor, is a member of China Computer Federation. His main research interests include graphics and image processing, virtual reality, and BIM technology.
    YUE Xueyan, born in 1998, postgraduate. Her main research interest is computer vision.
  • Supported by:
    Transportation Construction Science and Technology Project in Shanxi Province (2019-2-8).

Abstract: Buildings in high-resolution remote sensing images vary widely in size and appear against complex backgrounds, so building extraction often suffers from loss of detail and blurred edges, which reduces the segmentation accuracy of the model. To address these problems, this paper proposes B2Net, a two-branch network with a spatial information branch and a semantic information branch. First, a cross feature fusion module is built in the semantic branch to fully capture context information and aggregate more multi-scale semantic features. Second, the spatial branch combines atrous convolution with depthwise separable convolution to extract multi-scale spatial features of the image, and the dilation rates are optimized to enlarge the receptive field of the network. Finally, a content-aware attention module adaptively selects high-frequency and low-frequency content in the image to refine the edges of the building segmentation. B2Net is trained and tested on two building datasets. On the WHU dataset, compared with the baseline model, B2Net achieves the best results in precision, recall, F1 score and IoU, at 98.60%, 99.40%, 99.30% and 88.50%, respectively. On the Massachusetts building dataset, the four metrics are 0.9%, 1.9%, 1.7% and 2.2% higher than those of BiSeNet, respectively. The experiments show that B2Net better captures spatial detail and high-level semantic information, improves building segmentation accuracy against complex backgrounds, and meets the need for rapid building extraction.

Key words: Building extraction, Feature fusion, Atrous convolution, Depthwise separable convolution, Content-aware attention
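
The abstract reports precision, recall, F1 score and IoU but does not define them on this page. The following is a minimal sketch of how these four metrics are commonly computed pixel-wise for a binary building mask, assuming the standard definitions based on true positives, false positives and false negatives. The function names `confusion_counts` and `building_metrics` and the toy masks are hypothetical illustrations, not the authors' evaluation code, and the paper's exact evaluation protocol may differ.

```python
from typing import Dict, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # rows of 0/1 pixels, 1 = building (assumed encoding)


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives pixel by pixel."""
    if len(pred) != len(truth) or any(len(p) != len(t) for p, t in zip(pred, truth)):
        raise ValueError("prediction and ground truth must have the same shape")
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1      # building predicted, building present
            elif p and not t:
                fp += 1      # building predicted, background present
            elif t and not p:
                fn += 1      # building missed
    return tp, fp, fn


def building_metrics(pred: Mask, truth: Mask) -> Dict[str, float]:
    """Return precision, recall, F1 and IoU for the building class."""
    tp, fp, fn = confusion_counts(pred, truth)
    # Each ratio falls back to 0.0 when its denominator is zero.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    iou = tp / (tp + fp + fn) if tp + fp + fn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "iou": iou}


# Toy 3x4 masks (hypothetical data): 3 correct building pixels,
# 1 spurious building pixel, 1 missed building pixel.
truth_mask = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]
pred_mask = [
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]

if __name__ == "__main__":
    for name, value in building_metrics(pred_mask, truth_mask).items():
        print(f"{name}: {value:.4f}")
```

Under these definitions IoU can never exceed F1 for the same prediction, because IoU = F1 / (2 - F1), so a model's IoU is normally reported as the lower of the two numbers.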

CLC Number:

  • TP751