Computer Science ›› 2023, Vol. 50 ›› Issue (6A): 220700137-7. doi: 10.11896/jsjkx.220700137

• Image Processing & Multimedia Technology •

Multi-path Semantic Segmentation Based on Edge Optimization and Global Modeling

CHEN Qiaosong1, ZHANG Yu1, PU Liu1, TAN Chongchong2, DENG Xin1, WANG Jin1, SUN Kaiwei1, OUYANG Weihua1

  1. 1 Key Laboratory of Data Engineering and Visual Computing, School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China;
    2 School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
  • Online: 2023-06-10 Published: 2023-06-12
  • Corresponding author: ZHANG Yu (1262912010@qq.com)
  • About author: CHEN Qiaosong (chenqs@cqupt.edu.cn), born in 1978, Ph.D., associate professor, is a member of the China Computer Federation. His main research interests include blockchain, data mining and machine vision. ZHANG Yu, born in 1998, postgraduate. Her main research interest is machine vision.
  • Supported by:
    National Key Research and Development Program of China (2022YFE0101000).

Abstract: In current convolutional networks for semantic segmentation, spatial and detail information is gradually lost as the convolutional layers deepen, which leads to inaccurate segmentation of object boundaries and small objects. Meanwhile, the local nature of convolutional features limits the network's capacity for effective global modeling, which causes confusion when segmenting object interiors. To address these problems, this paper designs a multi-path semantic segmentation algorithm based on edge optimization and global modeling. The algorithm proposes a multi-path adjacent staggered fusion network: detail information is fused between adjacent paths among four paths of different resolutions, and semantic information is fused between the tail of each high-resolution path and the head of the lower-resolution path, which reduces the loss of spatial and detail information. An adaptive edge feature module is proposed to extract edge features, which are integrated into the middle layers and the deep supervision layers of the network to strengthen the representation of edge features and improve the segmentation of small objects. A Transformer global feature module is also proposed; it uses different convolutions for downsampling to shorten the self-attention sequence, then fuses channel information with self-attention information to obtain effective global information about high-level semantics. Experimental results show that the method reaches an mIoU of 76.2% on the CamVid test set and 79.1% on the Cityscapes validation set.
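The mIoU figures quoted in the abstract are mean intersection-over-union scores. The following is a minimal sketch of how such a score can be computed from flat lists of predicted and ground-truth class labels; it is not the authors' evaluation code. The function name `mean_iou`, the `ignore_index=255` default, and the toy labels are assumptions made for illustration, and benchmark scripts typically accumulate the counts over a whole dataset rather than over a single image.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels of equal
    length. Pixels whose target equals ignore_index (an assumed
    convention) are skipped.
    """
    inter = [0] * num_classes  # true positives per class
    union = [0] * num_classes  # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            inter[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes; the last pixel is ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoU values are 1/2, 2/3, and 1, so `score` is their average, 13/18.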

Key words: Semantic segmentation, Multi-path, Edge optimization, Deep supervision, Global modeling
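The abstract states that the Transformer global feature module downsamples with convolutions to shorten the self-attention sequence. The sketch below only illustrates the arithmetic behind that idea: it counts tokens after a convolution and the size of a full self-attention score matrix, which grows with the square of the token count. The helper names, the 32 x 64 feature map, and the kernel and stride values are hypothetical and are not taken from the paper.

```python
def conv_out_size(size, kernel, stride, padding=0):
    """Output length of a convolution along one spatial axis (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def sequence_length(height, width, kernel=1, stride=1, padding=0):
    """Number of tokens after flattening a conv-downsampled feature map."""
    out_h = conv_out_size(height, kernel, stride, padding)
    out_w = conv_out_size(width, kernel, stride, padding)
    return out_h * out_w


def attention_scores(num_tokens):
    """Entries in a full self-attention score matrix (tokens x tokens)."""
    return num_tokens * num_tokens


# Hypothetical 32 x 64 feature map, before and after downsampling.
full = sequence_length(32, 64)                        # no downsampling
half = sequence_length(32, 64, kernel=3, stride=2, padding=1)
quarter = sequence_length(32, 64, kernel=4, stride=4)
saving = attention_scores(full) // attention_scores(half)
```

Under these assumed settings the token count drops from 2048 to 512 with the stride-2 convolution and to 128 with the stride-4 convolution, so the stride-2 case already shrinks the score matrix by a factor of 16.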

CLC Number: TP391.4