Computer Science, 2023, Vol. 50, Issue 11A: 230200010-6. doi: 10.11896/jsjkx.230200010
WANG Yan, XIA Chuangshuai, WANG Na, NAN Peiqi
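Abstract: Existing semantic segmentation algorithms are hard to deploy on mobile devices because their models are complex and computationally expensive. To address this, a real-time image semantic segmentation algorithm based on hybrid attention is proposed. The algorithm uses an asymmetric encoder-decoder structure. The encoder combines depthwise separable convolution with dilated convolution in an efficient residual unit that extracts image features at different network depths, attending more to spatial position information in shallow layers and strengthening semantic information extraction in deep layers. The decoder uses a hybrid attention feature fusion module: spatial attention reinforces the spatial position information of shallow layers, and channel attention enhances the expression of key information in deep feature maps. The module effectively fuses spatial and contextual information from feature maps at different levels, strengthens the expression of semantic information, and reduces the loss of image information during fusion. Finally, a classifier produces the segmentation prediction map. Extensive experiments show that on the Cityscapes dataset the algorithm reaches a PA of 93.2% and an mIoU of 73.2%, running at 38 FPS on a Tesla V100 GPU with 1.62×10⁶ parameters; on the Pascal VOC 2012 dataset it reaches a PA of 92.4% and an mIoU of 74.8%. These results indicate that the algorithm can segment urban scene images effectively and in real time.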
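The abstract attributes the small model size to combining depthwise separable convolution with dilated convolution. The sketch below illustrates, under stated assumptions, why that pairing is cheap: a depthwise separable layer needs far fewer weights than a standard convolution, and dilation widens the area a kernel covers without adding weights. The function names and the 3x3, 64-channel layer are hypothetical choices for illustration and are not taken from the paper; bias terms are ignored.

```python
def conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a standard kernel x kernel convolution (bias ignored)."""
    return kernel * kernel * c_in * c_out


def separable_conv_params(kernel: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise convolution followed by a 1x1 pointwise convolution."""
    depthwise = kernel * kernel * c_in  # one kernel x kernel filter per input channel
    pointwise = c_in * c_out            # 1x1 convolution that mixes channels
    return depthwise + pointwise


def effective_kernel(kernel: int, dilation: int) -> int:
    """Side length of the area spanned by a dilated kernel."""
    return kernel + (kernel - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer (not from the paper): 3x3 kernel, 64 -> 64 channels.
    standard = conv_params(3, 64, 64)
    separable = separable_conv_params(3, 64, 64)
    print("standard weights: ", standard)
    print("separable weights:", separable)
    print("reduction factor: ", round(standard / separable, 2))

    # Dilation enlarges the covered area while the weight count stays fixed.
    for dilation in (1, 2, 4):
        print("dilation", dilation, "-> spans", effective_kernel(3, dilation), "pixels per side")
```

How the actual residual unit arranges these layers, and which dilation rates it uses at each depth, is not stated in the abstract, so the numbers here only show the general trade-off.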