Computer Science, 2023, Vol. 50, Issue (11A): 230200010-6. doi: 10.11896/jsjkx.230200010

• Image Processing & Multimedia Technology •

Real-time Image Semantic Segmentation Algorithm Based on Hybrid Attention

WANG Yan, XIA Chuangshuai, WANG Na, NAN Peiqi   

  1. School of Computer and Communication, Lanzhou University of Technology, Lanzhou 730050, China
  • Published: 2023-11-09
  • About author: WANG Yan, born in 1971, master, professor, is a member of the China Computer Federation. Her main research interests include pattern recognition and artificial intelligence.
    XIA Chuangshuai, born in 1998, master. His main research interests include pattern recognition and artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China (61863025).

Abstract: Existing semantic segmentation algorithms are difficult to deploy on mobile devices because their models are complex and computationally expensive. This paper proposes a new semantic segmentation algorithm based on hybrid attention, built as an asymmetric encoder-decoder structure. The encoder combines depth-wise separable convolution and dilated convolution in an efficient residual module that extracts image features at different levels of the network: it emphasizes spatial position information in the shallow layers and strengthens semantic information extraction in the deep layers. The decoder uses a hybrid attention feature fusion module, which applies spatial attention to reinforce spatial location information in the shallow feature maps and channel attention to enhance the expression of key information in the deep feature maps. This module effectively integrates the spatial and context information in feature maps from different levels, strengthens the expression of semantic information, and reduces the loss of image information during fusion. Finally, a classifier predicts the segmentation result. Extensive experiments show that the proposed algorithm achieves 93.2% PA and 73.2% mIoU on Cityscapes, and runs at 38 FPS with 1.62×10^6 parameters on a Tesla V100 GPU. On the Pascal VOC 2012 data set, PA and mIoU reach 92.4% and 74.8%, respectively. The experimental results show that the algorithm can segment city scene images both effectively and quickly.
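The reason depth-wise separable and dilated convolutions suit a lightweight encoder can be illustrated with simple arithmetic. The sketch below is not the authors' implementation; the function names and the layer sizes (128 input and output channels, a 3×3 kernel) are assumptions chosen only for illustration. It compares the weight count of a standard convolution with that of a depth-wise separable one, and shows how dilation widens the area covered by a kernel without adding weights.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a k x k depthwise conv followed by a 1 x 1 pointwise conv."""
    return k * k * c_in + c_in * c_out


def effective_kernel(k, dilation):
    """Side length of the area covered by a k x k kernel at a dilation rate."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes, not taken from the paper.
    c_in, c_out, k = 128, 128, 3
    std = conv_params(c_in, c_out, k)
    sep = depthwise_separable_params(c_in, c_out, k)
    print("standard:", std, "separable:", sep, "ratio:", round(std / sep, 2))
    # Dilation enlarges the covered area while the weight count stays fixed.
    for d in (1, 2, 4, 8):
        print("dilation", d, "covers", effective_kernel(k, d), "pixels per side")
```

For these assumed sizes the separable layer needs roughly one eighth of the weights of the standard layer, which is the kind of saving that makes a parameter budget on the order of 10^6 plausible for a real-time network.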

Key words: Deep learning, Semantic segmentation, Real-time, Feature fusion, Attention mechanism

CLC Number: 

  • TP391