Computer Science ›› 2024, Vol. 51 ›› Issue (11): 174-181. doi: 10.11896/jsjkx.231000009

• Computer Graphics & Multimedia •

High-precision Real-time Semantic Segmentation Algorithm Architecture for Autonomous Driving

GENG Huantong1,2,3, LI Jiaxing1, JIANG Jun1, LIU Zhenyu1, FAN Zichen4   

    1 School of Computer Science, Nanjing University of Information Science & Technology, Nanjing 210044, China
    2 China Meteorological Administration Radar Meteorology Key Laboratory, Nanjing 210044, China
    3 School of Information Technology, Jiangsu Open University, Nanjing 210036, China
    4 School of Software, Nanjing University of Information Science & Technology, Nanjing 210044, China
  • Received: 2023-10-07 Revised: 2024-04-17 Online: 2024-11-15 Published: 2024-11-06
  • About author: GENG Huantong, born in 1973, professor, Ph.D. supervisor, is a senior member of CCF (No. 12356S). His main research interests include multi-objective optimization and deep learning.
  • Supported by:
    National Natural Science Foundation of China (42375145) and Open Grants of China Meteorological Administration Radar Meteorology Key Laboratory (2023LRM-A02).

Abstract: The proportional-integral-derivative (PID) semantic segmentation architecture mitigates the overshoot problem of dual-branch architectures, in which fine-grained features are easily overwhelmed by surrounding contextual information. However, the high-resolution boundary branch in this architecture significantly slows inference. To address this issue, an efficient PID architecture based on spatial attention mechanisms and a lightweight auxiliary semantic branch is proposed. A lightweight attention fusion module extracts precise contextual information and guides the fusion of the different features. In addition, a fast aggregation pyramid pooling module rapidly aggregates semantic information across multiple scales. Finally, a deep supervision training strategy combined with the Canny edge detection operator is designed to improve training. Compared with the baseline, the proposed model achieves a 6% increase in accuracy at the cost of slightly higher latency. It strikes a good balance between accuracy and speed on the Cityscapes, CamVid, and KITTI datasets, outperforming existing models in the same speed range. Notably, the model achieves an accuracy of 78.5% at 120.9 frames/s on the Cityscapes test set.
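
The abstract does not say how the Canny operator enters the deep supervision strategy. One plausible reading is that it supplies binary boundary targets for the boundary branch. The sketch below only illustrates what such a target looks like: it derives a boundary mask from a label map with a simple neighbour-difference rule, which is a stand-in for Canny and not the authors' implementation. The function name `boundary_mask` and the toy label map are hypothetical.

```python
from typing import List


def boundary_mask(labels: List[List[int]]) -> List[List[int]]:
    """Return a binary mask that is 1 where a pixel touches another class.

    Assumption: a pixel counts as boundary if any 4-connected neighbour
    carries a different class id. This replaces the Canny operator named
    in the abstract with a much simpler rule, for illustration only.
    """
    height = len(labels)
    width = len(labels[0]) if height else 0
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < height and 0 <= nx < width
                if inside and labels[ny][nx] != labels[y][x]:
                    mask[y][x] = 1
                    break
    return mask


# Toy 4x4 label map (hypothetical): class 0 on the left, class 1 on the right.
toy_labels = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
toy_mask = boundary_mask(toy_labels)
for row in toy_mask:
    print(row)
```

In a training setup of this kind, a mask like `toy_mask` would serve as the target of an auxiliary boundary loss; the loss form and its weighting are not given on this page.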

Key words: Real-time semantic segmentation, Autonomous driving, Overshoot, Spatial attention mechanism, Edge detection

CLC Number: TP391