Computer Science, 2024, Vol. 51, Issue (11): 174-181. doi: 10.11896/jsjkx.231000009
耿焕同1,2,3, 李嘉兴1, 蒋骏1, 刘振宇1, 范子辰4
GENG Huantong1,2,3, LI Jiaxing1, JIANG Jun1, LIU Zhenyu1, FAN Zichen4
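Abstract: The PID (Proportional-Integral-Derivative) semantic segmentation architecture alleviates a problem of bilateral (two-branch) architectures, in which detail features are easily overwhelmed by the surrounding contextual information (overshoot), and it achieves superior performance. However, the high-resolution boundary branch in this architecture severely slows inference. To address this problem, an efficient PID architecture built on a spatial attention mechanism and a lightweight auxiliary semantic branch is proposed. In this architecture, a lightweight attention fusion module extracts precise contextual information and guides the fusion of different feature information, and a fast aggregation pyramid pooling module quickly aggregates semantic information at multiple scales. A deep supervision training strategy that incorporates the Canny edge detection operator is also designed to strengthen training. Compared with the baseline, the proposed model gains 6% in accuracy at a small cost in latency, and it achieves a good balance between accuracy and speed on the Cityscapes, CamVid and KITTI datasets, with accuracy exceeding that of existing models in the same speed range. On the Cityscapes test set, the proposed model reaches 78.5% accuracy at 120.9 frames/s.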