Computer Science, 2024, Vol. 51, Issue (6A): 230700185-8. doi: 10.11896/jsjkx.230700185

• Image Processing & Multimedia Technology •


Multi-feature Fusion for Road Panoramic Driving Detection Based on YOLOP-L

LYU Jialu, ZHOU Li, JU Yongfeng   

  1. School of Electronics and Control Engineering,Chang’an University,Xi’an 710064,China
  • Published:2024-06-06
  • Corresponding author:ZHOU Li(zhouli180172@chd.edu.cn)
  • About author:LYU Jialu,born in 2001,postgraduate(ll510102@163.com).Her main research interests include object detection and image processing.
    ZHOU Li,born in 1987,Ph.D.,professor.His main research interests include image processing and scientific visualization.

Abstract: Traffic image detection from the driver's perspective has become an important research direction in the transportation field, and jointly extracting multiple features such as vehicles, roads, and traffic signs has become an urgent task for understanding the diversity of road information. Previous studies have made considerable progress in feature extraction for single-class object detection; however, such methods do not transfer well to detection tasks with markedly different features, and fusion training tends to sacrifice the accuracy of individual feature detection. To address the diverse and complex road information within the driver's field of view, this paper proposes YOLOP-L, a detection model based on multi-feature fusion training that can jointly train on traffic targets with multiple different features while preserving the accuracy of each individual detection task. First, to resolve the incomplete expression of semantic information during feature fusion, the proposed SP-LNet module combines an FPN with a bidirectional feature network for deeper fusion, so that the extracted information is more complete and the detection of small road targets improves. Second, a newly designed segmentation head based on depthwise separable convolution fuses semantic information with local features, further improving both the accuracy and the speed of multi-feature fusion training. Third, the GDL-Focal multi-class hybrid loss function focuses on hard samples and addresses the imbalance of sample features. Finally, comparative experiments show that YOLOP-L runs faster than the original YOLOP network; recall improves by 2.2% on the vehicle detection task; on the lane line detection task, accuracy improves by 2.8%, and lane line IoU is 2.45% lower than that of HybridNets but 1.95% higher than that of YOLOP; on the drivable area segmentation task, overall detection performance improves by 1.1%. The results show that, on the challenging BDD100K dataset, YOLOP-L effectively alleviates insufficient detection accuracy and missing segmentation in complex scenes, improving the accuracy and robustness of jointly trained vehicle recognition, lane line detection, and drivable area segmentation.
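The abstract describes SP-LNet only at a high level: an FPN combined with a bidirectional feature network for deeper fusion. As a rough illustration of the kind of bidirectional weighted fusion involved, the sketch below implements the fast normalized fusion popularized by EfficientDet-style BiFPNs (reference [14]); the module name, shapes, and two-input wiring are illustrative assumptions, not the authors' SP-LNet code.

```python
# Hedged sketch: BiFPN-style fast normalized feature fusion. SP-LNet's
# exact wiring is not given on this page; this shows only the generic
# building block that bidirectional feature pyramids repeat at each scale.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WeightedFusion(nn.Module):
    """Fuse same-shaped feature maps with learned non-negative weights."""
    def __init__(self, num_inputs, eps=1e-4):
        super().__init__()
        self.weights = nn.Parameter(torch.ones(num_inputs))
        self.eps = eps

    def forward(self, feats):            # feats: list of (N, C, H, W)
        w = F.relu(self.weights)         # keep fusion weights non-negative
        w = w / (w.sum() + self.eps)     # normalize without softmax (faster)
        return sum(wi * fi for wi, fi in zip(w, feats))

# Toy usage: fuse a top-down FPN feature with a same-scale lateral feature.
fuse = WeightedFusion(num_inputs=2)
p4_td = torch.randn(1, 64, 32, 32)      # top-down pathway feature
c4 = torch.randn(1, 64, 32, 32)         # lateral backbone feature
p4_out = fuse([p4_td, c4])              # fused map, still (1, 64, 32, 32)
```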

Key words: Panoramic driving, Multi-feature fusion, Vehicle detection, Drivable area detection, Lane line detection, Bidirectional feature pyramid network
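Continuing the illustrations, the abstract's second design point is a segmentation head rebuilt around depthwise separable convolution, but its architecture is not spelled out here. Below is a minimal sketch of such a head under the standard depthwise-plus-pointwise factorization; the channel counts, depth, and 8x upsampling factor are hypothetical, not the exact YOLOP-L head.

```python
# Hedged sketch of a depthwise separable convolution block, as used in
# lightweight segmentation heads. Channel sizes and the two-block layout
# are illustrative assumptions, not the exact YOLOP-L architecture.
import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_ch).
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride, 1,
                                   groups=in_ch, bias=False)
        # Pointwise: 1x1 conv mixes channels, fusing per-channel responses.
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

class ToySegHead(nn.Module):
    """Illustrative head: map fused neck features to a per-pixel mask."""
    def __init__(self, in_ch=128, mid_ch=64, num_classes=2):
        super().__init__()
        self.block1 = DepthwiseSeparableConv(in_ch, mid_ch)
        self.block2 = DepthwiseSeparableConv(mid_ch, mid_ch)
        self.classifier = nn.Conv2d(mid_ch, num_classes, 1)
        self.up = nn.Upsample(scale_factor=8, mode="bilinear",
                              align_corners=False)

    def forward(self, fused):            # fused: (N, in_ch, H/8, W/8)
        return self.up(self.classifier(self.block2(self.block1(fused))))
```

The factorization replaces a dense k x k convolution with a per-channel k x k filter plus a 1 x 1 channel mixer, cutting multiply-adds by roughly a factor of 1/C_out + 1/k^2, which is consistent with the speed gains the abstract reports.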

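Finally, the GDL-Focal multi-class hybrid loss is named in the abstract but not written out. A common way to combine Generalized Dice Loss (which up-weights rare classes) with Focal Loss (which down-weights easy pixels) is a weighted sum, sketched below; the mixing weight lambda_focal and gamma are illustrative assumptions, not values from YOLOP-L.

```python
# Hedged sketch: one common GDL + Focal combination for class-imbalanced
# segmentation. The paper's exact GDL-Focal formula is not given here;
# lambda_focal and gamma below are illustrative, not YOLOP-L's values.
import torch
import torch.nn.functional as F

def generalized_dice_loss(logits, target_onehot, eps=1e-6):
    """logits: (N, C, H, W); target_onehot: (N, C, H, W) in {0, 1}."""
    probs = torch.softmax(logits, dim=1)
    # Per-class weights ~ inverse squared volume, so rare classes count more.
    vol = target_onehot.sum(dim=(0, 2, 3))                   # (C,)
    w = 1.0 / (vol * vol + eps)
    inter = (probs * target_onehot).sum(dim=(0, 2, 3))       # (C,)
    union = (probs + target_onehot).sum(dim=(0, 2, 3))       # (C,)
    dice = (2.0 * (w * inter).sum() + eps) / ((w * union).sum() + eps)
    return 1.0 - dice

def focal_loss(logits, target, gamma=2.0):
    """target: (N, H, W) long class indices; down-weights easy pixels."""
    logp = F.log_softmax(logits, dim=1)
    logp_t = logp.gather(1, target.unsqueeze(1)).squeeze(1)  # (N, H, W)
    p_t = logp_t.exp()
    return (-((1.0 - p_t) ** gamma) * logp_t).mean()

def gdl_focal_loss(logits, target, num_classes, lambda_focal=1.0):
    onehot = F.one_hot(target, num_classes).permute(0, 3, 1, 2).float()
    return (generalized_dice_loss(logits, onehot)
            + lambda_focal * focal_loss(logits, target))
```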


CLC number: TP391

References
[1]ZOU Q,JIANG H,DAI Q,et al.Robust Lane Detection From Continuous Driving Scenes Using Deep Neural Networks[J].IEEE Transactions on Vehicular Technology,2020,69(1):41-54.
[2]SEO Y W,RAJKUMAR R R.Detection and tracking of boundary of unmarked roads[C]//International Conference on Information Fusion.IEEE,2014.
[3]DONG-SI T C,GUO D,YAN C H,et al.Robust extraction of shady roads for vision-based UGV navigation[C]//2008 IEEE/RSJ International Conference on Intelligent Robots and Systems,Acropolis Convention Center,Nice,France.IEEE,2008.
[4]KALAKI A S,SAFABAKHSH R.Current and adjacent lanes detection for an autonomous vehicle to facilitate obstacle avoidance using a monocular camera[C]//2014 Iranian Conference on Intelligent Systems(ICIS).IEEE,2014:1-6.
[5]YU L,TAN H,BANSAL M,et al.A Joint Speaker-Listener-Reinforcer Model for Referring Expressions[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2017:3521-3529.
[6]WU F,XU Z,YANG Y,et al.An End-to-End Approach to Natural Language Object Retrieval via Context-Aware Deep Reinforcement Learning[J].arXiv:1703.07579,2017.
[7]DUAN K,XIE L,QI H,et al.Corner Proposal Network for Anchor-free,Two-stage Object Detection[C]//European Conference on Computer Vision.2020.
[8]REN S,HE K,GIRSHICK R,et al.Faster R-CNN:Towards Real-Time Object Detection with Region Proposal Networks[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(6):1137-1149.
[9]GIRSHICK R.Fast R-CNN[C]//2015 IEEE International Conference on Computer Vision(ICCV).2015:1440-1448.
[10]HE K,ZHANG X,REN S,et al.Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2015,37(9):1904-1916.
[11]LIU W,ANGUELOV D,ERHAN D,et al.SSD:Single Shot MultiBox Detector[C]//European Conference on Computer Vision.2016.
[12]MA J,SHAO W,YE H,et al.Arbitrary-Oriented Scene Text Detection via Rotation Proposals[J].IEEE Transactions on Multimedia,2018,20(11):3111-3122.
[13]REDMON J,DIVVALA S,GIRSHICK R,et al.You Only Look Once:Unified,Real-Time Object Detection[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2016:779-788.
[14]TAN M X,PANG R,LE Q V.EfficientDet:Scalable and Efficient Object Detection[C]//2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).2020:10778-10787.
[15]PAN X,SHI J,LUO P,et al.Spatial As Deep:Spatial CNN for Traffic Scene Understanding[C]//AAAI Conference on Artificial Intelligence.2018.
[16]ZHENG T,FANG H,ZHANG Y,et al.RESA:Recurrent Feature-Shift Aggregator for Lane Detection[C]//AAAI Conference on Artificial Intelligence.2021.
[17]LI X,LI J,HU X,et al.Line-CNN:End-to-End Traffic Line Detection With Line Proposal Unit[J].IEEE Transactions on Intelligent Transportation Systems,2020,21(1):248-258.
[18]TABELINI L,BERRIEL R,PAIXAO T M,et al.Keep your Eyes on the Lane:Real-time Attention-guided Lane Detection[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).2021:294-302.
[19]TABELINI L,BERRIEL R,PAIXAO T M,et al.PolyLaneNet:Lane Estimation via Deep Polynomial Regression[C]//2020 25th International Conference on Pattern Recognition(ICPR).2020:6150-6156.
[20]FENG Z Y,GUO S,TAN X,et al.Rethinking Efficient LaneDetection via Curve Modeling[C]//2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition(CVPR).2022:17041-17049.
[21]MA B,LAKSHMANAN S,HERO A,et al.Simultaneous detection of lane and pavement boundaries using model-based multisensor fusion[J].IEEE Transactions on Intelligent Transportation Systems,2000:135-147.
[22]ZHOU L,FANG J,JU Y,et al.Multi-Saliency Detection via Instance Specific Element Homology[C]//2017 International Conference on Digital Image Computing:Techniques and Applications(DICTA).Sydney,NSW,Australia,2017:1-8.
[23]XU Z H,LIU Y,GAN L,et al.RNGDet:Road Network Graph Detection by Transformer in Aerial Images[J].IEEE Transactions on Geoscience and Remote Sensing,2022.
[24]WU D S,LIAO M,ZHANG W,et al.YOLOP:You Only Look Once for Panoptic Driving Perception[J].Machine Intelligence Research,2022,19(6):550-562.
[25]VU D,NGO B,PHAN H N,et al.HybridNets:End-to-End Perception Network[J].arXiv:2203.09035,2022.