Computer Science ›› 2024, Vol. 51 ›› Issue (8): 192-199. doi: 10.11896/jsjkx.230500071
浦斌1, 梁正友1,2, 孙宇1,2
PU Bin1, LIANG Zhengyou1,2, SUN Yu1,2
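Abstract: Monocular 3D object detection aims to detect 3D objects from a single monocular image, and most existing monocular 3D object detection algorithms are built on classical 2D object detection algorithms. Directly regressing instance depth yields inaccurate depth estimates and therefore poor detection accuracy. To address this problem, a monocular 3D object detection algorithm based on a height-depth constraint and edge feature fusion is proposed. For instance depth estimation, the height-depth constraint is computed from the instance's 3D height and 2D height under the geometric projection relationship, which converts the prediction of instance depth into the prediction of the object's 2D height and 3D height. For objects truncated by the image boundary in monocular images, an edge fusion module based on depthwise separable convolution is adopted to strengthen feature extraction for objects at the image edge. For the multi-scale problem caused by objects lying at different distances in the image, a multi-scale hybrid attention module based on dilated convolution is designed to enhance multi-scale feature extraction on the highest-level feature map. Experimental results show that the proposed method improves detection accuracy on the Car category of the KITTI dataset by 7.11% over the baseline model and outperforms current methods.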
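The height-depth constraint described in the abstract can be illustrated with the standard pinhole-camera relation, in which depth equals the focal length times the object's 3D height divided by its projected 2D height. The following is a minimal sketch under that assumption only; the function name, parameter names, and numeric values are hypothetical, and the abstract does not state the paper's exact formulation or any uncertainty modeling it may use.

```python
def depth_from_heights(focal_px: float, height_3d_m: float, height_2d_px: float) -> float:
    """Recover instance depth (metres) from predicted 3D and 2D heights.

    Assumes an ideal pinhole camera: height_2d_px = focal_px * height_3d_m / depth,
    so depth = focal_px * height_3d_m / height_2d_px.
    """
    if height_2d_px <= 0:
        raise ValueError("2D height must be positive")
    return focal_px * height_3d_m / height_2d_px


# Hypothetical values: a 1.5 m tall car that spans 50 px in an image
# taken with a focal length of 700 px.
depth = depth_from_heights(focal_px=700.0, height_3d_m=1.5, height_2d_px=50.0)
print(depth)  # 21.0
```

Under this relation, an error in the predicted 2D height has a larger effect on depth for distant objects, because their projected height is small.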