Computer Science (计算机科学), 2024, Vol. 51, Issue (11A): 231000106-7. doi: 10.11896/jsjkx.231000106
黄玲娃, 崔文成, 邵虹
HUANG Lingwa, CUI Wencheng, SHAO Hong
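Abstract: To address the difficulty of detecting and recognizing occluded pedestrians, together with low detection accuracy and a high miss rate, this work optimizes the structure of YOLOv7 and proposes a pedestrian detection network based on multi-layer feature fusion, aimed at improving detection accuracy for occluded pedestrians. In the feature-extraction part of the backbone network, the method adopts an ELAN-C module to strengthen the extraction of pedestrian feature information. In the multi-scale feature fusion part, a global attention mechanism is introduced to form multi-layer feature fusion; cross-dimensional information interaction, with particular attention to positional information, strengthens the feature representation of detection targets. In addition, EIoU is adopted as the loss function to speed up model convergence and further improve the localization accuracy of the detection boxes. The model is trained and validated on the public CityPersons dataset, where the log-average miss rate MR⁻² falls by 0.55%, 0.91%, 1.78%, and 1.68% on the Bare, Partial, Reasonable, and Heavy subsets respectively, effectively reducing missed detections.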
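The abstract names EIoU as the bounding-box regression loss but does not spell it out. As a rough illustration only, the sketch below follows the commonly published EIoU formulation: 1 minus IoU, plus a normalized center-distance penalty, plus separate width and height penalties normalized by the smallest enclosing box. The function name `eiou_loss`, the `(x1, y1, x2, y2)` box layout, and the `eps` stabilizer are assumptions made for this example, not details taken from the paper, whose actual implementation may differ.

```python
def eiou_loss(box_pred, box_gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; not the authors' code.
    """
    px1, py1, px2, py2 = box_pred
    gx1, gy1, gx2, gy2 = box_gt

    # Widths and heights of the predicted and ground-truth boxes.
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Width and height of the smallest box enclosing both boxes.
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)

    # Squared distance between the two box centers.
    dx = (px1 + px2) / 2 - (gx1 + gx2) / 2
    dy = (py1 + py2) / 2 - (gy1 + gy2) / 2
    center_dist2 = dx * dx + dy * dy

    # Center, width, and height penalties, each normalized by the enclosing box.
    center_term = center_dist2 / (cw * cw + ch * ch + eps)
    width_term = (pw - gw) ** 2 / (cw * cw + eps)
    height_term = (ph - gh) ** 2 / (ch * ch + eps)

    return 1.0 - iou + center_term + width_term + height_term
```

In this sketch, the width and height terms penalize size mismatch directly rather than through an aspect-ratio term, which is the property usually credited with faster convergence. Two identical boxes give a loss near zero, and two disjoint boxes give a loss above 1 because the IoU term vanishes while the center penalty remains.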