Computer Science, 2025, Vol. 52, Issue (11): 184-195. doi: 10.11896/jsjkx.241100107

• Computer Graphics & Multimedia •

Object Detection Based on Deep Feature Enhancement and Path Aggregation Optimization

WANG Xiaofeng, HUANG Junjun, TAN Wenya, SHEN Zixuan

  1. School of Computer Science and Technology, Wuhan University of Science and Technology, Wuhan 430070, China
    Hubei Provincial Key Laboratory of Intelligent Information Processing and Real-time Industrial System, Wuhan University of Science and Technology, Wuhan 430070, China
  • Received: 2024-11-18 Revised: 2025-02-16 Online: 2025-11-15 Published: 2025-11-06
  • Corresponding author: HUANG Junjun (3514469387@qq.com)
  • About author: WANG Xiaofeng (wangxiaofeng@wust.edu.cn), born in 1978, Ph.D., professor, is a member of CCF (No.A8319M). His main research interests include object detection and image super resolution.
    HUANG Junjun, born in 1998, postgraduate. His main research interests include object detection and image super resolution.
  • Supported by:
    National Natural Science Foundation of China (62302351) and Natural Science Foundation of Hubei Province (2022CFB018).

Abstract: During the feed-forward process of a deep network, the feature information of the input data is gradually abstracted and compressed, so some of the feature information that is crucial for object detection is diluted or lost. Based on YOLOv11n, an object detection method with deep feature enhancement and path aggregation optimization is proposed. Firstly, a Global-Local Feature Enhancement Module (GLFEM) is designed, which combines the local and global features of the feature map to strengthen the expressive ability of deep network features. Then, an Adaptive Feature Enhancement Module (AFEM) is designed, which dynamically enhances the feature extraction ability of the deep network according to the reliability of the features. Finally, the path aggregation feature pyramid network is optimized to fuse feature information across different levels and reduce the semantic information difference between levels. Experimental results on three public datasets, VisDrone, NWPU VHR-10, and TinyPerson, show that the proposed method improves average detection accuracy over current state-of-the-art object detectors. Experiments on the self-built dataset AirportTiny show that the method also performs well there, indicating good generalization ability.

Key words: Object detection, Deep network, Path aggregation, Feature information, Feature enhancement

CLC Number:

  • TP391.4