Computer Science ›› 2025, Vol. 52 ›› Issue (7): 189-200. doi: 10.11896/jsjkx.250100108

• Computer Graphics & Multimedia •

Object Detection Algorithm Based on YOLOv8 Enhancement and Its Application Norms

XU Yongwei1, REN Haopan2, WANG Pengfei3

  1 School of Criminal Justice, China University of Political Science and Law, Beijing 100088, China
  2 School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081, China
  3 Big Data Center, Ministry of Emergency Management, Beijing 100013, China
  • Received: 2025-01-16 Revised: 2025-04-07 Published: 2025-07-17
  • Corresponding author: WANG Pengfei (feipengwang767@163.com)
  • Author e-mail: xuyongweistudy@126.com
  • About the authors: XU Yongwei, born in 1992, Ph.D., assistant professor, master's supervisor, is a member of CCF (No. V3750M). His main research interests include artificial intelligence and law, and digital law.
    WANG Pengfei, born in 1988, master's degree, engineer. His main research interests include image processing, artificial intelligence, and emergency command.
  • Supported by:
    National Social Science Fund of China (24FFXB068).

Abstract: Object detection is a key technology in computer vision. It aims to locate objects in images or videos and identify their classes, and is widely used in intelligent transportation, security monitoring, and industrial inspection. YOLOv8 achieves excellent detection accuracy and real-time performance, but it still faces severe challenges with complex background interference, small objects, and occlusion, where false positives and missed detections are common. To improve detection accuracy, an enhanced object detection algorithm based on YOLOv8 is proposed, and the corresponding application norms are discussed. On the technical level, first, a spatial attention mechanism is introduced into the backbone network to strengthen feature extraction for key objects, and an adaptive feature fusion module is designed to improve the integration of multi-scale feature maps. Second, data augmentation and transfer learning are used to alleviate sample imbalance and the limited number of objects in the datasets. Third, a dynamic weight adjustment mechanism for the bounding box regression loss and the classification loss further improves prediction accuracy. Finally, extensive experiments on five datasets (COCO, PASCAL VOC, Cityscapes, KITTI, and VisDrone) show that the proposed method outperforms other state-of-the-art methods in detection accuracy and running speed; robustness and accuracy improve markedly in complex scenes, small object detection, and occlusion. On the level of application norms, to address the risks that large-scale deployment of object detection algorithms poses to the security of personal image privacy data, comprehensive application norms are proposed covering law, ethics, and technology, so that technological progress stays closely aligned with the needs of social development.
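
The abstract does not say how the dynamic weight adjustment between the bounding box regression loss and the classification loss is computed. As a rough illustration only, the Python sketch below uses a simplified form of homoscedastic-uncertainty weighting from the multi-task learning literature (Kendall, Gal, and Cipolla, 2018). This choice is an assumption and may differ from the authors' mechanism; the function name `combine_losses` and the parameters `log_var_box` and `log_var_cls` are hypothetical.

```python
import math


def combine_losses(box_loss, cls_loss, log_var_box, log_var_cls):
    """Combine two task losses with uncertainty-based weights.

    Assumption: this is a simplified uncertainty-weighting scheme, not
    necessarily the mechanism used in the paper. Each log_var_* stands
    for a learnable scalar s = log(sigma^2). The weight exp(-s) shrinks
    a loss as s grows, while the additive +s term penalises making s
    arbitrarily large.
    """
    w_box = math.exp(-log_var_box)
    w_cls = math.exp(-log_var_cls)
    total = w_box * box_loss + log_var_box + w_cls * cls_loss + log_var_cls
    return total, (w_box, w_cls)


# With both log-variances at 0 the weights are 1 and the losses simply add.
total, weights = combine_losses(2.0, 1.0, 0.0, 0.0)
print(total, weights)
```

In a training loop the two log-variance scalars would be optimised together with the network weights, so the balance between localisation and classification shifts as training proceeds rather than being fixed by hand.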

Key words: YOLOv8, Object detection, Spatial attention, Adaptive feature fusion, Complex scenes, Application norms

CLC Number:

  • TP391