Computer Science, 2025, Vol. 52, Issue (7): 189-200. doi: 10.11896/jsjkx.250100108
徐永伟1, 任好盼2, 王棚飞3
XU Yongwei1, REN Haopan2, WANG Pengfei3
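Abstract: Object detection is one of the key technologies in computer vision. It aims to localize targets in images or video and identify their categories, and it is widely used in intelligent transportation, security surveillance, industrial inspection, and other fields. The YOLOv8 object detection method achieves excellent detection accuracy and real-time performance, but it faces severe challenges with complex background interference, small-object detection, and occlusion, where false detections and missed detections are common. To improve detection accuracy, this paper proposes an enhanced object detection algorithm based on YOLOv8 and discusses corresponding application norms. On the technical side, first, a spatial attention mechanism is introduced into the backbone network to strengthen the model's feature extraction for key targets; in addition, an adaptive feature fusion module is designed to improve the model's ability to integrate multi-scale feature maps. Second, data augmentation and a transfer learning strategy are adopted, which effectively alleviate sample imbalance and the limited number of targets in the datasets. Third, a dynamic weight adjustment mechanism for the bounding-box regression loss and the classification loss further improves prediction accuracy. Finally, extensive experiments on five datasets (COCO, PASCAL VOC, Cityscapes, KITTI, and VisDrone) show that the proposed method is more accurate and efficient than recent state-of-the-art methods in detection accuracy and running speed; in particular, the model's robustness and accuracy improve markedly in complex scenes, small-object detection, and occlusion. On the application-norm side, to address the risks to the security of personal image privacy data that arise from large-scale deployment of object detection algorithms, comprehensive application norms are proposed from legal, ethical, and technical perspectives, so that technological progress closely matches the needs of social development.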
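The abstract mentions dynamic weight adjustment between the bounding-box regression loss and the classification loss but does not say how the weights are computed. As an illustration only, the sketch below uses a simplified uncertainty-style weighting, a known multi-task technique in which each loss term is scaled by exp(-s) and regularised by +s for a learnable scalar s. The function names, the learning rate, and the choice of this scheme are assumptions and may differ from the paper's actual mechanism.

```python
import math


def weighted_detection_loss(box_loss, cls_loss, log_var_box, log_var_cls):
    """Combine two loss terms with uncertainty-style dynamic weights.

    Assumed scheme (not taken from the paper): each term is scaled by
    exp(-s) and penalised by +s, where s is a learnable log-variance.
    """
    w_box = math.exp(-log_var_box)
    w_cls = math.exp(-log_var_cls)
    total = w_box * box_loss + log_var_box + w_cls * cls_loss + log_var_cls
    return total, w_box, w_cls


def update_log_var(loss_value, log_var, lr=0.1):
    """One gradient-descent step on s for the term exp(-s) * loss + s."""
    grad = -math.exp(-log_var) * loss_value + 1.0
    return log_var - lr * grad


# Hypothetical usage: with a fixed loss value, s drifts towards log(loss),
# so a persistently larger loss term ends up with a smaller weight.
s_box = 0.0
for _ in range(500):
    s_box = update_log_var(2.0, s_box)
total, w_box, w_cls = weighted_detection_loss(2.0, 1.0, s_box, 0.0)
```

In a real training loop the two scalars would be optimised together with the network parameters; the loop above only shows the direction in which the weights move.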