Computer Science ›› 2026, Vol. 53 ›› Issue (3): 246-256. doi: 10.11896/jsjkx.241100165
WANG Xinyu1, GAO Donghuai2, NING Yuwen2, XU Hao2, QI Haonan1
Abstract: To address the obstacles that keep student behavior detection from wide deployment in classroom settings, namely large scale variation, severe occlusion, and high computational cost, a lightweight classroom behavior detection method based on an improved YOLOv8, BDEO-YOLO, is proposed. First, dynamic convolution is introduced into YOLOv8n to improve its C2f module, strengthening the model's adaptability to complex classroom scenes and its feature representation ability. Second, combining a Bidirectional Feature Pyramid Network (BiFPN) with a Global-to-Local Spatial Aggregation (GLSA) module optimizes the model's multi-scale feature fusion, and an Efficient Local Attention (ELA) mechanism is added to the backbone to improve the detection of small objects and fine detail. Finally, a lightweight detection head, one13, is designed to simplify feature extraction and substantially reduce the model's computational load. Experiments on the public STBD-08 dataset show that BDEO-YOLO reaches 92.2% mAP, 1.3 percentage points above the original YOLOv8n, while computation drops from 8.1 GFLOPs to 4.8 GFLOPs, a 40.7% reduction, and the model occupies only 5.7 MB, confirming the effectiveness of the lightweight design. Further experiments on the public SCB-Dataset3 and VOC2007 datasets show improvements on all performance metrics, demonstrating the model's generalization ability and its robustness to occlusion, scale variation, and illumination changes in the classroom.
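The dynamic convolution used to improve the C2f module follows the standard formulation: K expert kernels are aggregated into a single input-dependent kernel via softmax attention. The sketch below illustrates only that aggregation rule under simplifying assumptions (kernels are flat lists of scalars, and the attention logits are given directly rather than computed from the input); the function name is ours, not from the paper's code.

```python
import math

def dynamic_conv_weights(attention_logits, kernels):
    """Aggregate K expert kernels into one kernel with softmax attention,
    as in dynamic convolution: W = sum_k pi_k * W_k, pi = softmax(logits).
    Here each kernel is a flat list of floats standing in for a weight tensor."""
    m = max(attention_logits)                         # subtract max for numerical stability
    exps = [math.exp(a - m) for a in attention_logits]
    total = sum(exps)
    pis = [e / total for e in exps]                   # attention weights pi_k, sum to 1
    # Elementwise weighted sum over the K kernels.
    return [sum(pi * k[i] for pi, k in zip(pis, kernels))
            for i in range(len(kernels[0]))]

# Example: two experts with equal attention collapse to their average kernel.
kernel = dynamic_conv_weights([0.0, 0.0], [[1.0, 2.0], [3.0, 4.0]])
```

Because the attention weights depend on the input in the full method, each sample is convolved with a different aggregated kernel, which is what gives the C2f module its added adaptability at little extra FLOP cost.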