Computer Science, 2026, Vol. 53, Issue (3): 246-256. doi: 10.11896/jsjkx.241100165

• Computer Graphics & Multimedia •

Student Behavior Detection Method Based on Improved YOLO Algorithm

WANG Xinyu1, GAO Donghuai2, NING Yuwen2, XU Hao2, QI Haonan1   

  1. Network and Data Center, Northwest University, Xi’an 710127, China
  2. Teaching and Research Support Center, Air Force Medical University, Xi’an 710032, China
  • Received: 2024-11-27 Revised: 2025-08-20 Published: 2026-03-12
  • About author: WANG Xinyu, born in 1999, postgraduate. His main research interests include smart teaching and object detection.
    NING Yuwen, born in 1984, associate professor, master’s supervisor. His main research interests include intelligent development technology and application of teaching resources.
  • Supported by:
    Shaanxi Provincial Department of Science and Technology 2022 Key Research and Development Program (2022SF-068), Air Force Medical University Teaching Support Research Program (2022JB-03), and 2023 Shaanxi Undergraduate and Higher Continuing Education Teaching Reform Research Project (23BY206).

Abstract: Student behavior detection in classroom scenarios suffers from large scale variations, severe occlusion, and a computational burden that hinders wide deployment. To address these problems, this paper proposes BDEO-YOLO, a lightweight student classroom behavior detection method based on an improved YOLOv8. First, dynamic convolution is introduced into the C2f module of YOLOv8n, which enhances the model’s adaptability to complex classroom scenarios and its feature expression ability. Second, the multi-scale feature fusion ability of the model is optimized by combining BiFPN and GLSA, and the ELA mechanism is introduced in the Backbone, which enhances the model’s ability to detect small targets and detailed features. Finally, a lightweight detection head structure, one13, is designed to simplify the feature extraction process and significantly reduce the computational burden of the model. Experimental results on the public dataset STBD-08 show that the mAP of BDEO-YOLO reaches 92.2%, 1.3 percentage points higher than that of the original YOLOv8n; the computational cost falls from 8.1 GFLOPs to 4.8 GFLOPs, a 40.7% reduction relative to the original model; and the model size is only 5.7 MB, which verifies the effectiveness of the lightweight design. Validation on the public datasets SCB-Dataset3 and VOC2007 shows that the improved algorithm improves on all performance metrics, which verifies the generalization ability of the model and its robustness to occlusion, scale change, and illumination change in the classroom.
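
The abstract reports the accuracy gain in percentage points and the computational saving as a relative percentage. These are different quantities. The following minimal sketch recomputes both from the reported figures only; the helper name `relative_reduction` and the variable names are illustrative assumptions, and the baseline mAP it derives is implied by the abstract rather than stated in it.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


# Figures as reported in the abstract.
baseline_gflops = 8.1   # original YOLOv8n
improved_gflops = 4.8   # BDEO-YOLO
improved_map = 92.2     # BDEO-YOLO mAP on STBD-08, in percent
map_gain_points = 1.3   # gain over YOLOv8n, in percentage points

# Relative saving in computation (a ratio, expressed in percent).
gflops_reduction = relative_reduction(baseline_gflops, improved_gflops)

# Baseline mAP implied by the abstract (an absolute difference, not a ratio).
baseline_map = improved_map - map_gain_points

print(f"GFLOPs reduction: {gflops_reduction:.1f}%")
print(f"Implied YOLOv8n mAP: {baseline_map:.1f}%")
```

The computed reduction agrees with the 40.7% stated above, since (8.1 - 4.8) / 8.1 ≈ 0.407.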

Key words: Student behavior detection, Lightweight, Dynamic convolution, BiFPN, Attention mechanism

CLC Number: TP391