Computer Science, 2026, Vol. 53, Issue (2): 57-66. doi: 10.11896/jsjkx.250500100

• Educational Data Mining Based on Graph Machine Learning •

CPViG-Net: Students’ Classroom Behavior Recognition Based on Cross-stage Visual Graph Convolution

ZHANG Haopeng1, SHI Zheng2, LIU Feng1,2, SONG Wanru2   

  1. School of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  2. School of Educational Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China
  • Received: 2025-05-26 Revised: 2025-09-15 Published: 2026-02-10
  • About author: ZHANG Haopeng, born in 2001, postgraduate. His main research interests include image processing and AI for education.
    SONG Wanru, born in 1992, Ph.D. Her main research interests include image processing, pattern recognition and AI for education.
  • Supported by:
    National Natural Science Foundation of China (62307025, 62177029) and Project of Education Teaching Reform of Nanjing University of Posts and Telecommunications (JG01723JX71).

Abstract: As educational paradigms evolve from “human-computer collaboration” to “human-intelligence collaborative co-education”, the intelligent evaluation of teaching faces new requirements and challenges. In recent years, evaluation tasks that take student behavior as their starting point have gained widespread attention. To address the diverse student behaviors, heavy occlusion and severe background interference found in real classroom environments, a cross-stage partial vision graph network (CPViG-Net) is proposed to improve the accuracy of student behavior detection in complex classroom settings. Built on a classic object detection framework, the model integrates the dynamic feature modeling ability of the vision GNN and introduces two modules: the partial max-relative graph convolution (PMG) module and the cross-stage partial fusion (CPF) module. The PMG module embeds max-relative graph convolution to capture, for each node, the neighborhood information with the greatest feature difference, which specifically addresses the information loss caused by local occlusion. It also incorporates depthwise separable convolution to reduce the computational cost of the graph convolution. The CPF module reconstructs the feature structure using fully connected layers and leverages the cross-stage connection mechanism of the C2f module to achieve multi-level feature fusion, thereby enhancing the ability of the model to recognize small-scale objects. In addition, the number of nearest neighbors K is tuned to provide an optimization strategy for each dataset. On the public dataset SCB03-S, the mAP@50 of CPViG-Net reaches 70.9%, an improvement of 2 percentage points over the baseline model. Experiments on multiple publicly available datasets demonstrate that the model performs well and is robust to the various challenges of student behavior recognition in real classroom scenarios.
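The max-relative aggregation that the PMG module embeds can be illustrated with a minimal sketch. It follows the general max-relative formulation used in vision GNNs (each node feature is concatenated with the channel-wise maximum of the differences to its K nearest neighbors) and is not the authors' implementation: the function names `knn_indices` and `max_relative_aggregate`, the Euclidean neighbor search and the toy node features are assumptions made for illustration, and the learnable projection that normally follows the concatenation is omitted.

```python
from math import dist


def knn_indices(features, k):
    """For each node, return the indices of its k nearest other nodes (Euclidean distance)."""
    neighbors = []
    for i, x_i in enumerate(features):
        others = [j for j in range(len(features)) if j != i]
        others.sort(key=lambda j: dist(x_i, features[j]))
        neighbors.append(others[:k])
    return neighbors


def max_relative_aggregate(features, k):
    """Concatenate each node feature x_i with max over neighbors j of (x_j - x_i), per channel."""
    aggregated = []
    for x_i, nbrs in zip(features, knn_indices(features, k)):
        relative = [
            max(features[j][c] - x_i[c] for j in nbrs)
            for c in range(len(x_i))
        ]
        aggregated.append(list(x_i) + relative)
    return aggregated


# Hypothetical toy graph: four nodes with two feature channels each.
nodes = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
aggregated = max_relative_aggregate(nodes, k=2)
```

Here `k` plays the role of the nearest neighbor K value mentioned in the abstract: a larger K lets each node draw on more neighbors, at the cost of a larger neighbor search and aggregation.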

Key words: Student behavior, Max-relative graph convolution, Multi-scale object recognition, Occlusion, Depthwise separable convolution

CLC Number: 

  • TP391