Computer Science ›› 2024, Vol. 51 ›› Issue (10): 56-66. doi: 10.11896/jsjkx.240400109

• Technology and Application of Intelligent Education •

Perception and Analysis of Teaching Process Based on Video Understanding

DUAN Xinran, WANG Mei, HAN Tianli, ZHOU Hongyu, GUO Junqi, JI Weixing, HUANG Hua   

  1. School of Artificial Intelligence, Beijing Normal University, Beijing 100875, China
  • Received: 2024-04-15 Revised: 2024-06-27 Online: 2024-10-15 Published: 2024-10-11
  • About author: DUAN Xinran, born in 2001, undergraduate. His main research interests include computer vision and machine learning.
    HUANG Hua, born in 1975, professor, Ph.D. supervisor, is a member of CCF (No. 09499D). His main research interests include video processing and computer graphics.
  • Supported by:
    National Natural Science Foundation of China (62306043).

Abstract: The classroom is the central arena of education, and monitoring and evaluating teachers' instructional activities there is an effective means of improving teaching quality. However, existing manual evaluation methods suffer from low efficiency, potential disruption of classroom dynamics, and subjective errors, which makes satisfactory results difficult to achieve. Given the rapid development of artificial intelligence (AI) technology, this paper proposes integrating human-centered intelligent analysis techniques into the teaching process for real-time recognition and analysis of teachers. First, a face detection algorithm is employed to locate the teacher and estimate the teacher's movement. Second, a gaze estimation algorithm is used to detect the teacher's focus of attention. Lastly, skeleton-based action recognition and facial expression recognition are employed to perceive and analyze the teacher's actions and expressions. Quantitative statistics on these indicators provide a more efficient and objective understanding of a teacher's teaching characteristics and thereby help teachers improve their teaching quality. In experiments run under the same configuration environment, the modules of the system perform well on their corresponding tasks and meet the requirements of teaching scenarios. Evaluation on real-world teaching videos indicates that the system can accurately perceive teachers' instructional states and provide constructive feedback for enhancing teaching quality.
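The quantitative statistics mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `FrameObservation` record, its field names, the label strings, and the `summarize` function are all hypothetical, and the sketch simply assumes that upstream face detection, gaze estimation, action recognition and expression recognition models have already produced one set of outputs per video frame.

```python
from collections import Counter
from dataclasses import dataclass
from math import hypot
from typing import Dict, Iterable, List, Optional, Tuple


@dataclass
class FrameObservation:
    """Hypothetical per-frame outputs of the upstream perception models."""
    face_center: Optional[Tuple[float, float]]  # (x, y) in pixels; None if no face was detected
    gaze_target: Optional[str]                  # assumed label, e.g. "students" or "board"
    action: Optional[str]                       # assumed action label
    expression: Optional[str]                   # assumed expression label


def share(labels: Iterable[Optional[str]]) -> Dict[str, float]:
    """Fraction of labelled frames that carry each label (frames with None are ignored)."""
    present = [label for label in labels if label is not None]
    if not present:
        return {}
    counts = Counter(present)
    return {label: n / len(present) for label, n in counts.items()}


def summarize(frames: List[FrameObservation]) -> dict:
    """Aggregate per-frame observations into simple teaching indicators."""
    # Path length of the face centre, skipping frames without a detection.
    centers = [f.face_center for f in frames if f.face_center is not None]
    path = sum(
        hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in zip(centers, centers[1:])
    )
    return {
        "detection_rate": len(centers) / len(frames) if frames else 0.0,
        "displacement_px": path,
        "gaze": share(f.gaze_target for f in frames),
        "action": share(f.action for f in frames),
        "expression": share(f.expression for f in frames),
    }
```

Calling `summarize` on a list of `FrameObservation` records returns the share of frames per gaze target, action and expression, together with the accumulated displacement of the face centre in pixels. Converting that displacement to a physical distance would need camera calibration, which this sketch does not attempt.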

Key words: Teaching quality assessment, Video understanding, Displacement estimation, Gaze estimation, Action recognition, Facial expression recognition

CLC Number: TP391