Computer Science ›› 2026, Vol. 53 ›› Issue (5): 41-49. doi: 10.11896/jsjkx.250600186

• Intelligent Education Technology •

Intelligent Analysis and Understanding Research Progress on Teachers’ Classroom Behavior

PAN Weiying1, LI Yutong1, MA Miao1,2   

  • 1 School of Artificial Intelligence and Computer Science, Shaanxi Normal University, Xi’an 710119, China
    2 Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi’an 710062, China
  • Received: 2025-06-26 Revised: 2025-09-16 Published: 2026-05-08
  • About author: PAN Weiying, born in 2002, postgraduate. Her main research interests include intelligent educational technology.
    MA Miao, born in 1977, Ph.D., professor, Ph.D. supervisor. Her main research interests include image processing, video analysis, and smart education.
  • Supported by:
    National Natural Science Foundation of China (62377031) and Key Research Project of the Chinese Society of Education (202426676908A).

Abstract: Against the backdrop of digital transformation in China, intelligent technologies represented by artificial intelligence, big data, the Internet of Things, and cloud computing have injected new impetus into education reform. Teachers’ classroom behavior is one of the important external manifestations of teachers’ professional qualities and teaching abilities. Integrating intelligent technologies such as video analysis, speech processing, and text analysis makes it possible to represent the characteristics and patterns of teaching behaviors automatically, which provides an important indicator for constructing digital portraits of teachers. This paper reviews research progress in the intelligent analysis and understanding of teachers’ classroom behavior. First, an indicator system for intelligent analysis is constructed along three dimensions: teachers’ verbal behavior, non-verbal behavior, and the combination of verbal and non-verbal behavior. Then, the intelligent technologies, representative methods, and application practices for recognizing and understanding teachers’ classroom verbal and non-verbal behaviors are summarized from single-modality and cross-modality perspectives. Finally, the paper discusses current research challenges in detecting teachers’ verbal behavior events, semantically aligning verbal and non-verbal behaviors, and coordinating cross-modal information, as well as future trends in modeling teachers’ verbal behavior events, semantic cross-modal fusion, and the application of multimodal large models. These directions have important research significance and practical value for evaluating teaching ability and planning career development when constructing teachers’ digital portraits.

Key words: Teachers’ classroom behavior, Verbal behavior, Nonverbal behavior, Behavior analysis and understanding, Artificial intelligence technology

CLC Number: TP391