Computer Science ›› 2026, Vol. 53 ›› Issue (5): 41-49. doi: 10.11896/jsjkx.250600186

• Intelligent Educational Technology •

Intelligent Analysis and Understanding Research Progress on Teachers’ Classroom Behavior

PAN Weiying1, LI Yutong1, MA Miao1,2

  1 School of Artificial Intelligence and Computer Science, Shaanxi Normal University, Xi’an 710119, China
  2 Key Laboratory of Modern Teaching Technology, Ministry of Education, Xi’an 710062, China
  • Received: 2025-06-26 Revised: 2025-09-16 Online: 2026-05-08
  • Corresponding author: MA Miao (mmthp@snnu.edu.cn)
  • About the authors: PAN Weiying (panwy@snnu.edu.cn), born in 2002, postgraduate. Her main research interests include intelligent educational technology.
    MA Miao, born in 1977, Ph.D., professor, Ph.D. supervisor. Her main research interests include image processing, video analysis and smart education.
  • Supported by:
    National Natural Science Foundation of China (62377031) and Key Research Project of the Chinese Society of Education (202426676908A).

Abstract: Against the backdrop of China’s digital transformation, intelligent technologies represented by artificial intelligence, big data, the Internet of Things, and cloud computing have injected new momentum into education reform. Teachers’ classroom behavior is an important external manifestation of their professional qualities and teaching abilities. With intelligent technologies such as video analysis, speech processing, and text analysis, the characteristics and patterns of teaching behaviors can be represented automatically, providing key indicators for constructing digital portraits of teachers. This paper reviews research progress in the intelligent analysis and understanding of teachers’ classroom behavior. First, an indicator system for intelligent analysis is constructed along three dimensions: teachers’ verbal behavior, non-verbal behavior, and the combination of verbal and non-verbal behavior. Then, the intelligent technologies, representative methods, and application practices used to recognize and understand teachers’ classroom verbal and non-verbal behaviors are summarized from single-modality and cross-modality perspectives. Finally, current challenges in detecting teachers’ verbal behavior events, semantically aligning verbal and non-verbal behaviors, and coordinating cross-modal information are discussed, along with future trends in modeling teachers’ verbal behavior events, cross-modal semantic fusion, and the application of multimodal large models. The review has research significance and practical value for evaluating teaching ability and planning career development in the construction of teachers’ digital portraits.

Key words: Teachers’ classroom behavior, Verbal behavior, Non-verbal behavior, Behavior analysis and understanding, Artificial intelligence technology

CLC Number:

  • TP391