Computer Science, 2025, Vol. 52, Issue (11A): 250200029-9. doi: 10.11896/jsjkx.250200029

• Database & Big Data & Data Science •

Research on Hybrid Methods for Predicting Students' Online Learning Performance Based on Generative Model

DUAN Chao, WANG Yiqing, WANG Jie, ZHANG Mingyan

  1. Key Laboratory of Intelligent Education Technology and Application of Zhejiang Province, Zhejiang Normal University, Jinhua, Zhejiang 321004, China
  • Online: 2025-11-15 Published: 2025-11-10
  • Corresponding author: ZHANG Mingyan (mingyanzhang@zjnu.edu.cn)
  • About the author: duanchao@zjnu.edu.cn
  • Supported by:
    National Natural Science Foundation of China (62207027, 62177024), the University-Industry Collaborative Education Program of the Ministry of Education (220906424035704), and the Zhejiang Province Educational Science Planning Project (2023SCG369).

Abstract: Learning performance prediction uses students' learning behavior data from online learning platforms to identify academically at-risk students so that teachers can intervene in time. However, the data are imbalanced, which makes accurate identification of at-risk students particularly difficult. Among current solutions, the variational autoencoder (VAE) cannot guarantee that generated samples are reasonable, while the generative adversarial network (GAN) tends to introduce new errors when processing time-series data, and the quality of its generated data declines when either the generator or the discriminator is over-trained or under-trained. To address these issues, this paper proposes a new hybrid method for predicting students' online learning performance based on generative models. First, a VAE that incorporates bidirectional long short-term memory (BiLSTM) is used to initialize the GAN, which lets training start from a more stable point and better captures both the dependencies between earlier and later data points and the periodic characteristics of student behavior sequences. Second, a multi-head attention mechanism is introduced into the discriminator to strengthen its ability to distinguish real data from generated data as it continues its adversarial game with the generator. Finally, the deep generative model is combined with the classical resampling strategy SMOTE (Synthetic Minority Oversampling Technique), following the idea of Blending ensemble learning, which combines the advantages of data-level and algorithm-level approaches and improves the overall generative ability of the model. Extensive experiments on two real student datasets show that the model generates high-quality data and thereby improves the prediction model's ability to identify at-risk students; from the first unit onward, it outperforms the baseline methods on four evaluation metrics.

Key words: Learning performance prediction, Generative model, Bidirectional long short-term memory, Attention mechanism, Ensemble method
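
For readers unfamiliar with the classical resampling component named in the abstract, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is not the authors' implementation and does not cover the BiLSTM-VAE, the GAN, or the Blending step. The function name `smote`, its parameters, and the toy feature values are assumptions made purely for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of equal-length feature vectors (lists of floats).
    k: number of nearest minority neighbours considered per base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Find its k nearest minority neighbours (Euclidean distance).
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        # Interpolate at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: two behaviour features for four at-risk students.
at_risk = [[2.0, 0.0], [3.0, 1.0], [1.0, 0.0], [4.0, 1.0]]
new_rows = smote(at_risk, n_new=6, k=2)
print(len(new_rows))
```

Each synthetic row lies on a line segment between two existing minority samples, so it stays inside the range of the observed minority data. This kind of interpolation treats each feature vector as a static point and does not model temporal order, which is one reason to pair it with a sequence-aware generative model for behavior sequences.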

CLC Number: 

  • TP391