Computer Science (计算机科学) ›› 2023, Vol. 50 ›› Issue (11): 241-247. doi: 10.11896/jsjkx.221100169

• Artificial Intelligence •

Attention Based Concept Enhanced Cognitive Diagnosis

YUAN Dongxue, SUN Quansen, FU Peng

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210000, China
  • Received: 2022-11-21 Revised: 2023-03-10 Online: 2023-11-15 Published: 2023-11-06
  • Corresponding author: SUN Quansen (sunquansen@njust.edu.cn)
  • About author: YUAN Dongxue (dxyuan@njust.edu.cn), born in 1996, postgraduate. Her main research interests include educational data mining and related areas. SUN Quansen, born in 1963, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include image recognition and computer vision.

Abstract: Cognitive diagnosis is a fundamental problem in intelligent education systems; it aims to evaluate students' mastery levels on different knowledge concepts. Although current deep learning-based cognitive diagnosis methods perform considerably better than traditional methods, they cannot fully exploit the latent correlations between concepts. To this end, this paper proposes an attention-based concept enhanced cognitive diagnosis (ACECD) model, which obtains more accurate diagnostic results by modeling the relationships between related concepts. Specifically, students, exercises, and concepts are first projected to factor vectors so that complex interactions can be modeled. The concept factors are then fed into a self-attention network to capture the implicit correlations between concepts, and the captured relations are used to enhance the concept factor vectors. Finally, the enhanced concept factors interact with the student factors and exercise factors, and the interaction results are input into the diagnosis module to produce the final diagnosis. In addition, the interaction between the exercise factors and the concept factors is used to correct errors in the manually labeled Q-matrix. The proposed model is compared with other methods on two real-world datasets, and the experimental results show that ACECD effectively improves the diagnostic results.

Key words: Attention, Cognitive diagnosis, Neural network
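
The pipeline outlined in the abstract can be illustrated with a small sketch. The abstract gives no equations, so everything below is an assumption made for illustration and is not the authors' implementation: the function names (`enhance_concepts`, `soft_q_row`, `predict`), the identity query/key/value projections, the residual form of the enhancement, the blending weight `alpha`, and the replacement of the diagnosis module by a weighted sum followed by a sigmoid are all hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(row):
    m = max(row)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def enhance_concepts(concepts):
    """Single-head scaled dot-product self-attention over concept factors.

    Assumption: query/key/value projections are the identity, and the
    attended context is added back to each factor (residual enhancement).
    Returns the enhanced factors and the attention weights.
    """
    d = len(concepts[0])
    enhanced, weights = [], []
    for q in concepts:
        attn = softmax([dot(q, k) / math.sqrt(d) for k in concepts])
        context = [sum(a * v[j] for a, v in zip(attn, concepts)) for j in range(d)]
        enhanced.append([x + c for x, c in zip(q, context)])
        weights.append(attn)
    return enhanced, weights


def soft_q_row(exercise, concepts, q_row, alpha=0.5):
    """Blend an expert-labeled Q-matrix row with exercise-concept affinity.

    Assumption: the correction is a convex combination controlled by alpha.
    """
    return [alpha * q + (1.0 - alpha) * sigmoid(dot(exercise, c))
            for q, c in zip(q_row, concepts)]


def predict(student, exercise, concepts, q_row):
    """Probability that the student answers the exercise correctly."""
    enhanced, _ = enhance_concepts(concepts)
    q_soft = soft_q_row(exercise, enhanced, q_row)
    # Per-concept proficiency and difficulty from factor interactions.
    proficiency = [sigmoid(dot(student, c)) for c in enhanced]
    difficulty = [sigmoid(dot(exercise, c)) for c in enhanced]
    # Stand-in for the diagnosis module: weighted sum, then sigmoid.
    score = sum(q * (p - d) for q, p, d in zip(q_soft, proficiency, difficulty))
    return sigmoid(score)


# Toy usage with made-up factors: three concepts, two latent dimensions.
concept_factors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
student_factor = [0.5, -0.2]
exercise_factor = [0.1, 0.3]
labeled_q = [1, 0, 1]  # expert says the exercise covers concepts 0 and 2
print(predict(student_factor, exercise_factor, concept_factors, labeled_q))
```

In a trained model the factors would be learned embeddings and the final step would typically be a neural network; the sketch only shows where the attention-based enhancement and the Q-matrix correction sit in the data flow.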

CLC Number:

  • TP391.4