基于深度强化学习与程序分析的OJ习题推荐模型

doi:10.11896/jsjkx.220600260

计算机科学 ›› 2023, Vol. 50 ›› Issue (8): 58-67.doi: 10.11896/jsjkx.220600260

• 数据库&大数据&数据科学 • 上一篇下一篇

基于深度强化学习与程序分析的OJ习题推荐模型

金天成^1,2, 窦亮², 张伟^1,2, 肖春芸², 刘峰^1,2, 周爱民^1,2

1 华东师范大学上海智能教育研究院上海 200062
2 华东师范大学计算机科学与技术学院上海 200062

收稿日期:2022-06-28 修回日期:2022-11-05 出版日期:2023-08-15 发布日期:2023-08-02
通讯作者: 窦亮(ldou@cs.ecnu.edu.cn)
作者简介:(52205901026@stu.ecnu.edu.cn)
基金资助:
国家自然科学基金青年科学基金(61907015);上海市科学技术委员会高新技术领域项目(20511102502)

OJ Exercise Recommendation Model Based on Deep Reinforcement Learning and Program Analysis

JIN Tiancheng^1,2, DOU Liang², ZHANG Wei^1,2, XIAO Chunyun², LIU Feng^1,2, ZHOU Aimin^1,2

1 Shanghai Institute of AI for Education,East China Normal University,Shanghai 200062,China
2 School of Computer Science and Technology,East China Normal University,Shanghai 200062,China

Received:2022-06-28 Revised:2022-11-05 Online:2023-08-15 Published:2023-08-02
About author:JIN Tiancheng,born in 1995,Ph.D candidate.His main research interests include educational data mining,recommender system and program analysis.
DOU Liang,born in 1980,Ph.D,asso-ciate professor,is a member of China Computer Federation.Her main research interests include software me-thods and AI for education.
Supported by:
Young Scientists Fund of the National Natural Science Foundation of China(61907015) and Shanghai Committee of Science and Technology,China(20511102502).

摘要/Abstract

摘要： 当前Online Judge系统(简称OJ)上存有大量习题,导致学生很难根据自己的知识水平和学习需求快速地找到合适的习题,因此需要设计模型向学生推荐习题。然而,由于OJ的独特性以及程序设计能力评价的复杂性,现有推荐模型不能较好地完成OJ习题推荐任务,主要问题包括:OJ习题知识点标签不足与特有的命题风格使模型难以挖掘习题之间的相关性;学生所提交程序的实际正确性与OJ判定结果存在不一致的情况,使得模型对学生知识状态的评估产生偏差;现有模型较难提供可使学生程序设计能力得到显著增长的习题。据此,提出了一种基于深度强化学习与程序分析的OJ习题推荐模型。首先,分析习题的最优解来挖掘习题之间的相关性;然后,比较学生所提交程序与习题最优解的相似性来检验学生所提交程序的实际正确性,使模型能够更准确地估计学生的知识状态;最后,利用深度强化学习技术并使用知识追踪模型作为学生模拟器,以学生模拟器在解答习题推荐模型所提供的习题前后在所有习题上的表现差异作为奖励,使模型学习到怎样的习题才能够最大程度地提升学生程序设计能力,并将这样的习题推荐给学生。在业界知名OJ系统CodeForces和Libre数据集上进行实验,结果表明该模型相比目前常见的推荐模型具有更优的性能。

关键词: 推荐系统, 深度强化学习, 程序分析, 知识追踪, 在线判题

Abstract: At present,there are a large number of exercises on the existing programming Online Judge systems(OJ),which makes it difficult for students to quickly find suitable exercises according to their own knowledge level and learning demand.Therefore,it is necessary to design a model to recommend suitable exercises to students.However,due to uniqueness of OJ and complexity of programming ability evaluation,existing recommendation model can not complete OJ exercise recommendation task well,the main problems include:OJ exercises' lack of knowledge label and unique proposition style make it difficult for existing models to mine correlation between exercises; actual correctness of the program submitted by student is inconsistent with OJ judgement result,which leads to deviation of students' knowledge state estimated by models; existing models are difficult to provide exercises that increase students' programming ability most significantly.Based on this,this paper proposes an OJ exercise recommendation model based on deep reinforcement learning and program analysis.Firstly,analyzing optimal solution of exercises to mine correlations between exercises.Then,comparing the similarity between programs submitted by students and optimal solution of exercises to check actual correctness of the programs submitted by students,so that knowledge state of students can be estimated more accurately.Finally,using deep reinforcement learning technology,taking knowledge tracking model as student simulator and treating student simulator's performance difference on all the exercises before and after answering exercises provided by exercise re-commendation model as reward,so that exercise recommendation model can learn which exercise is able to improve the students' programming ability to the greatest extent,and recommend such exercises to students.This paper conducts extensive experiments on two datasets CodeForces and Libre of the well-known OJ system,and experimental results show that the proposed model can achieve higher performance than the state-of-the-art recommendation models.

Key words: Recommender system, Deep reinforcement learning, Program analysis, Knowledge tracing, Online judge

中图分类号:

TP301

金天成, 窦亮, 张伟, 肖春芸, 刘峰, 周爱民. 基于深度强化学习与程序分析的OJ习题推荐模型[J]. 计算机科学, 2023, 50(8): 58-67. https://doi.org/10.11896/jsjkx.220600260

JIN Tiancheng, DOU Liang, ZHANG Wei, XIAO Chunyun, LIU Feng, ZHOU Aimin. OJ Exercise Recommendation Model Based on Deep Reinforcement Learning and Program Analysis[J]. Computer Science, 2023, 50(8): 58-67. https://doi.org/10.11896/jsjkx.220600260

参考文献

[1]GONG J,WANG S,WANG J,et al.Attentional graph convolutional networks for knowledge concept recommendation in moocs in a heterogeneous view[C]//Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval.Xi'an:ACM,2020:79-88.
[2]SHAO E,GUO S,PARDOS Z A.Degree Planning with PLAN-BERT:Multi-Semester Recommendation Using Future Courses of Interest[C]//Proceedings of the AAAI Conference on Artificial Intelligence.Vancouver:AAAI,2021:14920-14929.
[3]HUO Y,WONG D F,NI L M,et al.Knowledge modeling via contextualized representations for LSTM-based personalized exercise recommendation[J].Information Sciences,2020,523:266-278.
[4]ZHAO J,BHATT S,THILLE C,et al.Interpretable persona-lized knowledge tracing and next learning activity recommendation[C]//Proceedings of the Seventh ACM Conference on Learning@ Scale.New York:ACM,2020:325-328.
[5]KANG W,ZHANG L,LI B,et al.Personalized exercise recommendation via implicit skills[C]//Proceedings of the ACM Tu-ring Celebration Conference－China.Chengdu:ACM,2019:1-6.
[6]MA X R,XU Y,ZHU Q X.Personalized Exercises Recommendation Method Based on Deep Knowledge Tracing[J].Journal of Chinese Computer Systems,2020,41(5):990-995.
[7]XIONG H J,SONG Y F,ZHANG P,et al.Personalized Question Recommendation Based on Autoencoder and Two-step Collaborative Filtering[J].Computer Science,2019,46(11A):172-177.
[8]QI B,ZOU H X,WANG Y,et al.Questions RecommendationBased on Collaborative Filtering and Cognitive Diagnosis[J].Computer Science,2019,46(11):235-240.
[9]ZHU T Y,HUANG Z Y,CHEN E H,et al.Cognitive DiagnosisBased Personalized Question Recommendation[J].Chinese Journal of Computers,2017,41(1):176-191.
[10]FAN X.Item response theory and classical test theory:An empirical comparison of their item/person statistics[J].Educational and Psychological Measurement,1998,58(3):357-381.
[11]DE LA TORRE J.DINA model and parameter estimation:A didactic[J].Journal of Educational and Behavioral Statistics,2009,34(1):115-130.
[12]GAO W,LIU Q,HUANG Z,et al.Rcd:Relation map driven cognitive diagnosis for intelligent education systems[C]//Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval.Montreal:ACM,2021:501-510.
[13]CORBETT A T,ANDERSON J R.Knowledge tracing:Mode-ling the acquisition of procedural knowledge[J].User Modeling and User-adapted Interaction,1994,4(4):253-278.
[14]PIECH C,BASSEN J,HUANG J,et al.Deep knowledge tracing[C]//Proceedings of the Advances in Neural Information Processing Systems.Quebec:MIT Press,2015:505-513.
[15]ZHANG J,SHI X,KING I,et al.Dynamic key-value memory networks for knowledge tracing[C]//Proceedings of the 26th International Conference on World Wide Web.Perth:ACM,2017:765-774.
[16]CHOI Y,LEE Y,CHO J,et al.Towards an appropriate query,key,and value computation for knowledge tracing[C]//Procee-dings of the Seventh ACM Conference on Learning@ Scale.New York:ACM,2020:341-344.
[17]Renmin University of China.Libre OJ[EB/OL].(2022-03-25) [2022-05-11].https://loj.ac.
[18]WANG C Q.Luogu OJ[EB/OL].(2022-05-11) [2022-05-11].https://www.luogu.com.cn.
[19]LIU Q,HUANG Z,HUANG Z Y,et al.Finding similar exercises in online education systems[C]//Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Disco-very & Data Mining.London:ACM,2018:1821-1830.
[20]TAN D W,ZHANG K F,LIU Y,et al.Program Classification Using Gated Graph Attention Neural Network[J].Computer Engineering and Applications,2020,56(7):176-183.
[21]HUANG Z,LIU Q,ZHAI C,et al.Exploring multi-objectiveexercise recommendations in online education systems[C]//Proceedings of the 28th ACM International Conference on Information and Knowledge Management.Beijing:ACM,2019:1261-1270.
[22]ZHANG J,HAO B,CHEN B,et al.Hierarchical reinforcement learning for course recommendation in MOOCs[C]//Procee-dings of the AAAI Conference on Artificial Intelligence.Hawaii:AAAI,2019:435-442.
[23]LIN Y,LIN F,ZENG W,et al.Hierarchical reinforcement lear-ning with dynamic recurrent mechanism for course recommendation[J].Knowledge-Based Systems,2022,244:108546.
[24]LIU Q,TONG S,LIU C,et al.Exploiting cognitive structure for adaptive learning[C]//Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mi-ning.Alaska:ACM,2019:627-635.
[25]ALLAMANIS M,PENG H,SUTTON C.A convolutionalattention network for extreme summarization of source code[C]//International Conference on Machine Learning.New York:JMLR,2016:2091-2100.
[26]LU Y,LI G,ZHAO Z,et al.Learning to infer API mappings from API documents[C]//International Conference on Know-ledge Science,Engineering and Management.Melbourne:Sprin-ger,2017:237-248.
[27]MOU L,LI G,ZHANG L,et al.Convolutional neural networks over tree structures for programming language processing[C]//Thirtieth AAAI Conference on Artificial Intelligence.Arizona:AAAI,2016:1287-1293.
[28]LI Y,TARLOW D,BROCKSCHMIDT M,et al.Gated graphsequence neural networks[C]//4th International Conference on Learning Representations.Puerto Rico:OpenReview,2016.
[29]Saratov State University.CodeForces[EB/OL].(2022-05-11)[2022-05-11].https://codeforces.com.
[30]Clemson University.United States of America ComputingOlympiad[EB/OL].(2022-05-11) [2022-05-11].http://www.usaco.org.
[31]Peking University.Peking University OJ[EB/OL].(2022-05-11) [2022-05-11].http://poj.org.
[32]Zhejiang University.Zhejiang University OJ[EB/OL].(2022-05-11) [2022-05-11].http://acm.zju.edu.cn/onlinejudge.
[33]Hangzhou Dianzi University.Hangzhou Dianzi University OJ[EB/OL].(2022-05-11) [2022-05-11].http://acm.hdu.edu.cn.
[34]10XRECRUIT.Kattis Problem Archive[EB/OL].(2022-05-11) [2022-05-11].https://open.kattis.com.
[35]ZHENG W,DU Q,FAN Y,et al.A personalized programming exercise recommendation algorithm based on knowledge structure tree[J].Journal of Intelligent & Fuzzy Systems,2022,42(3):2169-2180.
[36]YERA TOLEDO R,CABALLERO MOTA Y,MARTINEZ L.A recommender system for programming online judges using fuzzy information modeling[J].Informatics,2018,5(2):1-17.
[37]YERA R,MARTINEZ L.A recommendation approach for programming online judges supported by data preprocessing techniques[J].Applied Intelligence,2017,47(2):277-290.
[38]TOLEDO R Y,MOTA Y C.An e-learning collaborative filtering approach to suggest problems to solve in programming online judges[J].International Journal of Distance Education Techno-logies(IJDET),2014,12(2):51-65.
[39]MAO H Y.A Study on Knowledge Recommender AlgorithmBased on Time Transition[D].Tianjin:Tianjin University,2017.
[40]NANDPAWAN D.KalkiCode[EB/OL].(2022-05-11) [2022-05-11].https://kalkicode.com/ai/java-to-cplusplus-converter-online.
[41]LUKAS M,TIMOTHY C.Python to C++ 14 transpiler[EB/OL].(2018-06-03) [2022-05-11].https://github.com/luk-asmartinelli/py14.
[42]MNIH V,KAVUKCUOGLU K,SILVER D,et al.Playing atari with deep reinforcement learning[J].arXiv:1312.5602,2013.
[43]HIDASI B,KARATZOGLOU A,BALTRUNAS L,et al.Session-based recommendations with recurrent neural networks[C]//Proceedings of 4th International Conference on Learning Representations.Puerto Rico:OpenReview,2016:1-10.
[44]KANG W C,MCAULEY J.Self-attentive sequential recommendation[C]//2018 IEEE International Conference on Data Mi-ning(ICDM).Singapore:IEEE Computer Society,2018:197-206.
[45]LI J,WANG Y,MCAULEY J.Time interval aware self-attention for sequential recommendation[C]//Proceedings of the 13th International Conference on Web Search and Data Mining.Texas:ACM,2020:322-330.
[46]HE Z,ZHAO H,LIN Z,et al.Locker:Locally Constrained Self-Attentive Sequential Recommendation[C]//Proceedings of the 30th ACM International Conference on Information & Know-ledge Management.Queensland:ACM,2021:3088-3092.
[47]TANG J,CHEN Y,ZHOU M,et al.A Survey of Studies on Deep Learning Applications in POI Recommendation[J].Computer Engineering,2022,48(1):12-23,42.
[48]YANG G,QIAO S,QU L.A survey of learning based database index advisor technology[J].Journal of Chongqing University of Technology(Natural Science),2022,36(6):189-199.

Metrics

Viewed

Full text

Abstract

Cited

Shared

Discussed

基于深度强化学习与程序分析的OJ习题推荐模型

OJ Exercise Recommendation Model Based on Deep Reinforcement Learning and Program Analysis

PDF (PC)

摘要/Abstract

引用本文

使用本文

参考文献

相关文章 0

Metrics

本文评价

推荐阅读 0