Computer Science ›› 2024, Vol. 51 ›› Issue (10): 33-39. doi: 10.11896/jsjkx.240400008

• Intelligent Educational Technology and Applications •

Survey of Research on Automated Grading Algorithms for Subjective Questions

FENG Jun, LI Kaixuan, GAO Zhizezhang, HUANG Li, SUN Xia

  1. School of Information Science and Technology, Northwest University, Xi'an 710000, China
  • Received: 2024-04-01 Revised: 2024-07-12 Online: 2024-10-15 Published: 2024-10-11
  • Corresponding author: SUN Xia (raindy@nwu.edu.cn)
  • About author: FENG Jun (fengjun@nwu.edu.cn), born in 1972, Ph.D, professor, Ph.D supervisor, is an advanced member of CCF (No.10834S). Her main research interests include intelligent information processing.
    SUN Xia, born in 1977, Ph.D, professor, Ph.D supervisor, is a member of CCF (No.E200015067M). Her main research interests include intelligent education and natural language processing.
  • Supported by:
    Key Industry Chain Project of Department of Science and Technology of Shaanxi Province (2019ZDLGY03-10).

Abstract: In education, grading exam papers is an important way for teachers to learn how well students have mastered knowledge points. However, grading is time-consuming, and subjective questions in particular demand careful, engaged, and detailed review from examiners, which consumes considerable effort. To reduce teachers' workload and improve the efficiency of grading subjective questions, AI-based automatic grading techniques are important, and within automatic grading, subjective questions are the difficult part. With the development of machine learning and deep learning in natural language processing, automatic grading of subjective questions has made considerable progress. This paper reviews the literature by dividing subjective questions into conventional and open-ended types, summarizes the evaluation criteria and public datasets for automatic grading of subjective questions, and outlines the methods and technical approaches involved. Finally, it summarizes and discusses future research directions for automatic grading of subjective questions.

Key words: Automated marking of exam papers, Subjective question, Natural language processing, Deep learning, Intelligent education

CLC Number:

  • TP181