Computer Science ›› 2023, Vol. 50 ›› Issue (6): 175-182. doi: 10.11896/jsjkx.230200182

• Database & Big Data & Data Science •

Construction and Automatic Classification of CS1 Test Questions Dataset Based on Bloom's Taxonomy

DONG Rongsheng, WEI Chenyu, HU Jie, QIAO Yucheng, LI Fengying

  1. Guangxi Key Laboratory of Trusted Software, Guilin University of Electronic Technology, Guilin, Guangxi 541004, China
  • Received: 2023-02-24 Revised: 2023-04-17 Published: 2023-06-15 Released online: 2023-06-06
  • Corresponding author: LI Fengying (lfy@guet.edu.cn)
  • About author: DONG Rongsheng (ccrsdong@guet.edu.cn), born in 1965, professor, is a senior member of China Computer Federation. His main research interests include knowledge graph and machine learning. LI Fengying, born in 1974, Ph.D, professor, is a member of China Computer Federation. Her main research interests include knowledge graph, machine learning and symbolic computing.
  • Supported by:
    National Natural Science Foundation of China (62062029).

Abstract: Curriculum evaluation is a key part of teaching reform and covers teaching cases, test questions, and classroom teaching. To evaluate the test questions of computing courses, this paper introduces Bloom's taxonomy and takes the test questions of the "Introduction to Computer Science" course (CS1) at Princeton University and Guilin University of Electronic Technology as the corpus. A verb seed bank and a noun seed bank are given for the cognitive process dimension and the knowledge dimension of Bloom's taxonomy as applied to CS1, the position that each test question can reach in the two-dimensional matrix of Bloom's taxonomy is manually labeled, and a classification dataset for CS1 test questions is constructed. Using machine learning, an automatic classification model for CS1 test questions, TFERNIE-LR, is proposed. It consists of three parts: the CSTFPOS-IDF algorithm, the ERNIE model, and the LR classifier. The CSTFPOS-IDF algorithm builds on the TFPOS-IDF algorithm: it computes a weight factor for computing-course keywords so that the model pays more attention to those keywords, and it generates the word weights. In parallel, the entity-knowledge-enhanced pre-trained model ERNIE produces word-level vector embeddings of the test questions, and the word weights are combined with the word-level vectors to generate the test-question text vectors used for automatic classification. Finally, the LR classifier assigns each test question to a position in the two-dimensional matrix of Bloom's taxonomy. Experimental results show that the TFERNIE-LR model performs well: its weighted precision (weighted-P) reaches 83.3% on the cognitive process dimension and 96.1% on the knowledge dimension.
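As a rough illustration of how word weights and word-level vectors might be combined into a text vector, the following minimal Python sketch may help. It is not the authors' implementation: the abstract does not give the CSTFPOS-IDF formula, so plain smoothed TF-IDF multiplied by a hypothetical `KEYWORD_BOOST` factor is swapped in, and tiny hand-written vectors stand in for ERNIE embeddings. All function names, the toy corpus, and the boost value are assumptions.

```python
import math
from collections import Counter

# Hypothetical boost for computing-course keywords. The abstract does not
# state the real CSTFPOS-IDF weight factor, so 2.0 is an assumed value.
KEYWORD_BOOST = 2.0


def word_weights(doc, corpus, keywords, boost=KEYWORD_BOOST):
    """Smoothed TF-IDF weight per unique word in `doc`, boosted for keywords."""
    tf = Counter(doc)
    n_docs = len(corpus)
    weights = {}
    for word, count in tf.items():
        df = sum(1 for d in corpus if word in d)  # document frequency
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        weight = (count / len(doc)) * idf
        if word in keywords:
            weight *= boost  # emphasise course keywords
        weights[word] = weight
    return weights


def text_vector(weights, embeddings):
    """Weight-normalised sum of word vectors (toy stand-in for ERNIE output)."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    norm = 0.0
    for word, weight in weights.items():
        vec = embeddings.get(word)
        if vec is None:
            continue  # skip words without an embedding
        norm += weight
        for i, x in enumerate(vec):
            total[i] += weight * x
    return [x / norm for x in total] if norm else total


# Toy tokenised questions and two-dimensional "embeddings" (assumed data).
corpus = [
    ["explain", "recursion"],
    ["write", "a", "loop"],
    ["define", "recursion", "and", "loop"],
]
keywords = {"recursion", "loop"}
embeddings = {"explain": [1.0, 0.0], "recursion": [0.0, 1.0]}

weights = word_weights(corpus[0], corpus, keywords)
vec = text_vector(weights, embeddings)
print(vec)
```

In a full pipeline, vectors like `vec` would then be passed to a classifier, for example scikit-learn's `LogisticRegression`, with one model per Bloom dimension; that step is omitted here to keep the sketch dependency-free.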

Key words: Bloom's taxonomy, Curriculum evaluation, Classification dataset for CS1 test questions, Verb seed bank, Noun seed bank, Automatic classification
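The weighted-P figure quoted in the abstract is presumably the usual support-weighted average of per-class precision (the same definition as `average="weighted"` in scikit-learn's `precision_score`). The sketch below shows that computation under this assumption. The labels are invented toy data using three of Bloom's cognitive process categories, not results from the paper.

```python
from collections import Counter


def weighted_precision(y_true, y_pred):
    """Per-class precision averaged with weights equal to class support."""
    support = Counter(y_true)      # number of true items per class
    predicted = Counter(y_pred)    # number of predictions per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        # Precision is taken as 0 when a class is never predicted.
        if predicted[label]:
            precision = correct[label] / predicted[label]
        else:
            precision = 0.0
        score += (n_true / total) * precision
    return score


# Toy gold and predicted labels (assumed data, for illustration only).
y_true = ["Remember", "Remember", "Apply", "Apply", "Analyze"]
y_pred = ["Remember", "Apply", "Apply", "Apply", "Analyze"]
score = weighted_precision(y_true, y_pred)
print(round(score, 4))
```

Weighting by support means that frequent categories dominate the score, which matters when the labeled questions are unevenly spread over the cells of the taxonomy matrix.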

CLC Number:

  • TP391