Computer Science ›› 2024, Vol. 51 ›› Issue (10): 79-85. doi: 10.11896/jsjkx.240400087

• Intelligent Education Technology and Applications •

Key Information Retrieval System for MOOC Videos

ZHAO Bocheng1, BAO Lantian1, YANG Zhesen2, CAO Xuan1, MIAO Qiguang1,3,4

  1 School of Computer Science and Technology, Xidian University, Xi'an 710071, China
  2 Beijing Aerospace Automatic Control Institute, Beijing 100854, China
  3 Xi'an Key Laboratory of Big Data and Intelligent Vision, Xi'an 710071, China
  4 Key Laboratory of Collaborative Intelligence Systems, Ministry of Education, Xidian University, Xi'an 710071, China
  • Received: 2024-04-15 Revised: 2024-07-03 Online: 2024-10-15 Published: 2024-10-11
  • Corresponding author: MIAO Qiguang (qgmiao@xidian.edu.cn)
  • About author: ZHAO Bocheng (zhaobocheng@xidian.edu.cn), born in 1989, Ph.D., associate professor, is a member of CCF (No. J0076M). His main research interests include interactive learning, deep reinforcement learning, multi-objective game confrontation, etc.
    MIAO Qiguang, born in 1972, professor, doctoral supervisor, is a councillor of CCF (No. 09025D). His main research interests include computer vision, smart education and multi-modal large language models.
  • Supported by:
    National Natural Science Foundation of China (62272364), National Science and Technology Major Project (2022ZD0117103), Key Research and Development Program of Shaanxi Province (2024GH-ZDXM-47) and Teaching Reform Project of Shaanxi Higher Continuing Education (21XJZ004).

Abstract: With the rapid development of Internet technology, online education platforms such as massive open online courses (MOOCs) have attracted increasingly wide attention. As an innovative form of education, MOOCs break through the geographical boundaries of traditional education and enable high-quality educational resources to be shared worldwide. Learners can choose courses according to their own interests, arrange their study time and pace flexibly, and revisit material as needed. However, current MOOC platforms still struggle to locate the time at which a specific knowledge point appears in a lecture video, so learners must repeatedly drag the progress bar to find the relevant segment when studying key knowledge points. To address this problem, we propose a knowledge extraction algorithm for MOOC videos built on an attention-mechanism model with multiple bipartite matching. The framework consists of subtitle text recognition and generation, subtitle text segmentation, a knowledge point extraction model, and a knowledge point retrieval module. Experimental results show that, compared with existing knowledge point extraction models, the proposed model achieves state-of-the-art performance on some key metrics across multiple datasets, including Inspec, NUS, Krapivin, SemEval and KP20k, demonstrating the system's potential and value in practical applications.

Key words: Online education, Massive open online courses, Video retrieval, Key phrase generation, Knowledge location
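
The abstract names four stages but gives no implementation details, so the following is only a minimal sketch of how the final retrieval stage might map a queried knowledge point back to video timestamps. Everything in it is an assumption for illustration: the `SubtitleSegment` structure, the `build_index` and `locate` helpers, and especially `extract_keyphrases`, which uses plain substring matching against a fixed vocabulary as a stand-in for the paper's attention-based extraction model.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SubtitleSegment:
    """One subtitle segment with its time span in seconds (hypothetical structure)."""
    start: float
    end: float
    text: str


def extract_keyphrases(text: str, vocabulary: list[str]) -> list[str]:
    """Stand-in for the knowledge point extraction model.

    Assumption: case-insensitive substring matching against a fixed
    vocabulary replaces the learned model described in the abstract.
    """
    lowered = text.lower()
    return [phrase for phrase in vocabulary if phrase.lower() in lowered]


def build_index(segments: list[SubtitleSegment],
                vocabulary: list[str]) -> dict[str, list[tuple[float, float]]]:
    """Build an inverted index: keyphrase -> list of (start, end) time spans."""
    index: dict[str, list[tuple[float, float]]] = defaultdict(list)
    for seg in segments:
        for phrase in extract_keyphrases(seg.text, vocabulary):
            index[phrase.lower()].append((seg.start, seg.end))
    return dict(index)


def locate(index: dict[str, list[tuple[float, float]]],
           query: str) -> list[tuple[float, float]]:
    """Return the time spans in which the queried knowledge point occurs."""
    return index.get(query.strip().lower(), [])


# Illustrative data only; not taken from the paper.
segments = [
    SubtitleSegment(0.0, 42.5, "Today we introduce binary search trees."),
    SubtitleSegment(42.5, 90.0, "Insertion into a binary search tree takes O(h) time."),
    SubtitleSegment(90.0, 130.0, "Next, hash tables and collision handling."),
]
vocabulary = ["binary search tree", "hash table"]
index = build_index(segments, vocabulary)

print(locate(index, "Binary Search Tree"))
print(locate(index, "hash table"))
```

With an index of this kind, a player could jump straight to the returned start times instead of requiring the learner to drag the progress bar. A real system would replace the substring matcher with the trained extraction model and would likely need fuzzy or semantic matching for queries.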

CLC Number: TP18