Computer Science ›› 2025, Vol. 52 ›› Issue (6): 330-335. doi: 10.11896/jsjkx.240400043

• Artificial Intelligence •

  • Corresponding author: LIU Jie(liujxxxy@126.com)
  • First author's e-mail: 2220101049@cnu.edu.cn

Study on Text Component Recognition of Narrative Texts Based on Prompt Learning

WANG Xiaoyi1, WANG Jiong2, LIU Jie1,3, ZHOU Jianshe1   

  1 China Language Intelligence Research Center,Capital Normal University,Beijing 100048,China
    2 School of Information Engineering,Capital Normal University,Beijing 100048,China
    3 School of Information Technology,North China University of Technology,Beijing 100144,China
  • Received:2024-04-07 Revised:2024-09-12 Online:2025-06-15 Published:2025-06-11
  • About author:WANG Xiaoyi,born in 1997,Ph.D.Her main research interests include natural language processing and automated essay scoring.
    LIU Jie,born in 1970,Ph.D,professor,Ph.D supervisor,is a member of CCF(No.10359S).His main research interests include natural language processing and knowledge engineering.
  • Supported by:
    National Key Research and Development Program of China(2020AAA0109703) and National Natural Science Foundation of China(62076167,U23B2029).


Abstract: Text structure analysis is one of the important techniques in automated essay scoring and an important research topic in the field of natural language processing. In recent years, research on essay structure analysis has been scarce and has focused mainly on argumentative essays; narrative texts remain under-studied, and both research methods and resources for narrative text structure are relatively limited. To address these issues, this paper constructs a dataset for identifying the components of narrative texts written by primary and secondary school students. An automatic corpus annotation model based on BERT-BiLSTM is used to improve annotation efficiency, and statistical analysis is conducted on the content distribution and the consistency of the corpus annotation. This paper then proposes a narrative text component recognition method based on prompt learning, which automatically constructs prefix prompt templates for recognizing text components and uses a hierarchical attention mechanism to learn richer text features, thereby improving the ability to recognize narrative text structure. Experiments on the self-built dataset show that the proposed method raises the accuracy of narrative text structure recognition to 85.80%, outperforming the pre-trained language models used for comparison.
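The method summarized above combines continuous prefix prompts with hierarchical (word-level and sentence-level) attention. The following is a minimal NumPy sketch of that idea, not the paper's implementation: the embeddings are random toy vectors, the prefix is an untrained stand-in for a learned prefix prompt, the `attention_pool` helper and the component label set are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, w):
    # H: (n, d) token or sentence vectors; w: (d,) learned attention query.
    scores = softmax(H @ w)   # (n,) attention weights
    return scores @ H         # (d,) weighted sum

d = 16
# Toy "essay": 3 sentences with 4, 5 and 3 token embeddings each.
sentences = [rng.normal(size=(n, d)) for n in (4, 5, 3)]

# Prefix prompt: a few continuous vectors prepended to every sentence,
# standing in for the learned prefix of prompt tuning.
prefix = rng.normal(size=(2, d))

w_word, w_sent = rng.normal(size=d), rng.normal(size=d)

# Word-level attention over [prefix; tokens] -> one vector per sentence.
sent_vecs = np.stack([attention_pool(np.vstack([prefix, S]), w_word)
                      for S in sentences])   # (3, d)

# Sentence-level attention -> one document vector.
doc_vec = attention_pool(sent_vecs, w_sent)  # (d,)

# Linear classifier over hypothetical narrative component labels.
labels = ["introduction", "development", "climax", "ending"]
W, b = rng.normal(size=(len(labels), d)), np.zeros(len(labels))
probs = softmax(W @ doc_vec + b)             # distribution over labels

print(probs.shape, float(probs.sum()))
```

In the paper's setting the prefix vectors and attention queries would be trained jointly with a BERT-style encoder; here they are fixed random tensors so the two-level pooling structure is easy to see.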

Key words: Dataset construction, Text structure, Automated essay scoring, Prompt learning

CLC number: 

  • TP391