Computer Science ›› 2025, Vol. 52 ›› Issue (6): 330-335. doi: 10.11896/jsjkx.240400043

• Artificial Intelligence •

  • Corresponding author: LIU Jie(liujxxxy@126.com)
  • First author's e-mail: 2220101049@cnu.edu.cn

Study on Text Component Recognition of Narrative Texts Based on Prompt Learning

WANG Xiaoyi1, WANG Jiong2, LIU Jie1,3, ZHOU Jianshe1   

  1 China Language Intelligence Research Center,Capital Normal University,Beijing 100048,China
    2 School of Information Engineering,Capital Normal University,Beijing 100048,China
    3 School of Information Technology,North China University of Technology,Beijing 100144,China
  • Received:2024-04-07 Revised:2024-09-12 Online:2025-06-15 Published:2025-06-11
  • About author:WANG Xiaoyi,born in 1997,Ph.D.Her main research interests include natural language processing and automated essay scoring.
    LIU Jie,born in 1970,Ph.D,professor,Ph.D supervisor,is a member of CCF(No.10359S).His main research interests include natural language processing and knowledge engineering.
  • Supported by:
    National Key Research and Development Program of China(2020AAA0109703) and National Natural Science Foundation of China(62076167,U23B2029).


Abstract: Text structure analysis is one of the important techniques in automated essay scoring and an important research topic in the field of natural language processing. In recent years, research on essay structure analysis has been scarce and has focused mainly on argumentative essays; narrative texts remain under-studied, and both research methods and resources for narrative text structure are relatively limited. To address these issues, this paper constructs a dataset for identifying the components of narrative texts written by primary and secondary school students. An automatic corpus annotation model based on BERT-BiLSTM is used to improve annotation efficiency, and statistical analysis is conducted on the content distribution and the consistency of the corpus annotation. This paper then proposes a narrative text component recognition method based on prompt learning, which automatically constructs prefix prompt templates for recognizing text components and uses a hierarchical attention mechanism to learn richer text features, thereby improving the ability to recognize narrative text structure. Experiments on the self-built dataset show that the proposed method raises the accuracy of narrative text structure recognition to 85.80%, outperforming the pre-trained language models used for comparison.
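The method summarized above combines continuous prefix prompts with hierarchical (word-level and sentence-level) attention. The following is a minimal NumPy sketch of that idea, not the paper's implementation: the embeddings are random toy vectors, the prefix is an untrained stand-in for a learned prefix prompt, the `attention_pool` helper and the component label set are hypothetical illustrations.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(H, w):
    # H: (n, d) token or sentence vectors; w: (d,) learned attention query.
    scores = softmax(H @ w)   # (n,) attention weights
    return scores @ H         # (d,) weighted sum

d = 16
# Toy "essay": 3 sentences with 4, 5 and 3 token embeddings each.
sentences = [rng.normal(size=(n, d)) for n in (4, 5, 3)]

# Prefix prompt: a few continuous vectors prepended to every sentence,
# standing in for the learned prefix of prompt tuning.
prefix = rng.normal(size=(2, d))

w_word, w_sent = rng.normal(size=d), rng.normal(size=d)

# Word-level attention over [prefix; tokens] -> one vector per sentence.
sent_vecs = np.stack([attention_pool(np.vstack([prefix, S]), w_word)
                      for S in sentences])   # (3, d)

# Sentence-level attention -> one document vector.
doc_vec = attention_pool(sent_vecs, w_sent)  # (d,)

# Linear classifier over hypothetical narrative component labels.
labels = ["introduction", "development", "climax", "ending"]
W, b = rng.normal(size=(len(labels), d)), np.zeros(len(labels))
probs = softmax(W @ doc_vec + b)             # distribution over labels

print(probs.shape, float(probs.sum()))
```

In the paper's setting the prefix vectors and attention queries would be trained jointly with a BERT-style encoder; here they are fixed random tensors so the two-level pooling structure is easy to see.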

Key words: Dataset construction, Text structure, Automated essay scoring, Prompt learning

CLC number: 

  • TP391