Computer Science (计算机科学), 2024, Vol. 51, Issue (6): 317-324. doi: 10.11896/jsjkx.230900076
史继筠1, 张驰1, 王禹桥1, 罗兆经2, 张美慧1
SHI Jiyun1, ZHANG Chi1, WANG Yuqiao1, LUO Zhaojing2, ZHANG Meihui1
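Abstract: Automatic medical report generation is an important application of text summarization. Because medical consultation data differ markedly from general-domain data, traditional summarization methods cannot fully understand and exploit the highly complex medical terminology in medical text, so the key knowledge contained in medical consultations is underused. In addition, most traditional summarization methods generate a summary directly; they cannot automatically select and filter key information and produce structured text that matches the structured nature of medical reports. To address these problems, a knowledge-assisted structured medical report generation method is proposed. The method combines entity-guided prior domain knowledge with a structure-guided task decoupling mechanism, making full use of both the key knowledge in medical consultation data and the structured characteristics of medical reports. Experiments on the IMCS21 dataset verify the effectiveness of the proposed method: the ROUGE scores of its generated summaries are 2% to 3% higher than those of comparable methods, and it generates more accurate medical reports.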
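The abstract reports its gains as ROUGE scores, which measure n-gram overlap between a generated summary and a reference summary. The sketch below gives a rough idea of how such a score is computed. It is not the authors' evaluation code: the character-level tokenization (a common choice for Chinese text), the function name `rouge_n`, and the example strings are all assumptions made for illustration.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Character-level ROUGE-N between a candidate and a reference string.

    Assumption: each non-whitespace character is one token. The paper's
    actual tokenization and ROUGE variant are not stated in the abstract.
    """
    def ngrams(text: str) -> Counter:
        chars = [c for c in text if not c.isspace()]
        return Counter(tuple(chars[i:i + n]) for i in range(len(chars) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)
    # Multiset intersection counts each n-gram at most as often as it
    # appears in both texts (clipped overlap).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical generated and reference report fragments.
    print(rouge_n("患者咳嗽三天", "患者咳嗽两天", n=1))
```

A production evaluation would normally rely on an established ROUGE implementation and also report ROUGE-2 and ROUGE-L; this version only shows the overlap arithmetic behind the metric.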
CLC number: