Computer Science (计算机科学), 2025, Vol. 52, Issue (2): 253-260. doi: 10.11896/jsjkx.231200054
XU Siyao1, ZENG Jianjun2, ZHANG Weiyan2, YE Qi2, ZHU Yan1
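Abstract: Dependency parsing is an important natural language processing task whose goal is to identify the dependency relations between the words of a sentence. Existing work on dependency parsing of Chinese electronic medical records, however, has a shortcoming: current general-purpose parsers cannot parse accurately when the components that indicate grammatical structure are omitted or when modifiers appear in varied positions. To address this, a dependency parsing method for Chinese electronic medical records is proposed, based on collaborative enhancement between large and small language models. First, the linguistic characteristics of Chinese electronic medical records are analyzed, and component completion is proposed as a way to indicate the special grammatical structures in medical text. Then a general-purpose parser performs dependency parsing, and the resulting syntactic graph is automatically corrected using the prior grammatical knowledge of a large language model. Because the method focuses on narrowing the difference in feature distribution between medical text and general text, it is not constrained by the scarcity of annotated data in the medical domain. For evaluation, 444 test samples of Chinese electronic medical records were annotated and the method was validated on them. Experiments show that the method parses Chinese electronic medical records effectively: using only a small amount of annotated corpus, it reaches an LAS of 92.42 and a UAS of 94.60, and it achieves comparably strong results on Chinese electronic medical records from different departments.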
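The two reported metrics are the standard attachment scores for dependency parsing: UAS (unlabeled attachment score) is the percentage of tokens assigned the correct head, and LAS (labeled attachment score) is the percentage assigned both the correct head and the correct relation label. The sketch below shows how such scores are typically computed. It is not the authors' evaluation script: the function name `attachment_scores`, the `(head, label)` data layout, the toy labels, and the choice to count every token (including punctuation) are all assumptions made for illustration.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency relation label).
# Head index 0 denotes the artificial root. This layout is an assumption.
Arc = Tuple[int, str]


def attachment_scores(
    gold: List[List[Arc]], pred: List[List[Arc]]
) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens in all sentences."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    total = uas_hits = las_hits = 0
    for gold_sent, pred_sent in zip(gold, pred):
        if len(gold_sent) != len(pred_sent):
            raise ValueError("sentence lengths differ between gold and pred")
        for (gold_head, gold_rel), (pred_head, pred_rel) in zip(gold_sent, pred_sent):
            total += 1
            if gold_head == pred_head:
                uas_hits += 1  # head is correct
                if gold_rel == pred_rel:
                    las_hits += 1  # head and label are both correct

    if total == 0:
        raise ValueError("no tokens to score")
    return 100.0 * uas_hits / total, 100.0 * las_hits / total


# Toy four-token sentence with made-up labels (hypothetical data).
gold = [[(2, "SBV"), (0, "HED"), (4, "ATT"), (2, "VOB")]]
pred = [[(2, "SBV"), (0, "HED"), (2, "ATT"), (2, "COO")]]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}, LAS = {las:.2f}")  # UAS = 75.00, LAS = 50.00
```

In the toy sentence, three of the four predicted heads match the gold heads, and two of those three also carry the correct label. Since a token can only count toward LAS if its head is already correct, LAS can never exceed UAS, which is consistent with the reported 92.42 and 94.60.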