Computer Science (计算机科学), 2024, Vol. 51, Issue (5): 258-266. doi: 10.11896/jsjkx.230300007
LIU Jun, RUAN Tong, ZHANG Huanhuan
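Abstract: The dialogue understanding module of a task-oriented dialogue system converts the user's natural-language input into a structured form. In diagnosis-oriented medical dialogue systems, however, existing methods have two problems: 1) they cannot support the information granularity that precision medicine requires, such as the severity of a given symptom; 2) they struggle to handle the diverse slot-value representations of the medical domain at the same time, such as extractive slots like "symptom", which may contain discontinuous and nested entities, and classification slots like "negation". This paper proposes a multi-level generative method for medical dialogue understanding based on prompt learning. To address problem 1), a multi-level slot structure replaces the single-level slot structure used in current dialogue understanding tasks so that finer-grained information can be represented; a generative method based on dialogue-style prompts then uses prompt tokens to simulate a doctor-patient dialogue and obtains the multi-level information over multiple rounds of interaction. To address problem 2), a constrained decoding strategy is applied during inference, which lets the model handle intent detection and both classification and extractive slot filling in a unified way and avoids complex modeling. In addition, to address the scarcity of annotated data in the medical domain, a two-stage training strategy is proposed that makes full use of large-scale unlabeled medical dialogue corpora to improve performance. A dataset for medical dialogue understanding with a multi-level slot structure is annotated and released; it contains 4,722 dialogues covering 17 intents and 74 slots. Experimental results show that the proposed method can effectively parse the various complex entities in medical dialogues, outperforming existing generative methods by 2.18%, and that in few-shot scenarios two-stage training improves model performance by up to 5.23%.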
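The abstract does not spell out how the constrained decoding works, so the following is only a minimal sketch of one common way to constrain generation, not the authors' implementation. It assumes a classification slot whose allowed values are known in advance, stores them in a prefix trie, and lets a greedy decoder pick only tokens the trie permits at each step. The names `build_trie`, `constrained_greedy_decode`, `toy_score`, the `<eos>` marker, and the severity values are all hypothetical; `toy_score` stands in for a real model's next-token score.

```python
from typing import Callable, Dict, List, Sequence

END = "<eos>"  # hypothetical end-of-sequence marker


def build_trie(candidates: Sequence[Sequence[str]]) -> Dict:
    """Build a prefix trie over the token sequences of the allowed outputs.

    `candidates` must be non-empty; each candidate is a list of tokens.
    """
    root: Dict = {}
    for tokens in candidates:
        node = root
        for tok in list(tokens) + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    score: Callable[[List[str], str], float],
    trie: Dict,
) -> List[str]:
    """Greedy decoding that only considers tokens the trie allows next."""
    output: List[str] = []
    node = trie
    while True:
        allowed = list(node.keys())
        # Tokens outside `allowed` are never scored, however likely they are.
        best = max(allowed, key=lambda tok: score(output, tok))
        if best == END:
            return output
        output.append(best)
        node = node[best]


def toy_score(prefix: List[str], tok: str) -> float:
    """Stand-in for a model score; 'unknown' is preferred but not allowed."""
    prefs = {"severe": 0.9, "mild": 0.4, "unknown": 2.0, END: 0.5}
    return prefs.get(tok, 0.0)


# Hypothetical allowed values for a "severity" sub-slot of a symptom.
severity_trie = build_trie([["mild"], ["moderate"], ["severe"]])
result = constrained_greedy_decode(toy_score, severity_trie)
print(result)
```

Under the same assumption, an extractive slot could be handled by building the allowed set from the tokens of the input utterance instead of a fixed value list, which is one way a single decoder could serve both slot types.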