Computer Science ›› 2024, Vol. 51 ›› Issue (5): 258-266.doi: 10.11896/jsjkx.230300007

• Artificial Intelligence •

Prompt Learning-based Generative Approach Towards Medical Dialogue Understanding

LIU Jun, RUAN Tong, ZHANG Huanhuan   

  1. School of Information Science and Engineering,East China University of Science and Technology,Shanghai 200237,China
  • Received:2023-03-01 Revised:2023-06-26 Online:2024-05-15 Published:2024-05-08
  • About author:LIU Jun, born in 1998, postgraduate. His main research interests include information extraction and language models.
    ZHANG Huanhuan, born in 1968, Ph.D, associate professor. Her main research interests include knowledge graphs and related topics.
  • Supported by:
    National Key Research and Development Program of China(2021YFC2701800,2021YFC2701801).

Abstract: The dialogue understanding module of a task-oriented dialogue system converts the user's natural language input into a structured form. In diagnosis-oriented medical dialogue systems, however, existing approaches face two problems: 1) the granularity of the extracted information cannot fully satisfy the needs of diagnosis, for example the severity of a symptom; 2) it is difficult to simultaneously handle the diverse representations of slot values in the medical domain, e.g., "symptom" may contain non-contiguous and nested entities, while "negation" takes categorical values. This paper proposes a generative medical dialogue understanding method based on prompt learning. To address problem 1), the single-level slot structure used in current dialogue understanding tasks is replaced with a multi-level slot structure that represents finer-grained information, and a generative approach based on dialogue-style prompts is proposed, in which prompt tokens simulate the dialogue between doctor and patient and multi-level information is obtained through multiple rounds of interaction. To address problem 2), a restricted decoding strategy is applied during inference, so that the model handles intent detection and the slot filling of both extractive and categorical slots in a unified manner, avoiding complex task-specific modeling. In addition, to address the lack of labeled data in the medical domain, a two-stage training strategy is proposed that leverages a large-scale unlabeled medical dialogue corpus to improve performance. A dataset containing 4 722 dialogues with 17 intents and 74 slot types is annotated and released for medical dialogue understanding with multi-level slot structures. Experiments show that the proposed approach effectively parses various complex entities in medical dialogues, outperforming existing generative methods by 2.18%. The two-stage training further improves performance by up to 5.23% in low-resource settings.
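
To make the two proposed mechanisms concrete, the following minimal Python sketch illustrates (a) one possible shape of a multi-level slot frame and (b) a small trie-based helper in the spirit of the restricted decoding strategy, which only allows continuations that stay inside a predefined set of candidate values. All field names, slot names, and candidate values are hypothetical illustrations and are not taken from the paper's released dataset or code.

from typing import Dict, List

# (1) Multi-level slot structure: a top-level slot ("symptom") carries
#     sub-slots such as severity and negation instead of a single flat value.
#     The schema below is hypothetical, not the published dataset format.
frame: Dict = {
    "intent": "Inform-Symptom",       # categorical intent label
    "slots": [
        {
            "slot": "symptom",
            "value": "headache",      # extractive value; may be non-contiguous or nested in the utterance
            "sub_slots": {
                "severity": "mild",   # finer-grained information needed for diagnosis
                "negation": "no",     # categorical value
            },
        }
    ],
}

# (2) Restricted decoding: at each generation step, only tokens that keep the
#     partial output inside a predefined candidate set (stored in a trie) are
#     allowed, so a categorical slot such as "negation" can only decode to a legal value.
class Trie:
    def __init__(self, sequences: List[List[str]]):
        self.root: Dict = {}
        for seq in sequences:
            node = self.root
            for tok in seq:
                node = node.setdefault(tok, {})

    def allowed_next(self, prefix: List[str]) -> List[str]:
        node = self.root
        for tok in prefix:
            node = node.get(tok)
            if node is None:
                return []             # prefix has left the candidate set
        return list(node.keys())

negation_trie = Trie([["yes"], ["no"], ["unknown"]])   # candidates kept single-token for brevity
print(negation_trie.allowed_next([]))       # ['yes', 'no', 'unknown']
print(negation_trie.allowed_next(["no"]))   # [] -> value complete, decoding of this slot stops

In a generative model, a function like allowed_next would typically be plugged into the decoder's per-step token filter; the actual integration described in the paper may differ.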

Key words: Prompt learning, Natural language understanding, Medical dialogue system, Generative model, Two-stage training

CLC Number: TP391
[1]LI Y,NI P,PENG J,et al.A joint model of clinical domain classification and slot filling based on RCNN and BiGRU-CRF[C]//2019 IEEE International Conference on Big Data(Big Data).IEEE,2019:6133-6135.
[2]LIN Z,LIU B,MADOTTO A,et al.Zero-Shot Dialogue State Tracking via Cross-Task Transfer[C]//Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing.2021:7890-7900.
[3]BUDZIANOWSKI P,WEN T H,TSENG B H,et al.MultiWOZ:A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling[C]//Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing.2018:5016-5026.
[4]ERIC M,GOEL R,PAUL S,et al.MultiWOZ 2.1:A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines[C]//Proceedings of the 12th Language Resources and Evaluation Conference.2020:422-428.
[5]ZANG X,RASTOGI A,SUNKARA S,et al.MultiWOZ 2.2:A Dialogue Dataset with Additional Annotation Corrections and State Tracking Baselines[C]//Proceedings of the 2nd Workshop on Natural Language Processing for Conversational AI.2020:109-117.
[6]RAFFEL C,SHAZEER N,ROBERTS A,et al.Exploring the limits of transfer learning with a unified text-to-text transformer[J].Journal of Machine Learning Research,2020,21(140):1-67.
[7]LIAO K,LIU Q,WEI Z,et al.Task-oriented dialogue system for automatic disease diagnosis via hierarchical reinforcement learning[J].arXiv:2004.14254,2020.
[8]WEI Z,LIU Q,PENG B,et al.Task-oriented dialogue system for automatic diagnosis[C]//Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics(Volume 2:Short Papers).2018:201-207.
[9]WANG Z,YANG Y,WEN R,et al.Lifelong learning based disease diagnosis on clinical notes[C]//Pacific-Asia Conference on Knowledge Discovery and Data Mining.Springer,Cham,2021:213-224.
[10]SHI X,HU H,CHE W,et al.Understanding medical conversations with scattered keyword attention and weak supervision from responses[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2020:8838-8845.
[11]CHEN L,LV B,WANG C,et al.Schema-guided multi-domain dialogue state tracking with graph attention neural networks[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2020:7521-7528.
[12]DU X,HE L,LI Q,et al.QA-Driven Zero-shot Slot Filling with Weak Supervision Pretraining[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(Volume 2:Short Papers).2021:654-664.
[13]LU L,KONG F.Dialogue-based Entity Relation Extraction with Knowledge[J].Computer Science,2022,49(5):200-205.
[14]WU C S,MADOTTO A,HOSSEINI-ASL E,et al.Transferable Multi-Domain State Generator for Task-Oriented Dialogue Systems[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:808-819.
[15]KIM S,YANG S,KIM G,et al.Efficient Dialogue State Tracking by Selectively Overwriting Memory[C]//Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.2020:567-582.
[16]RASTOGI A,ZANG X,SUNKARA S,et al.Towards scalable multi-domain conversational agents:The schema-guided dialogue dataset[C]//Proceedings of the AAAI Conference on Artificial Intelligence.2020:8689-8696.
[17]FENG Y,WANG Y,LI H.A Sequence-to-Sequence Approach to Dialogue State Tracking[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(Volume 1:Long Papers).2021:1714-1725.
[18]LAI S,XU L,LIU K,et al.Recurrent convolutional neural networks for text classification[C]//Twenty-Ninth AAAI Conference on Artificial Intelligence.2015.
[19]LIU W,TANG J,QIN J,et al.MedDG:A large-scale medical consultation dataset for building medical dialogue system[J].arXiv:2010.07497,2020.
[20]DONG L,YANG N,WANG W,et al.Unified language model pre-training for natural language understanding and generation[C]//Proceedings of the 33rd International Conference on Neural Information Processing Systems.2019:13063-13075.
[21]DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of deep bidirectional transformers for language understanding[J].arXiv:1810.04805,2018.
[22]GAO S,AGARWAL S,CHUNG T,et al.From machine reading comprehension to dialogue state tracking:Bridging the gap[J].arXiv:2004.05827,2020.
[23]YANG P,HUANG H Y,MAO X L.Comprehensive Study:How the Context Information of Different Granularity Affects Dialogue State Tracking?[C]//Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing(Volume 1:Long Papers).2021:2481-2491.
[24]CUI Y,CHE W,LIU T,et al.Pre-training with whole word masking for Chinese BERT[J].IEEE/ACM Transactions on Audio,Speech,and Language Processing,2021,29:3504-3514.
[25]DAI Z,WANG X,NI P,et al.Named entity recognition using BERT BiLSTM CRF for Chinese electronic health records[C]//2019 12th International Congress on Image and Signal Processing,Biomedical Engineering and Informatics(CISP-BMEI).IEEE,2019:1-5.
[26]TAN Z,SHEN Y,ZHANG S,et al.A sequence-to-set network for nested named entity recognition[J].arXiv:2105.08901,2021.
[27]SU J.A Hierarchical Relation Extraction Model with Pointer-Tagging Hybrid Structure[EB/OL].https://github.com/bojone/kg-2019.