Computer Science ›› 2025, Vol. 52 ›› Issue (6A): 240700182-10. DOI: 10.11896/jsjkx.240700182

• Large Language Model Technology and Its Application •

Hallucinations Proactive Relief in Diabetes Q&A LLM

ZHANG Le1, CHE Chao1,2, LIANG Yan3   

  1 Key Laboratory of Advanced Design and Intelligent Computing (Dalian University), Ministry of Education, Dalian, Liaoning 116622, China
    2 School of Software Engineering, Dalian University, Dalian, Liaoning 116622, China
    3 College of Mechanical and Electronic Engineering, Shanghai Jianqiao University, Shanghai 201306, China
  • Online: 2025-06-16  Published: 2025-06-12
  • About author: ZHANG Le, born in 2000, postgraduate, is a member of CCF (No. T9208G). His main research interests include large language models and natural language processing.
    LIANG Yan, born in 1982, master. Her main research interests include digital signal processing.
  • Supported by:
    National Natural Science Foundation of China (62076045), Liaoning Provincial Department of Education Service Local Program (LJKFZ20220290) and Dalian University Interdisciplinary Program (DLUXK-2023-YB-003).

Abstract: Diabetes treatment is a long-term, highly personalized endeavor that places a significant burden on patients' daily lives. Diabetes consultation through medical large language models (LLMs) can effectively relieve this healthcare burden. However, LLMs are prone to hallucinations, i.e., outputs that are incorrect, meaningless, or inconsistent with the input, when processing texts in specialized domains such as medicine, and the accuracy of existing hallucination relief techniques in the medical field remains unsatisfactory, which substantially limits the reliability of such models. To address this problem, this paper proposes a hallucination self-inspection and proactive relief method that combines instruction fine-tuning with retrieval-augmented generation: additional knowledge about the user's question is formed before the generation process, and after generation a similarity comparison determines whether a hallucination has occurred. Experiments are conducted on several medical datasets; on a large-scale diabetes multi-turn conversation dataset, the method achieves an F1 value of 0.79, a BLEU-4 value of 2.38, and a Rouge-L value of 9.26, outperforming existing hallucination relief techniques for LLMs in terms of accuracy and generation efficiency.
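The following Python sketch illustrates, under stated assumptions, the pre-/post-generation pipeline described above. The functions retrieve_knowledge and generate_answer are hypothetical stand-ins for the paper's retrieval-augmented, instruction-fine-tuned LLM components, and a plain bag-of-words cosine similarity stands in for the post-generation self-inspection; this is not the authors' implementation.

import math
from collections import Counter

def retrieve_knowledge(question):
    # Hypothetical retrieval step: fetch diabetes-related passages for the question
    # before generation (the "additional knowledge" described in the abstract).
    return ["Metformin is a common first-line drug for type 2 diabetes."]

def generate_answer(question, knowledge):
    # Hypothetical generation step: call the instruction-fine-tuned LLM with the
    # question plus the retrieved knowledge.
    return "Metformin is often used as a first-line treatment for type 2 diabetes."

def cosine_similarity(a, b):
    # Bag-of-words cosine similarity, used here only to illustrate the
    # post-generation similarity comparison.
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in set(va) & set(vb))
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(sum(v * v for v in vb.values()))
    return dot / norm if norm else 0.0

def answer_with_self_inspection(question, threshold=0.3):
    # Pre-generation: form extra knowledge; post-generation: flag a suspected
    # hallucination when the answer is dissimilar from every retrieved passage.
    knowledge = retrieve_knowledge(question)
    answer = generate_answer(question, knowledge)
    best = max(cosine_similarity(answer, k) for k in knowledge)
    return answer, best < threshold  # True means "suspected hallucination"

if __name__ == "__main__":
    ans, suspected = answer_with_self_inspection("What drug is first-line for type 2 diabetes?")
    print(ans, "| suspected hallucination:", suspected)

In practice, an embedding-based similarity measure and a threshold tuned on the diabetes data would replace the toy metric used here.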

Key words: Large language model, Retrieval augmented generation, Hallucination relief, Diabetes, Question and answer system

CLC Number: F416