Computer Science, 2025, Vol. 52, Issue (6A): 240700182-10. DOI: 10.11896/jsjkx.240700182
• Large Language Model Technology and Its Application •
ZHANG Le 1, CHE Chao 1,2, LIANG Yan 3
Related Articles:
[1] TU Ji, XIAO Wendong, TU Wenji, LI Lijian. Application of Large Language Models in Medical Education: Current Situation, Challenges and Future [J]. Computer Science, 2025, 52(6A): 240400121-6.
[2] LI Bo, MO Xian. Application of Large Language Models in Recommendation System [J]. Computer Science, 2025, 52(6A): 240400097-7.
[3] ZOU Rui, YANG Jian, ZHANG Kai. Low-resource Vietnamese Speech Synthesis Based on Phoneme Large Language Model and Diffusion Model [J]. Computer Science, 2025, 52(6A): 240700138-6.
[4] ZHOU Lei, SHI Huaifeng, YANG Kai, WANG Rui, LIU Chaofan. Intelligent Prediction of Network Traffic Based on Large Language Model [J]. Computer Science, 2025, 52(6A): 241100058-7.
[5] BAI Yuntian, HAO Wenning, JIN Dawei. Study on Open-domain Question Answering Methods Based on Retrieval-augmented Generation [J]. Computer Science, 2025, 52(6A): 240800141-7.
[6] YIN Baosheng, ZONG Chen. Research on Semantic Fusion of Chinese Polysemous Words Based on Large Language Model [J]. Computer Science, 2025, 52(6A): 240400139-7.
[7] HU Caishun. Study on Named Entity Recognition Algorithms in Audit Domain Based on Large Language Models [J]. Computer Science, 2025, 52(6A): 240700190-4.
[8] ZHAO Zheyu, WANG Zhongqing, WANG Hongling. Commodity Attribute Classification Method Based on Dual Pre-training [J]. Computer Science, 2025, 52(6A): 240500127-8.
[9] GAO Hongkui, MA Ruixiang, BAO Qihao, XIA Shaojie, QU Chongxiao. Research on Hybrid Retrieval-augmented Dual-tower Model [J]. Computer Science, 2025, 52(6): 324-329.
[10] CHEN Xuhao, HU Sipeng, LIU Hongchao, LIU Boran, TANG Dan, ZHAO Di. Research on LLM Vector Dot Product Acceleration Based on RISC-V Matrix Instruction Set Extension [J]. Computer Science, 2025, 52(5): 83-90.
[11] CONG Yingnan, HAN Linrui, MA Jiayu, ZHU Jinqing. Research on Intelligent Judgment of Criminal Cases Based on Large Language Models [J]. Computer Science, 2025, 52(5): 248-259.
[12] ZHU Shucheng, HUO Hongying, WANG Weikang, LIU Ying, LIU Pengyuan. Automatic Optimization and Evaluation of Prompt Fairness Based on Large Language Model Itself [J]. Computer Science, 2025, 52(4): 240-248.
[13] CHENG Dawei, WU Jiaxuan, LI Jiangtong, DING Zhijun, JIANG Changjun. Study on Evaluation Framework of Large Language Model's Financial Scenario Capability [J]. Computer Science, 2025, 52(3): 239-247.
[14] HUANG Xueqin, ZHANG Sheng, ZHU Xianqiang, ZHANG Qianzhen, ZHU Cheng. Generative Task Network: New Paradigm for Autonomic Task Planning and Execution Based on LLM [J]. Computer Science, 2025, 52(3): 248-259.
[15] SONG Xingnuo, WANG Congyan, CHEN Mingkai. Survey on 3D Scene Reconstruction Techniques in Metaverse [J]. Computer Science, 2025, 52(3): 17-32.