Computer Science ›› 2026, Vol. 53 ›› Issue (3): 33-40.doi: 10.11896/jsjkx.250600073
• Intelligent Information System Based on AGI Technology •
ZHOU Yueyuan, LU Guanze, XIANG Jiawei, ZHANG Jiawei, SHAO En, HE Xin
[1] WANG Zhibin, LI Shipeng, ZHOU Yuhang, LI Xue, ZHANG Zhonghui, JIANG Zhiwei, GU Rong, TIAN Chen, CHEN Guihai, ZHONG Sheng. Optimization of Service Level Objectives and System Level Metrics in Large Language Model Serving System[J]. Computer Science, 2026, 53(3): 23-32.
[2] CHEN Han, XU Zefeng, JIANG Jiu, FAN Fan, ZHANG Junjian, HE Chu, WANG Wenwei. Large Language Model and Deep Network Based Cognitive Assessment Automatic Diagnosis[J]. Computer Science, 2026, 53(3): 41-51.
[3] WU Xianjie, LI Tongliang, LI Zhoujun. Survey of Table Question Answering Research[J]. Computer Science, 2026, 53(3): 295-306.
[4] XU Cheng, LIU Yuxuan, WANG Xin, ZHANG Cheng, YAO Dengfeng, YUAN Jiazheng. Review of Speech Disorder Assessment Methods Driven by Large Language Models[J]. Computer Science, 2026, 53(3): 307-320.
[5] LI Wenli, FENG Xiaonian, QIAN Tieyun. Few-shot Continuous Toxicity Detection Based on Large Language Model Augmentation[J]. Computer Science, 2026, 53(3): 321-330.
[6] CHEN Yuyin, LI Guanfeng, QIN Jing, XIAO Yuhang. Survey on Complex Logical Query Methods in Knowledge Graphs[J]. Computer Science, 2026, 53(2): 273-288.
[7] GUO Luxiang, WANG Yueyu, LI Qianyue, LI Shasha, LIU Xiaodong, JI Bin, YU Jie. Comprehensive Survey of LLM-based Agent Operating Systems[J]. Computer Science, 2026, 53(1): 1-11.
[8] LIU Lilong, LIU Guoming, QI Baoyuan, DENG Xueshan, XUE Dizhan, QIAN Shengsheng. Efficient Inference Techniques of Large Models in Real-world Applications: A Comprehensive Survey[J]. Computer Science, 2026, 53(1): 12-28.
[9] SHAO Xinyi, ZHU Jingwei, ZHANG Liang. LLM-based Business Process Adaptation Method to Respond Long-tailed Changes[J]. Computer Science, 2026, 53(1): 29-38.
[10] LIU Leyuan, CHEN Gege, WU Wei, WANG Yong, ZHOU Fan. Survey of Data Classification and Grading Studies[J]. Computer Science, 2025, 52(9): 195-211.
[11] CAI Qihang, XU Bin, DONG Xiaodi. Knowledge Graph Completion Model Using Semantically Enhanced Prompts and Structural Information[J]. Computer Science, 2025, 52(9): 282-293.
[12] ZHONG Boyang, RUAN Tong, ZHANG Weiyan, LIU Jingping. Collaboration of Large and Small Language Models with Iterative Reflection Framework for Clinical Note Summarization[J]. Computer Science, 2025, 52(9): 294-302.
[13] WANG Limei, HAN Linrui, DU Zuwei, ZHENG Ri, SHI Jianzhong, LIU Yiqun. Privacy Policy Compliance Detection Method for Mobile Application Based on Large Language Model[J]. Computer Science, 2025, 52(8): 1-16.
[14] WANG Dongsheng. Multi-defendant Legal Judgment Prediction with Multi-turn LLM and Criminal Knowledge Graph[J]. Computer Science, 2025, 52(8): 308-316.
[15] LI Maolin, LIN Jiajie, YANG Zhenguo. Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis[J]. Computer Science, 2025, 52(7): 241-247.