Computer Science, 2025, Vol. 52, Issue (1): 42-55. DOI: 10.11896/jsjkx.240500095
• Technology Research and Application of Large Language Model •
DUN Jingbo, LI Zhuo