Computer Science, 2025, Vol. 52, Issue (11): 30-39. doi: 10.11896/jsjkx.241000117
• Research and Application of Large Language Model Technology •
ZHANG Haoran, HAO Wenning, JIN Dawei, CHENG Kai, ZHAI Ying
Related Articles:
[1] LI Junwen, SONG Yuqiu, ZHANG Weiyan, RUAN Tong, LIU Jingping, ZHU Yan. Cross-lingual Information Retrieval Based on Aligned Query [J]. Computer Science, 2025, 52(8): 259-267.
[2] LI Maolin, LIN Jiajie, YANG Zhenguo. Confidence-guided Prompt Learning for Multimodal Aspect-level Sentiment Analysis [J]. Computer Science, 2025, 52(7): 241-247.
[3] CHEN Jinyin, XI Changkun, ZHENG Haibin, GAO Ming, ZHANG Tianxin. Survey of Security Research on Multimodal Large Language Models [J]. Computer Science, 2025, 52(7): 315-341.
[4] LI Bo, MO Xian. Application of Large Language Models in Recommendation System [J]. Computer Science, 2025, 52(6A): 240400097-7.
[5] BAI Yuntian, HAO Wenning, JIN Dawei. Study on Open-domain Question Answering Methods Based on Retrieval-augmented Generation [J]. Computer Science, 2025, 52(6A): 240800141-7.
[6] HU Caishun. Study on Named Entity Recognition Algorithms in Audit Domain Based on Large Language Models [J]. Computer Science, 2025, 52(6A): 240700190-4.
[7] GAO Hongkui, MA Ruixiang, BAO Qihao, XIA Shaojie, QU Chongxiao. Research on Hybrid Retrieval-augmented Dual-tower Model [J]. Computer Science, 2025, 52(6): 324-329.
[8] PAN Jie, WANG Juan, WANG Nan. Large Language Models and Rumors: A Survey on Generation and Detection [J]. Computer Science, 2025, 52(11): 1-12.
[9] FANG Quan, ZHANG Jinlong, WANG Bingqian, HU Jun. Research on Domain Knowledge Question Answering via Large Language Models with Compositional Context Prompting [J]. Computer Science, 2025, 52(11): 13-21.
[10] ZHOU Yuchen, LI Peng, HAN Keji. Instruct-Malware: Control Flow Graph Based Large Language Model Analysis of Malware [J]. Computer Science, 2025, 52(11): 40-48.
[11] CHEN Yuyan, JIA Jiyuan, CHANG Jingwen, ZUO Kaiwen, XIAO Yanghua. SPEAKSMART: Evaluating Empathetic Persuasive Responses by Large Language Models [J]. Computer Science, 2025, 52(10): 217-230.
[12] DUN Jingbo, LI Zhuo. Survey on Transmission Optimization Technologies for Federated Large Language Model Training [J]. Computer Science, 2025, 52(1): 42-55.
[13] LI Tingting, WANG Qi, WANG Jiakang, XU Yongjun. SWARM-LLM: An Unmanned Swarm Task Planning System Based on Large Language Models [J]. Computer Science, 2025, 52(1): 72-79.
[14] CHENG Zhiyu, CHEN Xinglin, WANG Jing, ZHOU Zhongyuan, ZHANG Zhizheng. Retrieval-augmented Generative Intelligence Question Answering Technology Based on Knowledge Graph [J]. Computer Science, 2025, 52(1): 87-93.
[15] LIU Yumeng, ZHAO Yijing, WANG Bicong, WANG Chao, ZHANG Baomin. Advances in SQL Intelligent Synthesis Technology [J]. Computer Science, 2024, 51(7): 40-48.