Computer Science ›› 2026, Vol. 53 ›› Issue (6A): 250300034-10. doi: 10.11896/jsjkx.250300034

• Information Security •

Lightweight Network Security Vulnerability Risk Awareness Method Based on RAG

GU Xianjun1, QIN Sihang1, SHU Yifeng2, MA Baoxin2, LIU Feixue2, LIU Ming2   

  1 Wuhan Power Supply Company of State Grid Hubei Electric Power Co., Ltd., Wuhan 430010, China
  2 JinYinHu Laboratory, Wuhan 430040, China
  • Online:2026-06-16 Published:2026-06-12
  • About author: GU Xianjun, born in 1981, master, senior engineer. His main research interests include cybersecurity, digital system construction, and artificial intelligence.
    LIU Ming, born in 1999, master, engineer. His main research interests include network and information security, large language models, and AI security.
  • Supported by:
    Big Data Platform Security Supervision and Governance Technology Project (2022YFB3103400), Attack Detection and Dynamic Security Failure Analysis Instrument for Petrochemical Units (62127808), and Research on Collaborative Malicious Behavior Recognition Based on General Configuration Data in Cyberspace (62172176).

Abstract: In recent years, network security vulnerability risk awareness based on large models has become a research hotspot. However, existing methods still respond slowly and produce output of low semantic quality, which limits intelligent operational efficiency and fine-grained information perception. To address these issues, this paper proposes a lightweight network security vulnerability risk awareness method based on Retrieval-Augmented Generation (RAG). First, a cross-domain knowledge base is constructed and combined with a cross-domain vectorization algorithm to vectorize network vulnerability information efficiently. Then, a multi-objective retrieval algorithm is designed to adaptively extract highly relevant, fine-grained vulnerability information from the knowledge base. Finally, metadata-based secondary enhancement is combined with a local large model to generate the intelligent vulnerability risk awareness response. Experimental results show that the proposed method significantly outperforms existing approaches in both semantic quality and operational efficiency, enabling rapid responses to network security risks and providing high-quality countermeasures that meet the demands of intelligent network security operations.
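
The three stages named in the abstract (vectorization, multi-objective retrieval, metadata-based enhancement before generation) can be illustrated with a minimal Python sketch. It is a generic RAG retrieval toy and not the authors' implementation: the abstract does not specify the algorithms, so the bag-of-words `embed` function, the weighted sum of two objectives in `retrieve`, the weight `alpha`, and every knowledge-base entry and metadata field below are assumptions made for illustration. The final call to a local large model is omitted; the sketch stops at the enhanced prompt.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base: texts and metadata are invented for illustration.
KNOWLEDGE_BASE = [
    {"text": "SQL injection in login form allows authentication bypass",
     "meta": {"severity": "high", "fix": "use parameterized queries"}},
    {"text": "Outdated OpenSSH version exposes remote service to known exploits",
     "meta": {"severity": "medium", "fix": "upgrade the SSH server"}},
    {"text": "Cross-site scripting in search page reflects unsanitized input",
     "meta": {"severity": "medium", "fix": "escape output and validate input"}},
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def embed(text):
    """Toy bag-of-words vector (assumption); a real system would use an embedding model."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_overlap(query, text):
    """Fraction of distinct query tokens that also occur in the text."""
    q, d = set(tokenize(query)), set(tokenize(text))
    return len(q & d) / len(q) if q else 0.0


def retrieve(query, top_k=2, alpha=0.7):
    """Rank entries by a weighted sum of two objectives (assumed scoring rule)."""
    qv = embed(query)
    scored = [
        (alpha * cosine(qv, embed(e["text"]))
         + (1 - alpha) * keyword_overlap(query, e["text"]), e)
        for e in KNOWLEDGE_BASE
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def build_prompt(query, entries):
    """Append each entry's metadata to the retrieved text before generation."""
    context = "\n".join(
        f"- {e['text']} (severity: {e['meta']['severity']}; fix: {e['meta']['fix']})"
        for e in entries
    )
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


if __name__ == "__main__":
    question = "SQL injection in login"
    print(build_prompt(question, retrieve(question)))
```

In this sketch the prompt returned by `build_prompt` would be passed to a locally hosted model; keeping severity and remediation in metadata rather than in the embedded text means the retrieval vectors stay short while the generator still sees the extra fields.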

Key words: Vulnerability risk awareness, Retrieval-augmented generation, Penetration testing, Network security, Protection strategy

CLC Number: 

  • TP393