Lightweight Network Security Vulnerability Risk Awareness Method Based on RAG

doi:10.11896/jsjkx.250300034

Abstract

In recent years, vulnerability risk awareness for network security based on large models has become a research hotspot. However, existing methods still respond slowly and produce output of low semantic quality, which limits both the efficiency of intelligent operations and the perception of fine-grained information. To address these issues, this paper proposes a lightweight network security vulnerability risk awareness method based on Retrieval-Augmented Generation (RAG). First, a cross-domain knowledge base is constructed and combined with a cross-domain vectorization algorithm to vectorize network vulnerability information efficiently. Then, a multi-objective retrieval algorithm is designed to adaptively extract highly relevant, fine-grained vulnerability information from the knowledge base. Finally, metadata-based secondary enhancement and a local large model are combined to produce the intelligent vulnerability risk awareness response. Experimental results show that the proposed method significantly outperforms existing approaches in both semantic quality and operational efficiency, enabling rapid responses to network security risks and providing high-quality countermeasures that meet the demands of intelligent network security operations.

Key words: Vulnerability risk awareness, Retrieval-augmented generation, Penetration testing, Network security, Protection strategy

CLC Number: TP393

GU Xianjun, QIN Sihang, SHU Yifeng, MA Baoxin, LIU Feixue, LIU Ming. Lightweight Network Security Vulnerability Risk Awareness Method Based on RAG[J]. Computer Science, 2026, 53(6A): 250300034-10.
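
The abstract names three stages (vectorizing a knowledge base, retrieving relevant vulnerability entries, and enriching the prompt with metadata before calling a local model) but does not specify the algorithms. The following is only a generic retrieve-then-augment sketch under stated assumptions, not the paper's method: the bag-of-words `embed` function stands in for a real embedding model, a single similarity threshold stands in for the multi-objective retrieval algorithm, and `VulnRecord`, the `VULN-00x` entries, and all field names are hypothetical. The local large model call is omitted; the sketch stops at the assembled prompt.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class VulnRecord:
    """Hypothetical knowledge-base entry: free text plus metadata."""
    vuln_id: str
    text: str
    metadata: dict = field(default_factory=dict)


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, kb: list, k: int = 2, min_score: float = 0.2) -> list:
    """Return up to k records whose similarity to the query is at least min_score."""
    q = embed(query)
    scored = [(cosine(q, embed(r.text)), r) for r in kb]
    scored = [(s, r) for s, r in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored[:k]]


def build_prompt(query: str, hits: list) -> str:
    """Attach each hit's metadata to its text before handing the prompt to a model."""
    lines = ["Answer using only the context below.", "Context:"]
    for r in hits:
        meta = ", ".join(f"{key}={value}" for key, value in sorted(r.metadata.items()))
        lines.append(f"- [{r.vuln_id}] {r.text} ({meta})")
    lines.append(f"Question: {query}")
    return "\n".join(lines)


# Made-up entries for illustration only; not real vulnerability data.
kb = [
    VulnRecord("VULN-001", "sql injection in login form of web application",
               {"severity": "high", "fix": "use parameterized queries"}),
    VulnRecord("VULN-002", "buffer overflow in image parsing library",
               {"severity": "critical", "fix": "upgrade library"}),
    VulnRecord("VULN-003", "weak default password on router admin panel",
               {"severity": "medium", "fix": "change default credentials"}),
]

query = "how to fix sql injection in web login"
hits = retrieve(query, kb)
prompt = build_prompt(query, hits)
print(prompt)
```

In this sketch the threshold filters out weakly related entries, so the number of retrieved records varies with the query rather than always being `k`; a production system would replace `embed` with a trained embedding model and a vector database.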

