Computer Science ›› 2025, Vol. 52 ›› Issue (12): 428-434. doi: 10.11896/jsjkx.250500005
• Information Security •
张錋, 张道娟, 陈凯, 赵宇飞, 张英杰, 费克雄
ZHANG Peng, ZHANG Daojuan, CHEN Kai, ZHAO Yufei, ZHANG Yingjie, FEI Kexiong
Abstract: Although natural language processing (NLP) models perform well on a wide range of text classification tasks, they remain highly vulnerable to adversarial attacks. To address this problem, this paper proposes a novel retrieval-augmented classification method that effectively improves model robustness in adversarial settings. The method introduces a k-nearest-neighbor (KNN) retrieval mechanism that combines the model's own label prediction with the label distribution of retrieved similar samples, enabling the model to make more stable decisions under attack. A key innovation is the decoupling of the representation spaces used for classification and retrieval, which avoids the performance degradation and training instability caused by a shared representation. Experiments on multiple benchmark datasets and diverse adversarial attack scenarios show that the proposed method significantly improves robustness: it reduces the drop in model accuracy under adversarial attack by 30 to 40 percentage points, and it maintains relatively stable performance even under strong attacks. Extensive experiments further validate the effectiveness of the method, demonstrating that retrieval-augmented classification and decoupled representations are of great significance for building more reliable systems.
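The abstract's core mechanism — interpolating the classifier's own softmax prediction with a label distribution drawn from retrieved nearest neighbors, while keeping the retrieval representation separate from the classification head — can be sketched as follows. This is a minimal illustration, not the paper's actual formulation: the names (`retrieval_augmented_predict`, `lam`), the Euclidean distance metric, and the softmax-over-negative-distance neighbor weighting are all assumptions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def knn_label_distribution(query, memory, num_classes, k=3, temperature=1.0):
    """Label distribution from the k nearest stored (vector, label) pairs,
    weighted by a softmax over negative Euclidean distances."""
    scored = sorted((math.dist(query, vec), label) for vec, label in memory)
    top = scored[:k]
    weights = softmax([-d / temperature for d, _ in top])
    dist = [0.0] * num_classes
    for w, (_, label) in zip(weights, top):
        dist[label] += w
    return dist

def retrieval_augmented_predict(logits, retrieval_vec, memory,
                                num_classes, lam=0.5, k=3):
    """Interpolate the classifier's softmax with the KNN label distribution.
    `logits` come from the classification head; `retrieval_vec` comes from a
    separate retrieval encoder (the decoupled-representation design).
    lam=0 recovers the plain classifier; lam=1 is pure retrieval."""
    p_model = softmax(logits)
    p_knn = knn_label_distribution(retrieval_vec, memory, num_classes, k=k)
    return [lam * pk + (1 - lam) * pm for pm, pk in zip(p_model, p_knn)]
```

Under this sketch, an adversarial perturbation that flips the classifier's logits can be outvoted by clean neighbors retrieved from the memory, since the two terms are computed in different representation spaces.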