Survey on Adversarial Sample of Deep Learning Towards Natural Language Processing

doi:10.11896/jsjkx.200500078

Abstract: Deep learning models have been shown to be vulnerable to adversarial examples, but current research on adversarial examples focuses mainly on computer vision and neglects the security of natural language processing (NLP) models. Because NLP faces the same adversarial risk, this paper clarifies the concepts related to adversarial examples as a basis for further research. First, it analyzes the causes of vulnerability, including the complex structure of deep-learning-based NLP models, a training process that is difficult to inspect, and naive underlying principles. It then elaborates on the characteristics, classification, and evaluation metrics of textual adversarial examples, and introduces the typical tasks and classical datasets used in adversarial example research in NLP. Second, it organizes the mainstream techniques for generating textual adversarial examples by perturbation level: character-level, word-level, sentence-level, and multi-level. It also summarizes defense methods, which relate to data, models, and inference, and compares their advantages and disadvantages. Finally, it discusses the pain points on both the attack and defense sides of current NLP adversarial example research and offers an outlook.

Key words: Adversarial examples, AI security, Deep learning, Natural language processing, Robustness

CLC Number: TP301

TONG Xin, WANG Bin-jun, WANG Run-zheng, PAN Xiao-qin. Survey on Adversarial Sample of Deep Learning Towards Natural Language Processing [J]. Computer Science, 2021, 48(1): 258-267.


