Computer Science ›› 2021, Vol. 48 ›› Issue (1): 258-267.doi: 10.11896/jsjkx.200500078

• Information Security • Previous Articles     Next Articles

Survey on Adversarial Sample of Deep Learning Towards Natural Language Processing

TONG Xin, WANG Bin-jun, WANG Run-zheng, PAN Xiao-qin   

  1. School of Information and Cyber Security,People's Public Security University of China,Beijing 100038,China
  • Received:2020-05-18 Revised:2020-08-25 Online:2021-01-15 Published:2021-01-15
  • About author:TONG Xin,born in 1995,postgraduate,is a member of China Computer Federation.His main research interests include adversarial examples and natural language processing.
    WANG Bin-jun,born in 1962,professor,Ph.D supervisor,is a member of China Computer Federation.His main research interests include natural language processing and information security.
  • Supported by:
    2020 CCF-Nsfocus “Kunpeng” Research Fund(CCF-NSFOCUS 2020011),Science and Technology Strengthening Police Basic Program of Ministry of Public Security(2018GABJC03),Key Program of the National Social Science Foundation of China(20AZD114),Top Talent Training Special Funding Graduate Research and Innovation Project of People's Public Security University of China(2020ssky005),and Scientific Research and Technological Innovation on Public Security Behavior of People's Public Security University of China.

Abstract: Deep learning models have been proven to be vulnerable and easy to be attacked by adversarial examples,but the current researches on adversarial samples mainly focus on the field of computer vision and ignore the security of natural language processing models.In response to the same risk of adversarial samples faced in the field of natural language processing(NLP),this paper clarifies the concepts related to adversarial samples as the basis of further research.Firstly,it analyzes causes of vulnerabilities,including complex structure of the natural language processing model based on deep learning,the training process that is difficult to detect and the naive basic principles,further elaborates the characteristics,classification and evaluation metrics of text adversarial examples,and introduces the typical tasks and classical datasets involved in the adversarial examples related to researches in the field of natural language processing.Secondly,according to different perturbation levels,it sorts out various text adversarial examples generation technology of mainstream char-level,word-level,sentence-level and multi-level.What's more,it summarizes defense methods,which are relevant to data,models and inference,and compares their advantages and disadvantages.Finally,the pain points of both attack and defense sides in thefield of current NLP adversarial samples are further discussed and anticipated.

Key words: Natural language processing, Deep learning, AI security, Adversarial examples, Robustness

CLC Number: 

  • TP301
[1] HOCHREITER S,SCHMIDHUBER J.Long Short-Term Memory[J].Neural computation,1997,9(8):1735-1780.
[2] MIKOLOV T,CHEN K,CORRADO G,et al.Efficient Estimation of Word Representations in Vector Space[J].arXiv:1301.3781,2013.
[3] PENNINGTON J,SOCHER R,MANNING C.Glove:GlobalVectors for Word Representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Proces-sing (EMNLP).2014:1532-1543.
[4] DEVLIN J,CHANG M W,LEE K,et al.BERT:Pre-training of Deep Bidirectional Transformers for Language Understanding[J].arXiv:1810.04805,2018.
[5] YANG Z,DAI Z,YANG Y,et al.XLNet:Generalized Autoregressive Pretraining for Language Understanding[C]//Advances in Neural Information Processing Systems.2019:5754-5764.
[6] WANG W,WANG L,TANG B,et al.Towards a Robust Deep Neural Network in Text Domain A Survey[J].arXiv:1902.07285,2019.
[7] SZEGEDY C,ZAREMBA W,SUTSKEVER I,et al.Intriguing properties of neural networks[J].arXiv:1312.6199,2013.
[8] PAN W B,WANG X Y.Survey on Generating Adversarial Examples[J].Journal of Software,2020,31(1):67-81.
[9] ASHISH V.Attention is all you need[C]//Advances in Neural Information Processing Systems.2017:5998-6008.
[10] NIVEN T,KAO H Y.Probing Neural Network Comprehension of Natural Language Arguments[J].arXiv:1907.07355,2019.
[11] KUSNER M,SUN Y,KOLKIN N,et al.From word embeddings to document distances[C]//International Conference on Machine Learning.2015:957-966.
[12] HUANG G,GUO C,KUSNER M J,et al.Supervised WordMover's Distance[C]//Advances in Neural Information Processing Systems.2016:4862-4870.
[13] WU L.Word mover's embedding:From word2vec to document embedding[J].arXiv:1811.01713,2018.
[14] DONG Y,FU Q A,YANG X,et al.Benchmarking Adversarial Robustness[J].arXiv:1912.11852,2019.
[15] MICHEL P,LI X,NEUBIG G,et al.On Evaluation of Adversarial Perturbations for Sequence-to-Sequence Models[J].ar-Xiv:1903.06620,2019.
[16] GIANNA M D C,ANTONIO G,FRANCESCO R,et al.Ran-king a stream of news[C]//Proceedings of the 14th Internatio-nal Conference on World Wide Web.2005:97-106.
[17] RICHARD S.Recursive deep models for semantic compositio-nality over a sentiment Treebank[C]//Proceedings of the 2013 Conference on Empirical Methods in Natural Language Proces-sing.2013:1631-1642.
[18] CETTOLO M,GIRARDI C,FEDERICO M.Wit3:Web inventory of transcribed and translated talks[C]//Conference of European Association for Machine Translation.2012:261-268.
[19] RAJPURKAR P,ZHANG J,LOPYREV K,et al.SQuAD:100 000+ Questions for Machine Comprehension of Text[J].arXiv:1606.05250,2016.
[20] RAJPURKAR P,JIA R,LIANG P.Know What You Don'tKnow:Unanswerable Questions for SQuAD[J].arXiv:1806.03822,2018.
[21] GOYAL Y,KHOT T,SUMMERS-STAY D,et al.Making the V in VQA matter:Elevating the role of image understanding in Visual Question Answering[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition(CVPR).2017:6904-6913.
[22] BOWMAN S R,ANGELI G,POTTS C,et al.A large annotated corpus for learning natural language inference[J].arXiv:1508.05326,2015.
[23] WILLIAMS A,NANGIA N,BOWMAN S R.A broad-coverage challenge corpus for sentence understanding through inference[J].arXiv:1704.05426,2017.
[24] ERIK F,SANG T K,DE MEULDER F D.Introduction to the CoNLL-2003 shared task:Language-independent named entity recognition[J].arXiv:0306050,2003.
[25] BELINKOV Y,BISK Y.Synthetic and natural noise both break neural machine translation[J].arXiv:1711.02173,2017.
[26] GAO J,LANCHANTIN J,SOFFA M L,et al.Black-box generation of adversarial text sequences to evade deep learning classifiers[C]//2018 IEEE Security and Privacy Workshops (SPW).IEEE,2018:50-56.
[27] WANG W Q,WANG R.Adversarial Examples Generation Approach for Tendency Classification on Chinese Texts[J].Journal of Software,2019,30(8):2415-2427.
[28] EBRAHIMI J,LOWD D,DOU D.On adversarial examples for character-level neural machine translation[J].arXiv:1806.09030,2018.
[29] EGER S,?AHIN G G,RüCKLè A,et al.Text processing like humans do:Visually attacking and shielding NLP systems[J].arXiv:1903.11508,2019.
[30] PAPERNOT N,MCDANIEL P,SWAMI A,et al.Crafting adversarial input sequences for recurrent neural networks[C]//MILCOM 2016-2016 IEEE Military Communications Confe-rence.IEEE,2016:49-54.
[31] GOODFELLOW I J,SHLENS J,SZEGEDY C.Explaining and harnessing adversarial examples[J].arXiv:1412.6572,2014.
[32] JIN D,JIN Z,ZHOU J T,et al.Is BERT Really Robust?A Strong Baseline for Natural Language Attack on Text Classification and Entailment[J].AAAI2020,arXiv:1907.11932,2019.
[33] SAMANTA S,MEHTA S.Towards crafting text adversarial samples[J].arXiv:1707.02812,2017.
[34] SATO M,SUZUKI J,SHINDO H,et al.Interpretable adversarial perturbation in input embedding space for text[J].arXiv:1805.02917,2018.
[35] ZHANG H,ZHOU H,MIAO N,et al.Generating Fluent Adversarial Examples for Natural Languages[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:5564-5569.
[36] ALZANTOT M,SHARMA Y,ELGOHARY A,et al.Generating natural language adversarial examples[J].arXiv:1804.07998,2018.
[37] ZANG Y,YANG C,QI F,et al.Textual Adversarial Attack as Combinatorial Optimization[J].arXiv:1910.12196,2019.
[38] REN S,DENG Y,HE K,et al.Generating natural language adversarial examples through probability weighted word saliency[C]//Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.2019:1085-1097.
[39] JIA R,LIANG P.Adversarial examples for evaluating reading comprehension systems[J].arXiv:1707.07328,2017.
[40] MINERVINI P,RIEDEL S.Adversarially regularising neural nli models to integrate logical background knowledge[J].arXiv:1808.08609,2018.
[41] CHENG Y,JIANG L,MACHEREY W.Robust neural machine translation with doubly adversarial inputs[J].arXiv:1906.02443,2019.
[42] IYYER M,WIETING J,GIMPEL K,et al.Adversarial example generation with syntactically controlled paraphrase networks[J].arXiv:1804.06059,2018.
[43] ZHAO Z,DUA D,SINGH S.Generating natural adversarial examples[J].arXiv:1710.11342,2017.
[44] ARJOVSKY M,CHINTALA S,BOTTOU L.Wasserstein gan[J].arXiv:1701.07875,2017.
[45] WALLACE E,RODRIGUEZ P,FENG S,et al.Trick me if you can:Human-in-the-loop generation of adversarial examples for question answering[J].Transactions of the Association for Computational Linguistics,2019,7(2019):387-401.
[46] RIBEIRO M T,SINGH S,GUESTRIN C.Semantically equivalent adversarial rules for debugging nlp models[C]//Procee-dings of the 56th Annual Meeting of the Association for Computational Linguistics.2018:856-865.
[47] LI J,JI S,DU T,et al.Textbugger:Generating adversarial text against real-world applications[J].arXiv:1812.05271,2018.
[48] EBRAHIMI J,RAO A,LOWD D,et al.Hotflip:White-box adversarial examples for text classification[J].arXiv:1712.06751,2017.
[49] VIJAYARAGHAVAN P,ROY D.Generating Black-Box Ad-versarial Examples for Text Classifiers Using a Deep Reinforced Model[J].arXiv:1909.07873,2019.
[50] LIANG B,LI H,SU M,et al.Deep text classification can be fooled[J].arXiv:1704.08006,2017.
[51] GARDNER M,ARTZI Y,BASMOVA V,et al.Evaluating nlp models via contrast sets[J].arXiv:2004.02709,2020.
[52] PRUTHI D,DHINGRA B,LIPTON Z C.Combating adversarial misspellings with robust word recognition[J].arXiv:1905.11268,2019.
[53] ZHOU Y,JIANG J Y,CHANG K W,et al.Learning to discriminate perturbations for blocking adversarial attacks in text classification[J].arXiv:1909.03084,2019.
[54] TANAY T,GRIFFIN L D.A New Angle on L2 Regularization[J].arXiv:1806.11186,2018.
[55] PAPERNOT N,MCDANIEL P,WU X,et al.Distillation as a defense to adversarial perturbations against deep neural networks[C]//2016 IEEE Symposium on Security and Privacy(SP).IEEE,2016:582-597.
[56] MIYATO T,DAI A M,GOODFELLOW I.Adversarial training methods for semi-supervised text classification[J].arXiv:1605.07725,2016.
[57] MADRY A,MAKELOV A,SCHMIDT L,et al.Towards deep learning models resistant to adversarial attacks[J].arXiv:1706.06083,2017.
[58] LI L,QIU X.TextAT:Adversarial Training for Natural Language Understanding with Token-Level Perturbation[J].arXiv:2004.14543,2020.
[59] DINAN E,HUMEAU S,CHINTAGUNTA B,et al.Build itbreak it fix it for dialogue safety:Robustness from adversarial human attack[J].arXiv:1908.06083,2019.
[60] HE W,WEI J,CHEN X,et al.Adversarial example defense:Ensembles of weak defenses are not strong[C]//11th USENIX Workshop on Offensive Technologies (WOOT 17).2017.
[61] KO C Y,LYU Z,WENG T W,et al.POPQORN:Quantifying robustness of recurrent neural networks[J].arXiv:1905.07387,2019.
[62] SHI Z,ZHANG H,CHANG K W,et al.Robustness verification for transformers[J].arXiv:2002.06622,2020.
[63] GOODMAN D,XIN H,YANG W,et al.Advbox:a toolbox to generate adversarial examples that fool neural networks[J].arXiv:2001.05574,2020.
[64] ATHALYE A,CARLINI N,WAGNER D.Obfuscated gradi-ents give a false sense of security:Circumventing defenses to adversarial examples[J].arXiv:1802.00420,2018.
[65] WALLACE E,FENG S,KANDPAL N,et al.Universal adversarial triggers for nlp[J].arXiv:1908.07125,2019.
[66] LIANG R G,LYU P Z,et al.A Survey of Audiovisual Deepfake Detection Techniques[J].Journal of Cyber Security,2020,5(2):1-17.
[67] YU L,ZHANG W,et al.Seqgan:Sequence generative adversarial nets with policy gradient[C]//Thirty-First AAAI Conference on Artificial Intelligence.2017.
[1] WANG Rui-ping, JIA Zhen, LIU Chang, CHEN Ze-wei, LI Tian-rui. Deep Interest Factorization Machine Network Based on DeepFM [J]. Computer Science, 2021, 48(1): 226-232.
[2] YU Wen-jia, DING Shi-fei. Conditional Generative Adversarial Network Based on Self-attention Mechanism [J]. Computer Science, 2021, 48(1): 241-246.
[3] LU Long-long, CHEN Tong, PAN Min-xue, ZHANG Tian. CodeSearcher:Code Query Using Functional Descriptions in Natural Languages [J]. Computer Science, 2020, 47(9): 1-9.
[4] DING Yu, WEI Hao, PAN Zhi-song, LIU Xin. Survey of Network Representation Learning [J]. Computer Science, 2020, 47(9): 52-59.
[5] TIAN Ye, SHOU Li-dan, CHEN Ke, LUO Xin-yuan, CHEN Gang. Natural Language Interface for Databases with Content-based Table Column Embeddings [J]. Computer Science, 2020, 47(9): 60-66.
[6] HE Xin, XU Juan, JIN Ying-ying. Action-related Network:Towards Modeling Complete Changeable Action [J]. Computer Science, 2020, 47(9): 123-128.
[7] YE Ya-nan, CHI Jing, YU Zhi-ping, ZHAN Yu-liand ZHANG Cai-ming. Expression Animation Synthesis Based on Improved CycleGan Model and Region Segmentation [J]. Computer Science, 2020, 47(9): 142-149.
[8] DENG Liang, XU Geng-lin, LI Meng-jie, CHEN Zhang-jin. Fast Face Recognition Based on Deep Learning and Multiple Hash Similarity Weighting [J]. Computer Science, 2020, 47(9): 163-168.
[9] BAO Yu-xuan, LU Tian-liang, DU Yan-hui. Overview of Deepfake Video Detection Technology [J]. Computer Science, 2020, 47(9): 283-292.
[10] YUAN Ye, HE Xiao-ge, ZHU Ding-kun, WANG Fu-lee, XIE Hao-ran, WANG Jun, WEI Ming-qiang, GUO Yan-wen. Survey of Visual Image Saliency Detection [J]. Computer Science, 2020, 47(7): 84-91.
[11] WANG Wen-dao, WANG Run-ze, WEI Xin-lei, QI Yun-liang, MA Yi-de. Automatic Recognition of ECG Based on Stacked Bidirectional LSTM [J]. Computer Science, 2020, 47(7): 118-124.
[12] LIU Yan, WEN Jing. Complex Scene Text Detection Based on Attention Mechanism [J]. Computer Science, 2020, 47(7): 135-140.
[13] ZHANG Zhi-yang, ZHANG Feng-li, TAN Qi, WANG Rui-jin. Review of Information Cascade Prediction Methods Based on Deep Learning [J]. Computer Science, 2020, 47(7): 141-153.
[14] JIANG Wen-bin, FU Zhi, PENG Jing, ZHU Jian. 4Bit-based Gradient Compression Method for Distributed Deep Learning System [J]. Computer Science, 2020, 47(7): 220-226.
[15] ZHANG Ying, ZHANG Yi-fei, WANG Zhong-qing and WANG Hong-ling. Automatic Summarization Method Based on Primary and Secondary Relation Feature [J]. Computer Science, 2020, 47(6A): 6-11.
Full text



[1] LEI Li-hui and WANG Jing. Parallelization of LTL Model Checking Based on Possibility Measure[J]. Computer Science, 2018, 45(4): 71 -75 .
[2] CUI Qiong, LI Jian-hua, WANG Hong and NAN Ming-li. Resilience Analysis Model of Networked Command Information System Based on Node Repairability[J]. Computer Science, 2018, 45(4): 117 -121 .
[3] ZHU Shu-qin, WANG Wen-hong and LI Jun-qing. Chosen Plaintext Attack on Chaotic Image Encryption Algorithm Based on Perceptron Model[J]. Computer Science, 2018, 45(4): 178 -181 .
[4] HOU Yan-e, KONG Yun-feng and DANG Lan-xue. Greedy Randomized Adaptive Search Procedure Algorithm Combining Set Partitioning for Heterogeneous School Bus Routing Problems[J]. Computer Science, 2018, 45(4): 240 -246 .
[5] WEI Qin-shuang, WU You-xi, LIU Jing-yu and ZHU Huai-zhong. Distinguishing Sequence Patterns Mining Based on Density and Gap Constraints[J]. Computer Science, 2018, 45(4): 252 -256 .
[6] LI Hui, ZHOU Lin and XIN Wen-bo. Optimization of Networked Air-defense Operational Formation Structure Based on Bilevel Programming[J]. Computer Science, 2018, 45(4): 266 -272 .
[7] ZHAO Li-bo, LIU Qi, FU Fang-ling and HE Ling. Automatic Detection of Hypernasality Grades Based on Discrete Wavelet Transformation and Cepstrum Analysis[J]. Computer Science, 2018, 45(4): 278 -284 .
[8] DAI Wen-jing, YUAN Jia-bin. Survey on Hidden Subgroup Problem[J]. Computer Science, 2018, 45(6): 1 -8 .
[9] YANG Pei-an, WU Yang, SU Li-ya, LIU Bao-xu. Overview of Threat Intelligence Sharing Technologies in Cyberspace[J]. Computer Science, 2018, 45(6): 9 -18 .
[10] HU Ya-peng, DING Wei-long, WANG Gui-ling. Monitoring and Dispatching Service for Heterogeneous Big Data Computing Frameworks[J]. Computer Science, 2018, 45(6): 67 -71 .