Class Discriminative Universal Adversarial Attack for Text Classification

doi:10.11896/jsjkx.220200077

Abstract

A universal adversarial attack (UAA) fools a text classifier by appending one fixed perturbation sequence to any input. Existing UAAs, however, attack textual examples from all classes indiscriminately, which easily attracts the attention of defense systems. For a stealthier attack, a simple and efficient class discriminative universal adversarial attack method is proposed, which has an obvious attack effect on textual examples from the targeted classes and limited influence on the non-targeted classes. In the white-box setting, multiple candidate perturbation sequences are searched using the average gradient of the perturbation sequence in each batch. The perturbation sequence with the smallest loss is selected for the next iteration, until no new perturbation sequence is generated. Comprehensive experiments are conducted on four public Chinese and English datasets with TextCNN and BiLSTM models to evaluate the effectiveness of the proposed method. Experimental results show that the proposed attack method can attack the targeted and non-targeted classes discriminatively and has a certain degree of transferability.
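The search loop described in the abstract can be illustrated with a small sketch. The abstract does not give the loss function, the candidate-scoring rule, or the model details, so everything below is an assumption made for illustration: a toy additive softmax classifier (`EMB`, `W`), a loss that lowers the true-class probability for targeted classes and preserves it for the others, and a HotFlip-style first-order score for ranking replacement tokens. It is not the authors' implementation.

```python
import math

# Toy white-box classifier (every value here is an illustrative assumption):
# logits = W @ (sum of token embeddings); classes 0=positive, 1=negative, 2=sports.
EMB = {
    "the": [0.0, 0.0], "good": [1.0, 0.0], "great": [2.0, 0.0],
    "bad": [-1.0, 0.0], "awful": [-2.0, 0.0], "match": [0.0, 1.0], "team": [0.0, 2.0],
}
W = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0]]
DIM, K = 2, 3

def probs(tokens):
    """Softmax class probabilities for a token list."""
    h = [sum(EMB[t][d] for t in tokens) for d in range(DIM)]
    z = [sum(W[k][d] * h[d] for d in range(DIM)) for k in range(K)]
    m = max(z)
    e = [math.exp(v - m) for v in z]
    return [v / sum(e) for v in e]

def predict(tokens):
    p = probs(tokens)
    return p.index(max(p))

def batch_loss(batch, trigger, targeted):
    """Assumed objective: push p[y] down for targeted classes, keep it up otherwise."""
    total = 0.0
    for x, y in batch:
        p = probs(x + trigger)  # the trigger is appended to the input
        q = 1.0 - p[y] if y in targeted else p[y]
        total -= math.log(max(q, 1e-12))
    return total / len(batch)

def avg_gradient(batch, trigger, targeted):
    """Batch-averaged gradient of the loss w.r.t. a trigger embedding.
    In this additive toy model all trigger positions share the same gradient."""
    g = [0.0] * DIM
    for x, y in batch:
        p = probs(x + trigger)
        if y in targeted:   # derivative of -log(1 - p[y]) w.r.t. the logits
            r = p[y] / max(1.0 - p[y], 1e-12)
            dz = [r * ((k == y) - p[k]) for k in range(K)]
        else:               # derivative of -log(p[y]) w.r.t. the logits
            dz = [p[k] - (k == y) for k in range(K)]
        for d in range(DIM):
            g[d] += sum(W[k][d] * dz[k] for k in range(K)) / len(batch)
    return g

def search_trigger(batches, trigger, targeted, top_k=2, max_iters=10):
    """Greedy search: keep the lowest-loss candidate until nothing changes."""
    for _ in range(max_iters):
        changed = False
        for batch in batches:
            g = avg_gradient(batch, trigger, targeted)
            best, best_loss = trigger, batch_loss(batch, trigger, targeted)
            for i, old in enumerate(trigger):
                def score(tok):
                    # First-order estimate of the loss change for swapping old -> tok
                    return sum((EMB[tok][d] - EMB[old][d]) * g[d] for d in range(DIM))
                for tok in sorted(EMB, key=score)[:top_k]:
                    cand = trigger[:i] + [tok] + trigger[i + 1:]
                    loss = batch_loss(batch, cand, targeted)
                    if loss < best_loss:
                        best, best_loss = cand, loss
            if best != trigger:
                trigger, changed = best, True
        if not changed:   # no new perturbation sequence was generated
            break
    return trigger

batch = [(["good", "great"], 0), (["great"], 0), (["bad"], 1),
         (["team", "team", "match"], 2)]
trigger = search_trigger([batch], ["the", "the"], targeted={0})
print(trigger, [predict(x + trigger) for x, _ in batch])
```

On this toy data the search should settle on a trigger that flips the class 0 inputs while leaving the predictions for the other two classes unchanged. A real attack would obtain the gradient from the victim model's embedding layer by backpropagation rather than from a closed-form expression.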

Key words: Class discriminative, Deep learning, Neural networks, Text classification, Universal adversarial attack

CLC Number: TP183

HAO Zhi-rong, CHEN Long, HUANG Jia-cheng. Class Discriminative Universal Adversarial Attack for Text Classification[J]. Computer Science, 2022, 49(8): 323-329.


