Class Discriminative Universal Adversarial Attack for Text Classification

Computer Science, 2022, Vol. 49, Issue (8): 323-329. doi: 10.11896/jsjkx.220200077

HAO Zhi-rong¹, CHEN Long¹,², HUANG Jia-cheng¹

1 School of Computer Science and Technology, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
2 School of Cyber Security and Information Law, Chongqing University of Posts and Telecommunications, Chongqing 400065, China

Received: 2022-02-15 Revised: 2022-03-24 Published: 2022-08-02
Corresponding author: CHEN Long (chenlong@cqupt.edu.cn)
Author contact: s190201031@stu.cqupt.edu.cn
About author: HAO Zhi-rong, born in 1997, postgraduate. His main research interests include adversarial examples and natural language processing.
CHEN Long, born in 1970, professor, Ph.D. supervisor. His main research interests include digital forensics and AI security.
Supported by:
Key Cooperation Project of Chongqing Municipal Education Commission (HZ2021008).

Abstract

Abstract: A universal adversarial attack (UAA) can fool a text classifier by adding a single fixed perturbation sequence to any input. Existing UAAs, however, attack text examples from all classes indiscriminately, which easily attracts the attention of a defense system. To make the attack stealthier, this paper proposes a simple and efficient class discriminative universal adversarial attack method, which has a pronounced attack effect on text examples from the targeted classes while affecting the non-targeted classes as little as possible. In the white-box setting, multiple candidate perturbation sequences are found by searching with the average gradient of the perturbation sequence over each batch. The candidate with the smallest loss is selected for the next iteration, and the process repeats until no new perturbation sequence is produced. Extensive experiments are conducted on four public Chinese and English datasets with the neural network models TextCNN and BiLSTM to evaluate the effectiveness of the proposed method. The results show that the method can attack targeted and non-targeted classes discriminatively and has a degree of transferability.

Key words: Class discriminative, Deep learning, Neural networks, Text classification, Universal adversarial attack
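The search loop summarized in the abstract can be illustrated with a small, self-contained sketch. This is not the authors' implementation: the victim model (mean-pooled embeddings plus a linear layer), the signed cross-entropy loss, the HotFlip-style first-order candidate scoring, the synthetic data, and all names (`search`, `trigger_grad`, `TOP_K`, and so on) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
V, D, C = 50, 8, 3            # vocabulary size, embedding dim, number of classes
TRIGGER_LEN, TOP_K = 3, 5     # perturbation length, candidates per position

# Hypothetical white-box victim: mean-pooled embeddings + linear layer.
E = rng.normal(size=(V, D))
W = rng.normal(size=(C, D))
b = np.zeros(C)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(trigger, X):
    """Prepend the trigger to every row of X and return class probabilities."""
    tokens = np.concatenate([np.tile(trigger, (len(X), 1)), X], axis=1)
    return softmax(E[tokens].mean(axis=1) @ W.T + b)

def weights(y, target):
    """Per-sample signs: -1 (raise the loss) on the targeted class, +1 elsewhere."""
    return np.where(y == target, -1.0, 1.0)

def loss(trigger, X, y, target):
    """Assumed objective: signed mean cross-entropy, lower is better for the attacker."""
    p = forward(trigger, X)
    ce = -np.log(p[np.arange(len(y)), y] + 1e-12)
    return float(np.mean(weights(y, target) * ce))

def trigger_grad(trigger, X, y, target):
    """Batch-averaged gradient of the loss w.r.t. each trigger embedding."""
    p = forward(trigger, X)
    p[np.arange(len(y)), y] -= 1.0                  # dCE/dlogits = p - onehot
    d_h = (weights(y, target)[:, None] * p) @ W     # (B, D): per-sample dloss/dh
    n_tokens = X.shape[1] + len(trigger)
    g = d_h.mean(axis=0) / n_tokens                 # mean pooling spreads it evenly
    return np.tile(g, (len(trigger), 1))

def search(X, y, target, batch_size=32, max_rounds=20):
    trigger = np.zeros(TRIGGER_LEN, dtype=int)      # start from token id 0
    for _ in range(max_rounds):
        changed = False
        for start in range(0, len(X), batch_size):
            xb, yb = X[start:start + batch_size], y[start:start + batch_size]
            g = trigger_grad(trigger, xb, yb, target)
            best, best_loss = trigger, loss(trigger, xb, yb, target)
            for pos in range(TRIGGER_LEN):
                # First-order estimate of the loss change for every vocabulary token.
                scores = (E - E[trigger[pos]]) @ g[pos]
                for tok in np.argsort(scores)[:TOP_K]:
                    cand = trigger.copy()
                    cand[pos] = tok
                    cand_loss = loss(cand, xb, yb, target)
                    if cand_loss < best_loss:       # keep the smallest-loss candidate
                        best, best_loss = cand, cand_loss
            if not np.array_equal(best, trigger):
                trigger, changed = best, True
        if not changed:                             # no new perturbation sequence
            break
    return trigger

def accuracy(trigger, X, y, mask):
    pred = forward(trigger, X).argmax(axis=1)
    return float((pred[mask] == y[mask]).mean())

# Synthetic data: random token ids, labelled with the model's clean predictions.
X = rng.integers(0, V, size=(200, 10))
y = forward(np.array([], dtype=int), X).argmax(axis=1)
target = int(np.bincount(y, minlength=C).argmax())  # attack the most frequent class
trigger = search(X, y, target)
print("trigger token ids:", trigger)
print("targeted-class accuracy:", accuracy(trigger, X, y, y == target))
print("non-targeted accuracy:", accuracy(trigger, X, y, y != target))
```

In this toy setting a successful class discriminative trigger would lower accuracy on the targeted class while leaving the non-targeted accuracy close to its clean value. The `max_rounds` cap is an added safeguard for the sketch, since per-batch updates are not guaranteed to settle on their own.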

CLC number: TP183

HAO Zhi-rong, CHEN Long, HUANG Jia-cheng. Class Discriminative Universal Adversarial Attack for Text Classification[J]. Computer Science, 2022, 49(8): 323-329. https://doi.org/10.11896/jsjkx.220200077
