计算机科学 ›› 2023, Vol. 50 ›› Issue (8): 280-285. doi: 10.11896/jsjkx.221100124

• 信息安全 •

基于多模态特征融合的人脸物理对抗样本性能预测算法

周风帆1, 凌贺飞1, 张锦元2, 夏紫薇1, 史宇轩1, 李平1   

  1 华中科技大学计算机科学与技术学院 武汉 430074
    2 中国工商银行软件开发中心 广东 珠海 519080
  • 收稿日期:2022-11-15 修回日期:2023-03-04 出版日期:2023-08-15 发布日期:2023-08-02
  • 通讯作者: 凌贺飞(lhefei@hust.edu.cn)
  • 作者简介:周风帆(ffzhou@hust.edu.cn)
  • 基金资助:
    国家自然科学基金(61972169);国家重点研发计划(2019QY(Y)0202,2022YFB2601802);湖北省重点研发计划(2022BAA046,2022BAA042);武汉基础研究知识创新项目(2020010601012182);中国博士后科学基金(2022M711251)

Facial Physical Adversarial Example Performance Prediction Algorithm Based on Multi-modal Feature Fusion

ZHOU Fengfan1, LING Hefei1, ZHANG Jinyuan2, XIA Ziwei1, SHI Yuxuan1, LI Ping1   

  1 School of Computer Science and Technology,Huazhong University of Science and Technology,Wuhan 430074,China
    2 Software Development Center,Industrial and Commercial Bank of China,Zhuhai,Guangdong 519080,China
  • Received:2022-11-15 Revised:2023-03-04 Online:2023-08-15 Published:2023-08-02
  • About author:ZHOU Fengfan,born in 1998,Ph.D. His main research interest is adversarial attacks on face recognition.
    LING Hefei,born in 1976,Ph.D supervisor. His main research interest is computer vision.
  • Supported by:
    National Natural Science Foundation of China(61972169),National Key Research and Development Program of China(2019QY(Y)0202,2022YFB2601802),Major Scientific and Technological Project of Hubei Province(2022BAA046,2022BAA042),Research Programme on Applied Fundamentals and Frontier Technologies of Wuhan(2020010601012182) and China Postdoctoral Science Foundation(2022M711251).

摘要: 人脸物理对抗样本攻击(Facial Physical Adversarial Attack,FPAA)指攻击者通过粘贴或佩戴物理对抗样本,如打印的眼镜、纸片等,在摄像头下被识别成特定目标的人脸,或者让人脸识别系统无法识别的攻击方式。已有FPAA的性能评测会受到多种环境因素的影响,且需要多个人工操作的环节,导致性能评测效率非常低下。为了减少人脸物理对抗样本性能评测方面的工作量,结合数字图片和环境因素之间的多模态性,提出了多模态特征融合预测算法(Multimodal Feature Fusion Prediction Algorithm,MFFP)。具体地,使用不同的网络提取攻击者人脸图片、受害者人脸图片和人脸数字对抗样本图片的特征,使用环境特征网络来提取环境因素中的特征,然后使用一个多模态特征融合网络对这些特征进行融合,多模态特征融合网络的输出即为所预测的人脸物理对抗样本图片和受害者图片之间的余弦相似度。MFFP算法在未知环境、未知FPAA算法的实验场景下取得了0.003的回归均方误差,其性能优于对比算法,验证了MFFP算法对FPAA性能预测的准确性,可以对FPAA性能进行快速评估,同时大幅降低人工操作的工作量。

关键词: 人工智能安全, 对抗样本, 人脸物理对抗样本攻击, 性能预测, 多模态特征融合

Abstract: Facial physical adversarial attack (FPAA) refers to an attack in which an attacker pastes or wears physical adversarial examples, such as printed glasses or paper patches, so that under a camera a face recognition system recognizes his face as that of a specific target, or fails to recognize his face at all. The existing performance evaluation process for FPAAs is affected by multiple environmental factors and requires multiple manual operations, resulting in very low evaluation efficiency. To reduce the workload of evaluating the performance of facial physical adversarial examples, a multi-modal feature fusion prediction algorithm (MFFP) is proposed, which exploits the multi-modality between digital images and environmental factors. Specifically, different networks are used to extract the features of the attacker's face image, the victim's face image and the facial digital adversarial example image, and the proposed environmental feature extraction network is used to extract features from the environmental factors. A multi-modal feature fusion network is then proposed to fuse these features, and its output is the predicted cosine similarity between the facial physical adversarial example image and the victim image. The MFFP algorithm achieves a regression mean square error of 0.003 in the experimental scenario of unknown environments and unknown FPAA algorithms, outperforming the compared algorithms. This verifies the accuracy of MFFP in predicting FPAA performance, and shows that MFFP can quickly evaluate FPAA performance while greatly reducing the workload of manual operation.
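The abstract describes the MFFP pipeline only at a high level. The following is a minimal sketch, assuming PyTorch, ResNet-18 image branches, an illustrative 8-dimensional environmental-factor vector and a simple MLP fusion head, of how such a predictor could be wired together and trained with the regression MSE mentioned above; all class and parameter names (MFFPredictor, env_dim, feat_dim) are hypothetical and this is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision.models as models


class MFFPredictor(nn.Module):
    """Sketch of a multi-modal feature fusion performance predictor."""

    def __init__(self, env_dim: int = 8, feat_dim: int = 256):
        super().__init__()

        def image_encoder() -> nn.Module:
            # A ResNet-18 stands in for the (unspecified) face feature networks.
            backbone = models.resnet18(weights=None)
            backbone.fc = nn.Linear(backbone.fc.in_features, feat_dim)
            return backbone

        # One branch per image modality: attacker face, victim face,
        # and the facial digital adversarial example.
        self.attacker_net = image_encoder()
        self.victim_net = image_encoder()
        self.digital_adv_net = image_encoder()

        # Environmental feature network for tabular factors such as
        # illumination, shooting distance or camera pose (env_dim is assumed).
        self.env_net = nn.Sequential(
            nn.Linear(env_dim, 64), nn.ReLU(),
            nn.Linear(64, feat_dim), nn.ReLU(),
        )

        # Multi-modal feature fusion network: concatenate the four modality
        # features and regress a similarity score in [-1, 1].
        self.fusion = nn.Sequential(
            nn.Linear(4 * feat_dim, 512), nn.ReLU(),
            nn.Linear(512, 128), nn.ReLU(),
            nn.Linear(128, 1), nn.Tanh(),
        )

    def forward(self, attacker_img, victim_img, digital_adv_img, env_factors):
        fused = torch.cat([
            self.attacker_net(attacker_img),
            self.victim_net(victim_img),
            self.digital_adv_net(digital_adv_img),
            self.env_net(env_factors),
        ], dim=1)
        # Predicted cosine similarity between the physical adversarial example
        # image (as captured under the given environment) and the victim image.
        return self.fusion(fused).squeeze(1)


if __name__ == "__main__":
    model = MFFPredictor()
    atk, vic, adv = (torch.randn(2, 3, 112, 112) for _ in range(3))
    env = torch.randn(2, 8)                # dummy environmental factors
    target = torch.rand(2)                 # measured physical cosine similarity
    pred = model(atk, vic, adv, env)
    loss = nn.functional.mse_loss(pred, target)  # regression MSE, as reported
    loss.backward()
    print(pred.shape, float(loss))
```

In the reported setting such a predictor would be trained on measured (image, environment, cosine similarity) samples and evaluated on unseen environments and unseen FPAA algorithms; only the MSE objective is taken from the abstract, the rest of the snippet is an assumed stand-in.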

Key words: Artificial intelligence security, Adversarial example, Facial physical adversarial attack, Performance prediction, Multimodal feature fusion

中图分类号: TP391