Computer Science ›› 2023, Vol. 50 ›› Issue (11): 160-167. doi: 10.11896/jsjkx.221100109

• Computer Graphics & Multimedia •

Deepfake Face Tampering Video Detection Method Based on Non-critical Masks and Attention Mechanism

YU Yang, YUAN Jiabin, CAI Jiyuan, ZHA Keke, CHEN Zhangyu, DAI Jiawei, FENG Yuxiang

  1. College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China
  • Received: 2022-11-14 Revised: 2023-03-09 Online: 2023-11-15 Published: 2023-11-06
  • Corresponding author: YUAN Jiabin (jbyuan@nuaa.edu.cn)
  • About author: YU Yang (yu_yang@nuaa.edu.cn), born in 1995, postgraduate. His main research interests include deep learning and deepfake detection. YUAN Jiabin, born in 1968, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include high-performance computing, quantum computing, deep learning, medical image processing, etc.
  • Supported by:
    National Natural Science Foundation of China (62076127).

Abstract: Since Deepfake technology was introduced, its illegal use has harmed individuals, society, and national security, and it poses serious hidden risks. Deepfake detection for face videos has therefore become a hot and difficult problem in computer vision. To address this problem, this paper proposes a deepfake video detection method based on non-critical masks and the CA_S3D model. The method first divides the face image into key regions and non-critical regions. Masking the non-critical regions increases the deep neural network's attention to the key regions of the face image and reduces the influence and interference of irrelevant information. The method then introduces a contextual attention module into the S3D network, which strengthens the capture of long-range dependencies in the sample data and increases attention to key channels and features. Experimental results show that the method clearly improves performance on the DFDC dataset: accuracy rises from 83.85% to 90.10%, and the AUC rises from 0.931 to 0.979. In a comparison with existing deepfake video detection methods, the proposed method performs better, which verifies its effectiveness.

Key words: Deepfake, Deepfake detection, Image mask, 3D CNNs, Attention mechanism
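
The abstract names two steps but gives no implementation detail, so the following is only a minimal sketch under stated assumptions, not the authors' code. It assumes key regions are given as rectangular boxes and that masked pixels are filled with a constant. It also assumes the contextual attention behaves like a simplified global-context pooling: a softmax over positions followed by an additive fusion, with no learned transform on the context vector. The names `mask_non_key_regions`, `global_context_attention`, `key_boxes`, and `key_weights` are hypothetical.

```python
import math


def mask_non_key_regions(image, key_boxes, fill=0):
    """Return a copy of `image` with pixels outside every key box set to `fill`.

    image: 2D list of pixel values (rows of a grayscale face crop).
    key_boxes: list of (top, left, bottom, right); bottom/right are exclusive.
    Rectangular key regions are an assumption of this sketch.
    """
    height, width = len(image), len(image[0])
    keep = [[False] * width for _ in range(height)]
    for top, left, bottom, right in key_boxes:
        for r in range(max(0, top), min(height, bottom)):
            for c in range(max(0, left), min(width, right)):
                keep[r][c] = True
    return [
        [image[r][c] if keep[r][c] else fill for c in range(width)]
        for r in range(height)
    ]


def global_context_attention(features, key_weights):
    """Add a softmax-weighted global context vector to every position.

    features: list of N positions, each a list of C channel values.
    key_weights: C scoring weights (hypothetical stand-in for learned parameters).
    A full module would usually transform the context before fusing it.
    """
    logits = [sum(w * x for w, x in zip(key_weights, pos)) for pos in features]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    attention = [e / total for e in exps]
    channels = len(features[0])
    context = [
        sum(a * pos[ch] for a, pos in zip(attention, features))
        for ch in range(channels)
    ]
    return [[x + ctx for x, ctx in zip(pos, context)] for pos in features]


# Toy 3x4 "face crop" whose top-left 2x2 block plays the role of a key region.
face = [[1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12]]
masked = mask_non_key_regions(face, key_boxes=[(0, 0, 2, 2)])
print(masked)  # [[1, 2, 0, 0], [5, 6, 0, 0], [0, 0, 0, 0]]

# Two positions with two channels each; zero weights give uniform attention.
attended = global_context_attention([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
print(attended)  # [[1.5, 0.5], [0.5, 1.5]]
```

In this toy run, masking leaves only the key block for the network to look at, and every position receives the same context vector, which is how a global pooling step can carry long-range information to each location.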

CLC Number: TP391.41