Computer Science, 2022, Vol. 49, Issue (11A): 220100106-9. doi: 10.11896/jsjkx.220100106
李颖1, 边山1,2,3, 王春桃1,2, 黄琼1,2
LI Ying1, BIAN Shan1,2,3, WANG Chun-tao1,2, HUANG Qiong1,2
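Abstract: Deepfake is a deep network technique based on generative adversarial networks (GANs) that uses source and target faces to generate highly realistic face videos that are difficult to distinguish from genuine ones. If criminals use this technology to create fake videos and spread rumors on the Internet, they infringe on personal portrait rights, cause adverse social impact, and may even trigger serious legal disputes. In response to this serious threat, many research institutions in China and abroad have paid close attention to deepfake detection and have proposed a number of detection methods. Existing methods achieve good results on high-quality videos. In everyday use, however, videos are usually shared through social media applications and are thereby compressed into low-quality videos. On such low-quality datasets, the accuracy of most existing forged-face detection methods drops markedly, and their cross-dataset detection performance is also unsatisfactory. To address these limitations, this paper proposes a two-stream network structure based on the Xception model with attention mechanisms. The network contains an RGB branch that uses multiple attention mechanisms and a frequency-domain branch that captures the artifacts of low-quality videos. The study finds that the subtle differences between real and forged images are concentrated mostly in local regions. The multi-attention RGB branch therefore makes the model attend to different regions of the face and, guided by the attention maps, produces a global feature that aggregates low-level texture features and high-level semantic features. The frequency-domain branch introduces the discrete cosine transform (DCT) as its frequency transform, giving the image a feature representation complementary to that of the RGB branch; this branch can reveal subtle forgery traces or compression errors. To verify the effectiveness of the network, extensive comparative experiments are conducted on three public datasets: FaceForensics++, Celeb-DF, and DFDC. The experimental results show that the proposed algorithm outperforms existing detection algorithms on low-quality video sets and that the proposed model performs better in cross-dataset scenarios. This verifies that combining the attention-guided RGB features with the frequency-domain features improves the robustness of the detection model on low-quality videos and in cross-dataset settings.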
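The abstract names the discrete cosine transform as the frequency transform of the frequency-domain branch but gives no implementation details. The following is a minimal sketch of that single step only, not the authors' pipeline: it computes an orthonormal 2D DCT-II of a grayscale face crop and keeps the higher-frequency part, where compression errors and fine forgery traces would be expected to show up. The function names (`dct_matrix`, `dct2`, `idct2`, `high_frequency_residual`) and the `cutoff` parameter are hypothetical and chosen for illustration.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the n x n orthonormal DCT-II basis matrix."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)       # the DC row uses a different scale
    return m


def dct2(image: np.ndarray) -> np.ndarray:
    """2D DCT-II of a single-channel image (rows, then columns)."""
    h, w = image.shape
    return dct_matrix(h) @ image @ dct_matrix(w).T


def idct2(coeffs: np.ndarray) -> np.ndarray:
    """Inverse of dct2; valid because the basis matrices are orthonormal."""
    h, w = coeffs.shape
    return dct_matrix(h).T @ coeffs @ dct_matrix(w)


def high_frequency_residual(gray: np.ndarray, cutoff: int) -> np.ndarray:
    """Zero the coefficients with u + v < cutoff and transform back.

    The cutoff rule is an assumption made for this sketch; the paper may
    select or weight frequency bands differently.
    """
    h, w = gray.shape
    u = np.arange(h).reshape(-1, 1)
    v = np.arange(w).reshape(1, -1)
    mask = (u + v) >= cutoff
    return idct2(dct2(gray) * mask)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    face = rng.random((64, 64))  # stand-in for a grayscale face crop
    residual = high_frequency_residual(face, cutoff=16)
    print(residual.shape)
```

In a two-stream design of the kind described above, a map like `residual` (or the DCT coefficients themselves) would be fed to the frequency branch, while the original RGB crop goes to the attention-guided branch.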