Deepfake Detection Method Based on Positional Enhancement and Frequency Domain Component Interaction

doi:10.11896/jsjkx.250700070

Abstract

With the rapid development of Deepfake technology, forged facial images and videos have become increasingly prevalent on social media. These techniques are also being exploited maliciously, posing a serious threat to social security. Existing detection methods perform well on forged faces from known (in-domain) datasets, but their performance degrades significantly on forged faces from unseen datasets. To address this problem, a Deepfake detection method based on positional enhancement and frequency domain component interaction is proposed, aiming to improve the robustness and generalization of Deepfake face detection. First, Vision Transformer is employed as the backbone network to capture forgery traces from a global perspective. Second, a dynamic local feature extraction module is designed: it uses channel-wise and point-wise convolutions to extract local features and dynamically weights each pixel according to its importance in the feature representation, thereby refining the local features and improving sensitivity to them. In parallel, a multi-scale feature extraction and positional enhancement module is constructed: it obtains multi-scale features through convolutions with multiple dilation rates and introduces a positional enhancement mechanism that strengthens the positional correlations between pixels, so that multi-scale information from different regions is extracted effectively. Then, a global-local frequency domain component interaction module is designed: a frequency domain decomposition attention mechanism enables information exchange between different frequency components and captures the dependencies between global and local features, recovering artifacts that disappear in RGB space when the quality of forged facial images degrades. Finally, a pixel relationship similarity loss function is designed to compute the positional relationship loss between pixels, and it is combined with the cross-entropy loss to form a joint loss function that improves detection accuracy. Experimental results show that the proposed method achieves AUC scores of 99.29% and 78.62% on the FF++ and Celeb-DF datasets, respectively, indicating that it improves the robustness and generalization of Deepfake face detection.

Key words: Feature extraction, Positional enhancement, Frequency domain component interaction, Joint loss, Deepfake detection
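To illustrate the idea behind obtaining multi-scale features from convolutions with several dilation rates, the following is a minimal one-dimensional sketch in plain Python. It is not the authors' module: the function names (`dilated_conv1d`, `multi_dilation_features`, `receptive_field`), the 3-tap kernel, and the dilation rates (1, 2, 4) are all assumptions chosen for illustration, and the abstract does not specify the actual rates or layer configuration.

```python
from typing import Dict, List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Same-length 1-D dilated cross-correlation with zero padding.

    Kernel taps are spaced `dilation` samples apart, so a larger dilation
    looks at a wider neighbourhood with the same number of weights.
    """
    center = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):  # positions outside the signal count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


def multi_dilation_features(signal: Sequence[float], kernel: Sequence[float],
                            dilations: Sequence[int] = (1, 2, 4)) -> Dict[int, List[float]]:
    """One response per (assumed) dilation rate; together they form a multi-scale view."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Span of input positions covered by a single dilated kernel."""
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 0, 1, 0, 0, 0, 0]  # unit impulse at index 4
    kernel = [1.0, 1.0, 1.0]               # hypothetical 3-tap kernel
    for d, response in multi_dilation_features(impulse, kernel).items():
        print(f"dilation={d} receptive_field={receptive_field(len(kernel), d)} {response}")
```

With the unit impulse above, the non-zero responses fall at indices 3 to 5 for dilation 1, at indices 2, 4, 6 for dilation 2, and at indices 0, 4, 8 for dilation 4, which shows how the same three weights cover progressively larger regions. In a two-dimensional network, the per-rate outputs would typically be concatenated or summed along the channel dimension; how the proposed method fuses them is not stated in the abstract.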

CLC number: TP391.41

MENG Siyu, NIU Chunxiang, TAN Quange, WANG Rong. Deepfake Detection Method Based on Positional Enhancement and Frequency Domain Component Interaction[J]. Computer Science, 2026, 53(4): 445-453. https://doi.org/10.11896/jsjkx.250700070


