Forgery Face Detection Based on Multi-scale Transformer Fusing Multi-domain Information

doi:10.11896/jsjkx.220900048

Abstract: The proliferation of “face-swapping” fake videos generated with deep forgery techniques such as Deepfakes poses a considerable threat to citizens' privacy and national political security, which makes detecting deepfake faces in videos an important research problem. To address the insufficient facial feature extraction and weak generalization of existing forged face detection methods, this paper proposes a fake face detection method based on a multi-scale Transformer that fuses multi-domain information. First, following the idea of multi-domain feature fusion, features are extracted from both the frequency domain and the RGB domain of video frames to improve the model's generalization. Second, EfficientNet and a multi-scale Transformer are combined into a multi-level feature extraction network that extracts finer-grained forgery features. Test results on open-source datasets show that the proposed method achieves better detection performance than existing methods, and cross-dataset experiments show that the proposed model generalizes better.
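As a rough illustration of what "features from the frequency domain and the RGB domain" can mean at the input level, the sketch below derives a log-amplitude spectrum from a tiny frame and returns it alongside the unchanged RGB data. The abstract does not say which frequency transform the authors use, so the 2-D discrete Fourier transform, the BT.601 luma weights, and the function names `to_gray`, `dft2_log_amplitude`, and `two_domain_inputs` are all assumptions made for illustration, not the paper's method. The naive DFT is only practical for very small frames.

```python
import cmath
import math


def to_gray(rgb):
    """Convert an H x W grid of (r, g, b) tuples to luma (BT.601 weights, an assumption)."""
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row] for row in rgb]


def dft2_log_amplitude(gray):
    """Naive 2-D DFT of a small grayscale image; returns log(1 + |F(u, v)|)."""
    h, w = len(gray), len(gray[0])
    spectrum = [[0.0] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * math.pi * (u * y / h + v * x / w)
                    acc += gray[y][x] * cmath.exp(1j * angle)
            spectrum[u][v] = math.log1p(abs(acc))
    return spectrum


def two_domain_inputs(rgb):
    """Return (rgb, spectrum): one input per hypothetical network branch."""
    return rgb, dft2_log_amplitude(to_gray(rgb))


# A flat gray 4 x 4 frame: all spectral energy should sit in the DC term.
frame = [[(0.5, 0.5, 0.5)] * 4 for _ in range(4)]
rgb_input, freq_input = two_domain_inputs(frame)
```

For the flat frame above, only the zero-frequency coefficient is non-zero; real frames, and any blending artifacts they contain, would spread energy across the other coefficients. How the two inputs are encoded and fused afterwards is described in the full paper and is not reproduced here.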

Key words: Forgery face detection, Multi-scale Transformer, EfficientNet, Frequency domain features, Feature fusion

CLC Number: TP391

MA Xin, JI Lixin, LI Shaomei. Forgery Face Detection Based on Multi-scale Transformer Fusing Multi-domain Information [J]. Computer Science, 2023, 50(10): 112-118.
