Deepfake Images Detection Based on Quantitative Data Features Statistics

doi:10.11896/jsjkx.230300013

Abstract

Because of its low barrier to entry, high efficiency, and high realism, deepfake technology is abused to forge identities, and the resulting personal information security problems pose serious challenges to public security governance. Mainstream deepfake image detection currently relies on convolutional features, and quantitative features are rarely used, even though they occupy little storage and are cheap to compute. This paper explores how strongly the texture and color features of each color component of an image are associated with its authenticity, selects the effective features for automatic deepfake image detection, and studies the value of quantitative features in deepfake image identification. The experiments use 40 000 sample images from the deepfake face dataset ForgeryNet, divided into four groups. Texture and color features are extracted from each group in the Gray, YCrCb, Lab, HSV and RGB color spaces, and features that show both a significant difference and a correlation are screened with multivariate statistics, namely the Mann-Whitney U test and point-biserial correlation analysis. XGBoost, a logistic regression classifier (LR), a linear SVM (LSVM), a multilayer perceptron (MLP) and TabNet are then used to verify the selected features, and the results are compared with mainstream convolutional neural networks. Among the five algorithms, XGBoost and LSVM classify best, MLP and LR perform worse, and TabNet is unstable: its accuracy depends strongly on the classification type and ranges from 52% to 89%. Features screened by statistical analysis improve detection accuracy. For example, in the real-versus-fake image group, XGBoost is 1.10% and 1.43% more accurate with the screened features and with the texture features, respectively, than with all features, and LSVM and MLP are 0.12% and 0.10% more accurate, respectively, with the texture features than with all features. With the quantitative features screened in the color spaces, detection accuracy is higher in all cases than that of the mainstream convolutional neural networks. Texture features give better results than color features, and identity-replacement deepfake images are easier to recognize. Compared with convolutional image features, quantitative features are more interpretable and are of considerable value in forensic identification.

Key words: Image texture features, Image color features, Deepfake detection, Data statistics, Algorithm comparison
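The screening step named in the abstract (keep a feature only if it differs significantly between real and fake images and also correlates with the label) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses only the Python standard library, approximates the Mann-Whitney U test with a normal approximation that has no tie or continuity correction, and runs on synthetic data. The thresholds `alpha` and `min_abs_r` and the feature names are assumptions made for illustration.

```python
import math
import random


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_p(real, fake):
    """Two-sided p-value from the normal approximation of the U statistic."""
    n1, n2 = len(real), len(fake)
    ranks = average_ranks(list(real) + list(fake))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2))


def point_biserial(labels, values):
    """Pearson correlation between a 0/1 label and a continuous feature."""
    n = len(values)
    mean_l = sum(labels) / n
    mean_v = sum(values) / n
    cov = sum((l - mean_l) * (v - mean_v) for l, v in zip(labels, values))
    var_l = sum((l - mean_l) ** 2 for l in labels)
    var_v = sum((v - mean_v) ** 2 for v in values)
    return cov / math.sqrt(var_l * var_v)


def screen_features(table, labels, alpha=0.05, min_abs_r=0.1):
    """Keep features that differ significantly AND correlate with the label.

    alpha and min_abs_r are illustrative thresholds, not values from the paper.
    """
    kept = []
    for name, values in table.items():
        real = [v for v, l in zip(values, labels) if l == 0]
        fake = [v for v, l in zip(values, labels) if l == 1]
        p = mann_whitney_p(real, fake)
        r = point_biserial(labels, values)
        if p < alpha and abs(r) >= min_abs_r:
            kept.append(name)
    return kept


# Synthetic stand-in data (NOT from ForgeryNet): 0 = real, 1 = fake.
rng = random.Random(0)
labels = [0] * 200 + [1] * 200
table = {
    # hypothetical feature whose distribution shifts for fake images
    "texture_contrast_Cr": [rng.gauss(0.0 if l == 0 else 1.0, 1.0) for l in labels],
    # hypothetical feature unrelated to the label
    "color_mean_H": [rng.gauss(0.0, 1.0) for _ in labels],
}
selected = screen_features(table, labels)
print(selected)
```

In practice a statistics library would normally supply both tests. The hand-written versions here only keep the sketch self-contained, and the selected feature columns would then be passed to a classifier such as XGBoost or a linear SVM.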

CLC number: TP309

XIE Fei, GAO Shuhui. Deepfake Images Detection Based on Quantitative Data Features Statistics[J]. Computer Science, 2023, 50(11A): 230300013-9. https://doi.org/10.11896/jsjkx.230300013

