Computer Science ›› 2023, Vol. 50 ›› Issue (1): 114-122. doi: 10.11896/jsjkx.211100269

• Computer Graphics & Multimedia •

Multitask Transformer-based Network for Image Splicing Manipulation Detection

ZHANG Jingyuan, WANG Hongxia, HE Peisong   

  1. School of Cyber Science and Engineering, Sichuan University, Chengdu 610065, China
  • Received: 2021-12-29 Revised: 2022-03-01 Online: 2023-01-15 Published: 2023-01-09
  • About author: ZHANG Jingyuan, born in 1996, postgraduate. Her main research interests include digital image forensics and deep learning.
    HE Peisong, born in 1991, Ph.D., associate professor. His main research interests include multimedia security and deep learning.
  • Supported by:
    Science and Technology Program of Sichuan Province (2022YFG0320), National Natural Science Foundation of China (61902263, 61972269), Fundamental Research Funds for Central Universities of Ministry of Education of China (YJ201881, 2020SCU12066) and China Postdoctoral Science Foundation (2020M673276).

Abstract: Most existing deep learning-based methods for image splicing forgery detection use convolutional layers to extract forensic features. However, a convolution kernel computes only locally, within a limited receptive field. Moreover, existing methods mainly use the location of the tampered regions to guide training, which makes it difficult to learn richer tampering-trace features. To overcome these limitations, a multitask Transformer-based network (MT-Net) is proposed for image splicing detection and localization. The self-attention mechanism of the Transformer is used in the encoder to learn pixel correlations, which assigns different attention levels to pixels and makes the detection network pay more attention to tampering traces. Meanwhile, MT-Net considers three subtasks simultaneously to guide the detection network to expose tampering traces from both local and global information: tampered edge detection, tampered area detection, and prediction of the tampered area's proportion. Finally, a specific loss function is designed for each subtask to better optimize the detection network during training. In experiments, MT-Net achieves better detection results than other state-of-the-art methods on three publicly available datasets, CASIA v2.0, Columbia and IMD2020, with F1 scores of 0.808, 0.913 and 0.675, respectively. The visualization results also demonstrate that the proposed method localizes splicing regions better.
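
To make the three subtasks and the reported metric concrete, the following is a minimal sketch in plain Python and not the authors' implementation. It derives an edge target and a proportion target from a binary ground-truth mask and computes an F1 score between two masks. The function names, the 4-neighbour edge definition, and the assumption that F1 is computed per pixel are illustrative choices that the abstract does not specify.

```python
from typing import List

Mask = List[List[int]]  # 1 = tampered pixel, 0 = authentic pixel


def tampered_proportion(mask: Mask) -> float:
    """Fraction of pixels marked as tampered (a global, image-level target)."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total


def tampered_edge(mask: Mask) -> Mask:
    """Mark tampered pixels that touch an authentic pixel or the image border.

    Assumption: a 4-neighbour boundary; the paper may define edges differently.
    """
    h, w = len(mask), len(mask[0])
    edge = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x] != 1:
                continue
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if not (0 <= ny < h and 0 <= nx < w) or mask[ny][nx] == 0:
                    edge[y][x] = 1
                    break
    return edge


def pixel_f1(pred: Mask, truth: Mask) -> float:
    """F1 over pixels between a binary prediction and the ground truth."""
    tp = fp = fn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            tp += int(p == 1 and t == 1)
            fp += int(p == 1 and t == 0)
            fn += int(p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention assumed here: two empty masks count as perfect agreement.
    return 2 * tp / denom if denom else 1.0


# Hypothetical 5x5 example: a 3x3 spliced block in the centre of the image.
truth = [[1 if 1 <= y <= 3 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]
# Hypothetical prediction that misses the bottom row of the block.
pred = [[1 if 1 <= y <= 2 and 1 <= x <= 3 else 0 for x in range(5)] for y in range(5)]

area_target = truth                              # tampered area (local)
edge_target = tampered_edge(truth)               # tampered edge (local)
proportion_target = tampered_proportion(truth)   # tampered proportion (global)
score = pixel_f1(pred, truth)
```

In this toy case the area and edge targets carry local information, while the proportion is a single global value per image, which mirrors the local and global split described in the abstract.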

Key words: Digital image forensics, Image splicing detection, Transformer, Self-attention mechanism, Multitask network

CLC Number: 

  • TP391