Multi-view and Multi-scale Fusion Attention Network for Document Image Forgery Localization

doi:10.11896/jsjkx.240100142

Abstract

As digital platforms have matured and spread, document images now circulate widely on the Internet. At the same time, advances in image processing technology have increased the risk of document image tampering, making it crucial to ensure the integrity and authenticity of document images. In this paper, we propose a multi-view and multi-scale fusion attention network (MM-Net) that aims to improve the accuracy of document image forgery localization in real-world scenarios. We adopt a multi-view encoder that combines RGB information, noise information, and character information to fully extract tampering features. A multi-scale fusion attention module is designed to facilitate interaction among multi-scale features, thereby enhancing important content information in document images. Extensive experiments on the large-scale DocTamper dataset demonstrate that the proposed MM-Net localizes tampered regions in document images more precisely, with F-scores of 0.809, 0.807, and 0.774 on the test set and the cross-domain sets FCD and SCD, respectively. Moreover, MM-Net exhibits good generalizability and robustness.
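The F-scores quoted above measure how well a predicted tampering mask overlaps the ground-truth mask. The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch of a standard pixel-level F-score, assuming binary masks in which 1 marks a tampered pixel. The function name `pixel_f_score` and the toy masks are illustrative and do not come from the paper.

```python
from typing import Sequence


def pixel_f_score(pred: Sequence[Sequence[int]],
                  truth: Sequence[Sequence[int]]) -> float:
    """Pixel-level F-score between two binary masks (1 = tampered).

    Assumption: both masks have the same shape. This is the generic
    F1 definition, not necessarily the paper's exact protocol.
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # tampered pixel correctly flagged
            elif p and not t:
                fp += 1          # authentic pixel wrongly flagged
            elif t and not p:
                fn += 1          # tampered pixel missed
    if tp == 0:
        return 0.0               # no overlap: define the score as 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x4 masks (hypothetical data): 3 true positives,
# 1 false positive, 1 false negative.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 1, 1]]
score = pixel_f_score(pred, truth)
print(f"F-score: {score:.3f}")
```

In this toy case precision and recall are both 3/4, so the harmonic mean is also 3/4. A dataset-level figure would typically aggregate such scores over all test images, although how that aggregation is done is not stated here.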

Key words: Document image forgery detection, Deep learning, Multi-scale, Digital image forensics, Multi-view

CLC Number: TP391

MENG Sijiang, WANG Hongxia, ZENG Qiang, ZHOU Yang. Multi-view and Multi-scale Fusion Attention Network for Document Image Forgery Localization [J]. Computer Science, 2025, 52(4): 327-335.


