Computer Science (计算机科学), 2025, Vol. 52, Issue (4): 327-335. doi: 10.11896/jsjkx.240100142
孟思江, 王宏霞, 曾强, 周炀
MENG Sijiang, WANG Hongxia, ZENG Qiang, ZHOU Yang
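Abstract: As digital platforms have matured and come into wide use, document images now circulate widely online. At the same time, advances in image-processing technology have increased the risk that document images are tampered with, so safeguarding their integrity and authenticity has become essential. To improve the accuracy of tampered-region localization in document images under real-world conditions, this paper proposes a multi-view document image tampering detection and localization method based on multi-scale fusion attention (Multi-View and Multi-Scale Fusion Attention Network, MM-Net). A multi-view encoder combines the RGB image, noise information, and character feature information to fully mine tampering features. In addition, MM-Net includes a multi-scale fusion attention module that lets features at different scales interact and enhances the key content information in the document image, thereby improving the precision of tampered-region localization. Extensive experiments on the large-scale DocTamper dataset show that MM-Net localizes tampered regions in document images more accurately, reaching F1 scores of 0.809, 0.807, and 0.774 on the test set and on the cross-domain sets FCD and SCD, respectively, and that it generalizes well and is robust.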
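The F1 scores reported in the abstract summarize how well a predicted tampering mask overlaps the ground-truth mask. As a minimal sketch, assuming the metric is computed per pixel on binary masks (the abstract does not state the exact protocol), it could look like the following. The function name `f1_score` and the toy masks are hypothetical and are not taken from the paper.

```python
def f1_score(pred_mask, true_mask):
    """Pixel-level F1 between two binary masks given as nested lists of 0/1.

    Assumption: 1 marks a tampered pixel, 0 an authentic one.
    """
    tp = fp = fn = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1      # tampered pixel correctly flagged
            elif p and not t:
                fp += 1      # authentic pixel wrongly flagged
            elif t and not p:
                fn += 1      # tampered pixel missed
    if tp == 0:
        # Convention chosen for this sketch: no true positives gives F1 = 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy 2x4 masks: 2 true positives, 1 false positive, 1 false negative.
pred = [[0, 1, 1, 0],
        [0, 1, 0, 0]]
true = [[0, 1, 1, 1],
        [0, 0, 0, 0]]
print(round(f1_score(pred, true), 3))
```

With these toy masks, precision and recall are both 2/3, so the F1 score is also 2/3. A real evaluation would threshold the network's output probabilities first and aggregate over the whole dataset.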