Computer Science ›› 2024, Vol. 51 ›› Issue (1): 198-206. doi: 10.11896/jsjkx.230500232

• Computer Graphics & Multimedia •

Seal Removal Based on Generative Adversarial Gated Convolutional Network

WU Guibin1, YANG Zongyuan1, XIONG Yongping1, ZHANG Xing2, WANG Wei2

  1 School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing 100876, China
  2 Artificial Intelligence Lab of China Resources Digital Technology, Shenzhen, Guangdong 518049, China
  • Received: 2023-05-31 Revised: 2023-09-15 Online: 2024-01-15 Published: 2024-01-12
  • Corresponding author: XIONG Yongping (ypxiong@bupt.edu.cn)
  • About author: WU Guibin (wuguibin@bupt.edu.cn), born in 1982, Ph.D. His main research interests include natural language processing, image processing and deep learning.
    XIONG Yongping, born in 1982, Ph.D., associate professor. His main research interests include document intelligence and OCR, visual IoT and machine vision.
  • Supported by:
    Science and Technology Project of State Grid Corporation of China Headquarters: Research on Key Technology of Multi-spectral Optical Imaging inside GIS Based on Fiber Bundle Image Transmission (5500-202216134A-1-1-ZN).

Abstract: Seals on invoices and documents seriously degrade text recognition accuracy, so seal elimination plays an important role in the preprocessing stage of document recognition and document enhancement. However, existing threshold segmentation methods and deep learning-based methods suffer from incomplete seal elimination and unwanted modification of background pixels. This paper proposes SealErase, a two-stage seal elimination network. The first stage is a U-shaped segmentation network that generates a binarized mask containing the seal position, and the second stage is an inpainting network that performs refined seal elimination. Because no public paired dataset for seal elimination is currently available, existing methods cannot use pixel-level evaluation metrics to measure the quality of the generated images; moreover, training a neural network on paired data can effectively improve its performance. To this end, the paper constructs a highly realistic seal elimination dataset of 8 000 samples, taking into account both generalization to real scenes and robustness to noise. The seals in the dataset are of two types: seals from real document images and synthetic seals. To evaluate SealErase objectively, the paper designs a comprehensive evaluation metric based on image generation quality and on the recognition accuracy of characters obscured by seals. Existing seal elimination models are compared on the constructed dataset, and the experimental results show that, relative to the state-of-the-art method, SealErase improves the peak signal-to-noise ratio by 26.79% and the mean structural similarity by 4.48%. After seal elimination by SealErase, the recognition accuracy of characters obscured by seals improves by 38.86%. SealErase also effectively eliminates seals and preserves the obscured characters in real scenes.
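The abstract names two ideas that a short sketch can make concrete: restricting changes to the seal mask so that background pixels stay untouched, and scoring the result with a pixel-level metric such as the peak signal-to-noise ratio. The Python sketch below is only illustrative and is not the authors' implementation. The function names `composite`, `psnr` and `relative_gain` are hypothetical, the mask compositing step is an assumption about how a binarized mask could be used, and reading the reported percentages as relative gains over a baseline is likewise an assumption.

```python
import numpy as np


def psnr(reference: np.ndarray, restored: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = reference.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def composite(stamped: np.ndarray, inpainted: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Assumed post-processing step: take inpainted pixels inside the seal
    mask and keep the original pixels everywhere else."""
    m = mask.astype(bool)
    if m.ndim == stamped.ndim - 1:
        m = m[..., None]  # broadcast an (H, W) mask over the colour channels
    return np.where(m, inpainted, stamped)


def relative_gain(baseline: float, ours: float) -> float:
    """Relative improvement in percent (assumed reading of the reported gains)."""
    return 100.0 * (ours - baseline) / baseline


if __name__ == "__main__":
    # Toy data only: a 4x4 "document" with a seal covering the top-left 2x2 block.
    clean = np.full((4, 4, 3), 255, dtype=np.uint8)
    stamped = clean.copy()
    stamped[:2, :2] = (200, 30, 30)
    mask = np.zeros((4, 4), dtype=np.uint8)
    mask[:2, :2] = 1
    inpainted = np.full((4, 4, 3), 250, dtype=np.uint8)  # stand-in for a network output
    restored = composite(stamped, inpainted, mask)
    print("PSNR before:", psnr(clean, stamped))
    print("PSNR after: ", psnr(clean, restored))
```

Because the compositing step copies every pixel outside the mask from the input, any error that remains in the output is confined to the masked region, which is one simple way to avoid the background modification problem described above.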

Key words: Seal removal, Image inpainting, Seal synthesis, Generative adversarial networks, Gated convolution, SealErase

CLC Number:

  • TP391