Computer Science, 2020, Vol. 47, Issue (7): 135-140. doi: 10.11896/jsjkx.190600157

Complex Scene Text Detection Based on Attention Mechanism

LIU Yan, WEN Jing   

  1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China
  • Received: 2019-06-26 Online: 2020-07-15 Published: 2020-07-16
  • About author: LIU Yan, born in 1990, master's degree. Her main research interests include computer vision and related topics.
    WEN Jing, born in 1982, Ph.D., associate professor, master's supervisor, is a member of the China Computer Federation. Her main research interests include computer vision, image processing and pattern recognition.
  • Supported by:
    This work was supported by the Young Scientists Fund of the National Natural Science Foundation of China (61703252), the 1331 Engineering Project of Shanxi Province, and the Shanxi Province Applied Basic Research Programs (201701D121053).

Abstract: Most traditional text detection methods work bottom-up: they start with low-level character or stroke detection, followed by non-text component filtering, text line construction, and text line validation. However, the shape, scale, layout and surrounding environment of characters in complex scenes vary drastically, and humans detect text at a variety of visual granularities. Because these bottom-up methods depend on low-level features, it is difficult for them to preserve text features across different resolutions. Recently, deep learning methods have been widely used in text detection to extract features at multiple scales. In existing methods, however, key feature information is not emphasized during feature extraction at each layer and is lost in the layer-to-layer feature mapping. The missing information leads to many false alarms and missed detections, which in turn increases processing time. This paper proposes a complex scene text detection method based on the attention mechanism. Its main contribution is to introduce a visual attention layer into VGG16 and to use the attention mechanism to enhance the salient information within the network's global information. Experiments in an Ubuntu environment with a GPU show that the method preserves the integrity of text regions when detecting text in complex scene images, reduces fragmentation of the detected regions, and achieves a recall of up to 87% and a precision of up to 89%.
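
The abstract does not specify how the visual attention layer is built, so the following is only a minimal sketch of one common form of spatial attention, not the authors' layer. It assumes a feature map given as C channels of H x W values (as a VGG16 convolutional stage would produce), scores each location by its mean activation across channels, normalizes the scores with a softmax, and reweights the features residually. The function name `spatial_attention` and the scoring rule are hypothetical choices made for illustration.

```python
import math


def spatial_attention(feature_map):
    """Reweight a feature map by a softmax spatial attention mask.

    feature_map: list of C channels, each an H x W list of floats.
    Returns (weights, attended), where weights is an H x W mask summing to 1.
    Assumption: saliency is the mean activation across channels; a trained
    network would normally learn this scoring function instead.
    """
    channels = len(feature_map)
    height = len(feature_map[0])
    width = len(feature_map[0][0])

    # Saliency score per location: mean activation across channels.
    scores = [
        [sum(feature_map[c][y][x] for c in range(channels)) / channels
         for x in range(width)]
        for y in range(height)
    ]

    # Softmax over all spatial positions (subtract the max for stability).
    peak = max(max(row) for row in scores)
    exps = [[math.exp(s - peak) for s in row] for row in scores]
    total = sum(sum(row) for row in exps)
    weights = [[e / total for e in row] for row in exps]

    # Residual reweighting: out = f * (1 + w). Salient locations are
    # amplified, and no location is suppressed to zero.
    attended = [
        [[feature_map[c][y][x] * (1.0 + weights[y][x]) for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]
    return weights, attended


# Toy 2-channel, 2 x 2 feature map; the bottom-right location is strongest.
toy = [
    [[0.0, 1.0], [2.0, 3.0]],
    [[0.0, 1.0], [2.0, 5.0]],
]
mask, enhanced = spatial_attention(toy)
```

The residual form is used here so that the original features pass through unchanged where attention is low, which is one way to emphasize key information without discarding the rest.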

Key words: Text detection, Deep learning, Attention mechanism

CLC Number: 

  • TP391
