Computer Science ›› 2022, Vol. 49 ›› Issue (2): 248-255. doi: 10.11896/jsjkx.201100072

• Artificial Intelligence •

Scene Text Detection Algorithm Based on Enhanced Feature Pyramid Network

SHAO Hai-lin1, JI Yi1, LIU Chun-ping1, XU Yun-long2   

  1 School of Computer Science and Technology, Soochow University, Suzhou, Jiangsu 215006, China
  2 Applied Technology College of Soochow University, Suzhou, Jiangsu 215300, China
  • Received: 2020-11-09 Revised: 2021-05-02 Online: 2022-02-15 Published: 2022-02-23
  • About author: SHAO Hai-lin, born in 1995, postgraduate, is a member of China Computer Federation. Her main research interests include scene text detection.
    LIU Chun-ping, born in 1971, Ph.D, professor, Ph.D supervisor. Her main research interests include computer vision and image analysis and recognition, in particular visual saliency detection and object and scene understanding.
  • Supported by:
    National Natural Science Foundation of China (61972059, 61773272, 61602332), Natural Science Foundation of Jiangsu Higher Education Institutions of China (19KJA230001), Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University (93K172016K08), and Priority Academic Program Development of Jiangsu Higher Education Institutions.

Abstract: Scene text detection helps machines understand image content and is widely used in fields such as intelligent transportation, scene understanding, and intelligent navigation. Existing scene text detection algorithms do not make full use of high-level semantic information and spatial information, which limits both their ability to classify pixels against complex backgrounds and their ability to detect and locate text instances of different scales. To address these problems, a scene text detection algorithm based on an enhanced feature pyramid network is proposed. The algorithm includes a RIFE (ratio invariant feature enhanced) module and an RSR (rebuild spatial resolution) module. Acting as a residual branch, the RIFE module strengthens the transmission of high-level semantic information through the network, which improves classification ability and reduces the false positive and false negative rates. The RSR module reconstructs the resolution of multi-layer features and uses the resulting rich spatial information to improve boundary location. Experimental results show that the proposed algorithm improves detection performance on the multi-oriented text dataset ICDAR2015, the curved text dataset Total-Text, and the long text dataset MSRA-TD500.
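
The abstract gives no implementation details, so the following is only a rough illustration of the two ideas it names: a residual branch that adds extra semantic information at the top of a feature pyramid, and a step that brings all pyramid levels back to one spatial resolution. It is not the authors' RIFE or RSR module. The function names, the single-channel list-of-lists feature maps, the nearest-neighbour upsampling, and the factor-of-two spacing between levels are all assumptions made for this sketch.

```python
def upsample_nearest(fmap, factor):
    """Nearest-neighbour upsampling of a 2-D map (list of rows) by an integer factor."""
    return [[v for v in row for _ in range(factor)]
            for row in fmap for _ in range(factor)]


def add_maps(a, b):
    """Element-wise sum of two 2-D maps of identical size."""
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def top_down_fusion(levels, residual=None):
    """Generic FPN-style top-down pass.

    levels: 2-D maps ordered fine -> coarse; each level is assumed to be
            half the height and width of the previous one.
    residual: optional map with the size of the coarsest level, added to it
              before the top-down pass (a stand-in for a residual branch
              carrying high-level semantic information; hypothetical).
    """
    top = levels[-1]
    if residual is not None:
        top = add_maps(top, residual)
    fused = [top]
    for lateral in reversed(levels[:-1]):
        # Merge each finer level with the upsampled result from the level above.
        fused.insert(0, add_maps(lateral, upsample_nearest(fused[0], 2)))
    return fused


def rebuild_resolution(fused):
    """Upsample every level to the finest resolution and stack values per pixel."""
    height = len(fused[0])
    ups = [upsample_nearest(m, height // len(m)) for m in fused]
    width = len(ups[0][0])
    return [[tuple(m[i][j] for m in ups) for j in range(width)]
            for i in range(height)]
```

With three levels of sizes 4x4, 2x2, and 1x1, `top_down_fusion` returns three fused maps of the same sizes, and `rebuild_resolution` returns a 4x4 grid in which every pixel holds one value per pyramid level.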

Key words: Boundary location, Feature pyramid network, Scene text detection, Semantic information, Spatial information

CLC Number: 

  • TP391