Scene Text Detection with Improved Region Proposal Network

doi:10.11896/jsjkx.211000191

Abstract

Scene text images have very complex and changeable features. Using a region proposal network (RPN) to extract candidate boxes for text rectangles is an indispensable step that can greatly improve the accuracy of text detection. However, recent studies show that regressing the center point, width, and height of the rectangular candidate boxes by minimizing the smooth L₁ loss function easily leads to problems such as missing boundary information and inaccurate regression. This paper therefore proposes a scene text detection model based on an improved region proposal network. First, a backbone composed of a residual network and a feature pyramid network generates a shared feature map. Then, an improved regression method and a vertex-based loss function (Vertex-IOU) are used to generate a series of rectangular text candidate boxes on the shared feature map. Finally, ROI Align converts these candidate boxes into fixed-size feature maps for bounding box regression in the fully connected layer. Comparative experiments on the ICDAR2015 dataset show improved detection results over other models, which demonstrates the effectiveness of the proposed model.
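The motivation in the abstract can be illustrated with a small sketch. It is not the paper's Vertex-IOU loss, whose definition the abstract does not give. It contrasts a smooth L₁ loss on (center x, center y, width, height) with a plain IoU loss computed from box vertices. All function names are hypothetical, and the smooth L₁ loss is applied to raw coordinates rather than to anchor-normalized offsets, which is a simplifying assumption.

```python
# Illustrative only: plain IoU loss on vertices, NOT the paper's Vertex-IOU.

def smooth_l1(x, beta=1.0):
    """Smooth L1 of a scalar residual: quadratic near zero, linear elsewhere."""
    ax = abs(x)
    return 0.5 * ax * ax / beta if ax < beta else ax - 0.5 * beta

def smooth_l1_box_loss(pred, target):
    """Sum of smooth L1 over (cx, cy, w, h); raw coordinates (assumption)."""
    return sum(smooth_l1(p - t) for p, t in zip(pred, target))

def center_to_vertices(cx, cy, w, h):
    """Convert (cx, cy, w, h) to top-left and bottom-right vertices."""
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)

def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def iou_loss(pred, target):
    """1 - IoU, computed on the vertices of both boxes."""
    return 1.0 - iou(center_to_vertices(*pred), center_to_vertices(*target))

# A wide, short text line: 100 px wide, 20 px tall.
target = (50.0, 20.0, 100.0, 20.0)
shift_x = (54.0, 20.0, 100.0, 20.0)  # 4 px error along the long side
shift_y = (50.0, 24.0, 100.0, 20.0)  # 4 px error along the short side

l1_x = smooth_l1_box_loss(shift_x, target)
l1_y = smooth_l1_box_loss(shift_y, target)
iou_x = iou_loss(shift_x, target)
iou_y = iou_loss(shift_y, target)
print(l1_x, l1_y)    # identical smooth L1 losses
print(iou_x, iou_y)  # the vertical shift overlaps far less
```

Both predictions receive the same smooth L₁ loss, yet the vertical shift loses a third of the overlap on this elongated box. This is the kind of mismatch between coordinate-wise regression and box overlap that an overlap-based loss is meant to address.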

Key words: Deep learning, Scene text detection, Region proposal network, Regression method, Loss function

CLC Number: TP391

LI Junlin, OUYANG Zhi, DU Nisuo. Scene Text Detection with Improved Region Proposal Network [J]. Computer Science, 2023, 50(2): 201-208.
