Natural Scene Text Detection Algorithm Combining Multi-granularity Feature Fusion

doi:10.11896/jsjkx.201000154

Abstract

Abstract: Text in natural scenes is diverse and complex. Traditional natural scene text detection methods rely on manually designed features and therefore lack robustness, while existing deep-learning-based text detection methods lose important feature information as features are extracted layer by layer through the network. This paper proposes a natural scene text detection method that combines multi-granularity feature fusion. Its main contribution is to fuse features of different granularities from a general feature extraction network and to add a residual channel attention mechanism, so that the model, while fully learning the feature information at different granularities in the image, attends more closely to target feature information and suppresses useless information, which improves the robustness and accuracy of the model. Experimental results show that, compared with other recent methods, the model achieves 85.3% accuracy and an 82.53% F-value on public datasets and performs better overall.
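The two ideas named in the abstract, fusing features of different granularities and gating channels with a residual attention path, can be illustrated with a minimal pure-Python sketch. It is not the authors' network: the abstract gives no layer sizes or wiring, so the function names, the nearest-neighbour upsampling, the squeeze-and-excitation style gate, and the toy weights below are all assumptions chosen only to show the mechanism.

```python
import math


def global_avg_pool(fmap):
    """fmap is a list of channels, each a 2D list; returns one mean per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def channel_attention(fmap, w_down, w_up):
    """Hypothetical gate: pool -> linear -> ReLU -> linear -> sigmoid, one weight per channel."""
    pooled = global_avg_pool(fmap)
    hidden = [max(0.0, sum(w * v for w, v in zip(row, pooled))) for row in w_down]
    return [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
            for row in w_up]


def residual_channel_attention(fmap, w_down, w_up):
    """Scale each channel by its gate, then add the input back (residual path)."""
    gates = channel_attention(fmap, w_down, w_up)
    return [[[v + g * v for v in row] for row in ch] for ch, g in zip(fmap, gates)]


def upsample_nearest(channel, factor):
    """Nearest-neighbour upsampling of one 2D channel by an integer factor."""
    out = []
    for row in channel:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def fuse_granularities(fine, coarse, factor):
    """Concatenate fine-grained channels with upsampled coarse-grained channels."""
    return fine + [upsample_nearest(ch, factor) for ch in coarse]


# Toy input: one fine 2x2 channel and one coarse 1x1 channel (assumed shapes).
fine = [[[1.0, 2.0], [3.0, 4.0]]]
coarse = [[[5.0]]]
fused = fuse_granularities(fine, coarse, 2)

# Zero weights make every gate sigmoid(0); real weights would be learned.
w_down = [[0.0, 0.0]]
w_up = [[0.0], [0.0]]
refined = residual_channel_attention(fused, w_down, w_up)
```

In a trained model the gate weights would differ per channel, so informative channels are amplified and uninformative ones are damped, while the residual path keeps the original signal available to later layers.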

Key words: Convolutional neural network, Feature extraction, Multi-granularity information, Residual attention

CLC Number: TP391

CHEN Zhuo, WANG Guo-yin, LIU Qun. Natural Scene Text Detection Algorithm Combining Multi-granularity Feature Fusion[J].Computer Science, 2021, 48(12): 243-248.


