Computer Science, 2022, Vol. 49, Issue (2): 248-255. doi: 10.11896/jsjkx.201100072
邵海琳1, 季怡1, 刘纯平1, 徐云龙2
SHAO Hai-lin1, JI Yi1, LIU Chun-ping1, XU Yun-long2
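Abstract: Scene text detection helps machines understand image content and is widely used in intelligent transportation, scene understanding, and intelligent navigation. Existing scene text detection algorithms do not make full use of high-level semantic information and spatial information, which limits a model's ability to classify pixels in complex backgrounds and to detect and localize text instances of different scales. To address these problems, a scene text detection algorithm based on an enhanced feature pyramid network is proposed. The algorithm comprises a Ratio Invariant Feature Enhanced (RIFE) module and a Rebuild Spatial Resolution (RSR) module. The RIFE module acts as a residual branch that strengthens the propagation of high-level semantic information through the network, which improves classification ability and reduces the false alarm rate and the missed detection rate. The RSR module rebuilds the resolution of multi-level features and uses the rich spatial information to refine boundary localization. Experimental results show that the proposed algorithm improves detection performance on the multi-oriented text dataset ICDAR2015, the curved text dataset Total-Text, and the long text dataset MSRA-TD500.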
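The abstract does not give the internals of the RIFE or RSR modules, so the following is only a minimal sketch of the general idea it describes: a feature pyramid whose top level receives an extra residual signal before the usual top-down fusion, followed by bringing every level back to the finest resolution. The function names (`global_context`, `enhanced_pyramid`, `rebuild_resolution`), the use of a global mean as the residual signal, and nearest-neighbour upsampling are all assumptions made for illustration, not the authors' design.

```python
def upsample2x(fm):
    """Nearest-neighbour 2x upsampling of a 2D list-of-lists feature map."""
    out = []
    for row in fm:
        wide = [v for v in row for _ in range(2)]  # duplicate each column
        out.append(wide)
        out.append(list(wide))                     # duplicate each row
    return out

def add(a, b):
    """Element-wise sum of two equally sized feature maps."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def global_context(fm):
    """Hypothetical residual branch: add the map's global mean back onto it."""
    mean = sum(map(sum, fm)) / (len(fm) * len(fm[0]))
    return [[v + mean for v in row] for row in fm]

def enhanced_pyramid(levels):
    """levels: backbone maps ordered fine -> coarse, each half the previous size."""
    top = global_context(levels[-1])       # residual enhancement of the top level
    pyramid = [top]
    for lateral in reversed(levels[:-1]):  # standard FPN top-down pathway
        pyramid.insert(0, add(lateral, upsample2x(pyramid[0])))
    return pyramid

def rebuild_resolution(pyramid):
    """Upsample every level to the finest level's size (assumed stand-in for RSR)."""
    rebuilt = []
    for i, fm in enumerate(pyramid):
        for _ in range(i):                 # level i is 2**i times coarser
            fm = upsample2x(fm)
        rebuilt.append(fm)
    return rebuilt

# Toy single-channel backbone outputs at 4x4, 2x2 and 1x1 resolution.
c2 = [[1.0] * 4 for _ in range(4)]
c3 = [[2.0] * 2 for _ in range(2)]
c4 = [[4.0]]
pyramid = enhanced_pyramid([c2, c3, c4])
rebuilt = rebuild_resolution(pyramid)
```

In this toy setup the enhanced top-level value flows down into every finer level through the top-down additions, and `rebuilt` holds three maps of identical size that a detection head could merge. A real implementation would use multi-channel tensors and learned convolutions in place of these plain sums.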