Computer Science, 2024, Vol. 51, Issue (6): 256-263. doi: 10.11896/jsjkx.230500230
GAO Nan, ZHANG Lei, LIANG Ronghua, CHEN Peng, FU Zheng
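Abstract: To address the missed and false detections caused by complex backgrounds and highly variable text scales in natural scene images, this paper proposes a scene text detection algorithm based on feature enhancement. In the feature pyramid fusion stage, a Dual-domain Attention Feature Fusion Module (D2AAFM) is proposed. This module better fuses feature maps of different semantics and scales, improving the representation of text information. In addition, because the deep feature maps of the network lose semantic information during upsampling and fusion, a Multi-scale Spatial Perception Module (MSPM) is proposed. It enlarges the receptive field to capture broader contextual information and strengthens the text semantic features of the deep feature maps, thereby effectively reducing missed and false detections. To evaluate the proposed algorithm, experiments were conducted on the public datasets ICDAR2015, CTW1500, and MSRA-TD500, where the method achieved F-measures of 82.8%, 83.4%, and 85.3%, respectively. The experimental results show that the algorithm detects text well across different datasets.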
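The F-measure reported above is conventionally the harmonic mean of precision and recall. The following minimal Python sketch illustrates that definition only. The function name `f_measure` and the counts in the usage line are hypothetical and are not taken from the paper, and the matching rule that decides which detections count as true positives (for example, an IoU threshold) is dataset-specific and assumed to have been applied already.

```python
from typing import Tuple


def f_measure(true_positives: int,
              num_detections: int,
              num_ground_truths: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F-measure) for one detection run.

    true_positives: detections matched to a ground-truth text instance.
    num_detections: all boxes or polygons output by the detector.
    num_ground_truths: all annotated text instances.
    """
    # Precision: fraction of detections that are correct (penalizes false detections).
    precision = true_positives / num_detections if num_detections else 0.0
    # Recall: fraction of ground-truth instances that were found (penalizes missed detections).
    recall = true_positives / num_ground_truths if num_ground_truths else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-measure: harmonic mean of precision and recall.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Illustrative counts only (hypothetical, not results from the paper).
precision, recall, f = f_measure(true_positives=3, num_detections=4, num_ground_truths=6)
print(f"precision={precision:.3f} recall={recall:.3f} F={f:.3f}")
```

Because the harmonic mean is pulled toward the smaller of the two values, a detector improves its F-measure only by reducing missed and false detections together, which is why it serves as the combined metric here.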