Computer Science (计算机科学), 2025, Vol. 52, Issue 6: 179-186. doi: 10.11896/jsjkx.240500064
WANG Teng1, XIAN Yunting1, XU Hao1, XIE Songqi1, ZOU Quanyi2
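Abstract: Ship identification is of great significance and is widely applied in the supervision of waterborne targets. The ship name is an important component of ship identification; recognizing it accurately can compensate for the shortcomings of traditional AIS-based identification and improve identification accuracy. Compared with conventional Chinese text recognition, ship-name recognition faces a complex waterborne environment, large illumination changes, heavily corroded hulls, and non-standard ship-name fonts. As a result, ship-name images suffer from low clarity, incomplete characters, and inconsistent font styles, which makes recognition difficult and lowers its accuracy. This paper designs a lightweight recognition network based on a multi-level nested Transformer to address these problems. First, a spatial transformer network processes the input image to correct tilted ship names; next, the nested Transformer extracts multi-granularity features from the image; finally, characters and radicals are recognized at different scales. Experimental results show that the proposed algorithm performs well on ship-name recognition compared with other text recognition methods: its accuracy reaches 92.68% on the CSLD dataset, 94.50% on the SCSLD dataset, and 66.34% on the DCSLD dataset. The method also has a low parameter count and a high frame rate.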
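The tilt-correction step can be illustrated with a minimal sketch of the resampling operation at the core of a spatial transformer network. The abstract does not specify the transformation family or the sampler, so this sketch assumes a 2x3 affine matrix in normalized [-1, 1] coordinates with bilinear interpolation, as in the generic spatial transformer formulation. The function name `affine_sample` and the hand-written `theta` are hypothetical; in a trained network, `theta` would be predicted by a localization sub-network rather than supplied by hand.

```python
import math
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel values


def affine_sample(img: Image, theta: List[List[float]]) -> Image:
    """Resample img through a 2x3 affine matrix (assumed, not from the paper).

    Each output pixel is mapped to a source location in normalized
    [-1, 1] coordinates and read with bilinear interpolation; locations
    outside the image contribute zero.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            # Output pixel -> normalized target coordinates.
            xt = 2.0 * j / (w - 1) - 1.0 if w > 1 else 0.0
            yt = 2.0 * i / (h - 1) - 1.0 if h > 1 else 0.0
            # Affine map from target to source coordinates.
            xs = theta[0][0] * xt + theta[0][1] * yt + theta[0][2]
            ys = theta[1][0] * xt + theta[1][1] * yt + theta[1][2]
            # Normalized source coordinates -> pixel coordinates.
            px = (xs + 1.0) * (w - 1) / 2.0
            py = (ys + 1.0) * (h - 1) / 2.0
            x0, y0 = math.floor(px), math.floor(py)
            ax, ay = px - x0, py - y0
            # Bilinear blend of the four neighbouring source pixels.
            val = 0.0
            for yy, wy in ((y0, 1.0 - ay), (y0 + 1, ay)):
                for xx, wx in ((x0, 1.0 - ax), (x0 + 1, ax)):
                    if 0 <= yy < h and 0 <= xx < w:
                        val += wy * wx * img[yy][xx]
            out[i][j] = val
    return out


# Hypothetical shear that slides each row horizontally by an amount
# proportional to its height, the kind of map that could undo a slant.
shear = [[1.0, 0.3, 0.0],
         [0.0, 1.0, 0.0]]
```

With the identity matrix `[[1, 0, 0], [0, 1, 0]]` the image is returned unchanged up to floating-point rounding, which makes a convenient sanity check before trying a shear or rotation.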