Computer Science ›› 2025, Vol. 52 ›› Issue (6): 179-186. doi: 10.11896/jsjkx.240500064

• Computer Graphics & Multimedia •

Ship License Plate Recognition Network Based on Pyramid Transformer in Transformer

WANG Teng1, XIAN Yunting1, XU Hao1, XIE Songqi1, ZOU Quanyi2   

  1 School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China
  2 School of Journalism and Communication, South China University of Technology, Guangzhou 510006, China
  • Received:2024-05-16 Revised:2024-09-05 Online:2025-06-15 Published:2025-06-11
  • About author: WANG Teng, born in 2000, postgraduate. His main research interests include image processing and deep learning.
    XIAN Yunting, born in 1982, Ph.D, lab master. His main research interests include artificial intelligence and image processing.
  • Supported by:
    Guangdong Philosophy and Social Science Foundation Regular Project(GD24YXW02) and Youth Innovative Talent Projects of Guangdong Universities(2023KQNC005).

Abstract: Ship identification is widely used in the supervision of waterborne targets. Accurate recognition of ship names, an important component of ship identification, can compensate for the shortcomings of traditional AIS-based identification methods and improve the accuracy of ship identification. Compared with conventional Chinese text recognition, ship name recognition is harder: complex water environments, large changes in lighting, severe hull corrosion, and non-standardized ship names produce images with low clarity, incomplete characters, and inconsistent font styles, which lowers recognition accuracy. To address these problems, this paper proposes a lightweight recognition network based on Pyramid Transformer in Transformer. First, a spatial transformer network processes the input image to correct the tilt of the ship name. Then, the Transformer in Transformer module efficiently extracts multi-granularity features from the image. Finally, characters and radicals are recognized at different scales. Experimental results show that the proposed algorithm performs well on ship name recognition compared with other text recognition methods, reaching an accuracy of 92.68% on the CSLD dataset, 94.50% on the SCSLD dataset, and 66.34% on the DCSLD dataset. The method also has a small number of parameters and a high frame rate.
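
The "multi-granularity" idea behind Transformer in Transformer can be illustrated with a small sketch: the image is divided into outer patches, and each outer patch is divided again into inner sub-patches, so that features can be computed at both levels. The code below shows only this two-level partition in plain Python. It is an illustrative assumption, not the network proposed in the paper; the function names, the 8 x 16 image, and the patch sizes (8 and 4) are hypothetical.

```python
from typing import List

Image = List[List[int]]  # grayscale image stored as rows of pixel values


def split_into_patches(image: Image, patch: int) -> List[Image]:
    """Split an H x W image into non-overlapping patch x patch tiles (row-major)."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    tiles = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            tiles.append([row[left:left + patch] for row in image[top:top + patch]])
    return tiles


def two_level_tokens(image: Image, outer: int, inner: int) -> List[List[List[int]]]:
    """For each outer patch, return the list of its flattened inner sub-patches."""
    tokens = []
    for tile in split_into_patches(image, outer):
        sub_tiles = split_into_patches(tile, inner)
        tokens.append([[p for row in sub for p in row] for sub in sub_tiles])
    return tokens


# Hypothetical 8 x 16 image whose pixel value encodes its position (row * 16 + col).
image = [[r * 16 + c for c in range(16)] for r in range(8)]
tokens = two_level_tokens(image, outer=8, inner=4)

# Number of outer patches, sub-patches per outer patch, pixels per sub-patch.
print(len(tokens), len(tokens[0]), len(tokens[0][0]))  # prints "2 4 16"
```

In a real model each flattened sub-patch would be projected to an embedding and processed by attention at the inner level, with the results aggregated into the outer-patch embedding; those steps are omitted here.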

Key words: Chinese text recognition, Ship license plate recognition, Deep learning, Scene text recognition, Transformer

CLC Number: 

  • TP183