Computer Science (计算机科学) ›› 2024, Vol. 51 ›› Issue (11A): 240100031-9. doi: 10.11896/jsjkx.240100031

• Image Processing & Multimedia Technology •

Lightweight and Efficient Recognition Method for Chinese Character Click-based CAPTCHA

JIN Xinhao, CHI Kaikai

  1. School of Computer Science and Technology, Zhejiang University of Technology, Hangzhou 310013, China
  • Online: 2024-11-16 Published: 2024-11-13
  • Corresponding author: CHI Kaikai (kkchi@zjut.edu.cn)
  • About author: JIN Xinhao (xinhaojin@qq.com), born in 1998, postgraduate. His main research interest is robotic process automation.
    CHI Kaikai, born in 1980, Ph.D, professor, Ph.D supervisor, is a member of CCF (No.72583S). His main research interests include wireless networks and machine learning.
  • Supported by:
    General Program of the National Natural Science Foundation of China (62272414).

Abstract: With the advance of digitalization, enterprises increasingly rely on robotic process automation (RPA) to reduce costs and improve efficiency, and thus stay competitive. However, some process steps require recognizing Chinese character click-based CAPTCHAs, which limits further gains in automation. Existing approaches suffer from datasets that are difficult to create, poor model generalization, and an imbalance between model complexity and performance. To address these issues, this paper proposes a lightweight recognition method for Chinese character click-based CAPTCHAs that has a low dataset creation cost and good generalization. Specifically, a YOLOv8-n model with targeted improvements is first used to make the Chinese character detection model significantly lighter. The detected character images are then preprocessed by segmentation and rectification. Next, the PaddleOCR model, which generalizes well, is used for Chinese character recognition, reducing the cost of migrating to new scenarios, and the best matching result is obtained from the recognition probability matrix, which further improves accuracy. In addition, a semi-automatic process for constructing Chinese character detection datasets is designed, and the dataset is made publicly available. This research aims to advance automatic recognition of Chinese character click-based CAPTCHAs and to raise the level of enterprise process automation.

Key words: Process automation, Verification code recognition, YOLOv8, PaddleOCR, Lightweight
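
The abstract says the best matching result is obtained from the recognition probability matrix but does not give the procedure. The following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that `prob[i][j]` holds the recognizer's probability that detected character crop `j` shows prompted target character `i`, and that the best match is the one-to-one assignment with the highest product of probabilities. The function name `best_match` and the example values are hypothetical.

```python
from itertools import permutations


def best_match(prob, targets):
    """Assign each target character to a distinct detected crop.

    prob[i][j] is ASSUMED to be the probability that crop j shows targets[i].
    Returns a dict mapping each target to the index of its chosen crop.
    """
    n_targets, n_crops = len(prob), len(prob[0])
    if n_crops < n_targets:
        raise ValueError("fewer detected crops than target characters")

    best_score, best_assign = -1.0, None
    # A CAPTCHA has only a handful of characters, so brute force is cheap.
    for crops in permutations(range(n_crops), n_targets):
        score = 1.0
        for i, j in enumerate(crops):
            score *= prob[i][j]
        if score > best_score:
            best_score, best_assign = score, crops
    return dict(zip(targets, best_assign))


# Hypothetical example: 3 prompted characters, 4 detected crops.
targets = ["木", "林", "森"]
prob = [
    [0.50, 0.45, 0.05, 0.00],  # 木
    [0.60, 0.30, 0.10, 0.00],  # 林
    [0.05, 0.10, 0.80, 0.05],  # 森
]
print(best_match(prob, targets))
```

In this example a per-row argmax would send both 木 and 林 to crop 0. Scoring whole assignments resolves the conflict, which is one way a probability matrix can improve accuracy over taking each recognition result independently. Each chosen crop index would then map back to its detected bounding box to produce the click positions in prompt order.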

CLC Number: TP391