Computer Science ›› 2020, Vol. 47 ›› Issue (10): 187-193. doi: 10.11896/jsjkx.191000035

• Computer Graphics & Multimedia •

Digital Instrument Identification Method Based on Deformable Convolutional Neural Network

GUO Lan-ying, HAN Rui-zhi, CHENG Xin

  1. School of Information Engineering, Chang’an University, Xi’an 710064, China
  • Received: 2019-10-09 Revised: 2020-03-12 Online: 2020-10-15 Published: 2020-10-16
  • Corresponding author: HAN Rui-zhi (ruizhih@qq.com)
  • About author: GUO Lan-ying (lyguo@chd.edu.cn), born in 1963, professor. Her main research interests include intelligent transportation systems.
    HAN Rui-zhi, born in 1995, postgraduate, is a member of China Computer Federation. His main research interests include deep learning and computer vision.
  • Supported by:
    Shaanxi Provincial Key Research and Development Program (2019NY-163), Shaanxi Provincial Transportation Science and Technology Project (14-23K), and Fundamental Research Funds for the Central Universities (300102329101, 310824175004)

Abstract: Digital display instruments are currently read mostly with traditional image processing and machine learning methods. In complex, changing scenes these methods recognize characters and digits with low accuracy and struggle to meet real-time requirements. To address these problems, this paper combines traditional image processing with deep learning and proposes a method for segmenting and recognizing digital instrument readings based on a deformable convolutional neural network. The method consists of image preprocessing, character segmentation, and character recognition. First, the GrayWorld algorithm equalizes the brightness of the input image, and color segmentation extracts the screen area. Second, morphological operations are applied to the image so that the projection histogram method can segment each character together with its decimal point as a single unit. Finally, a deformable convolutional neural network is designed and trained for character recognition, which mitigates the inherent limitation that the receptive field of a convolutional neural network has a fixed geometric structure. Experimental results show that adding deformable convolution effectively improves both the recognition accuracy and the convergence speed of the network. The overall recognition accuracy of the method reaches 99.45% at a detection speed of 10 FPS, which meets the requirements of practical applications.

Key words: Character recognition, Deformable convolutional neural network, Image processing, Projection histogram
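
The preprocessing and segmentation steps named in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The function names, the nested-list image layout, and in particular the `dot_max_width` rule for attaching a narrow run to the character on its left are assumptions made for illustration; the paper itself relies on morphological operations to keep a digit and its decimal point together.

```python
def gray_world(image):
    """Scale each RGB channel so that its mean matches the overall gray mean.

    `image` is a list of rows; each row is a list of (r, g, b) tuples in 0..255.
    """
    pixels = [px for row in image for px in row]
    n = len(pixels)
    means = [sum(px[c] for px in pixels) / n for c in range(3)]
    gray = sum(means) / 3.0
    # Guard against an all-zero channel to avoid division by zero.
    gains = [gray / m if m > 0 else 1.0 for m in means]
    return [
        [tuple(min(255, round(px[c] * gains[c])) for c in range(3)) for px in row]
        for row in image
    ]


def column_projection(binary):
    """Count foreground (1) pixels in every column of a binary image."""
    return [sum(col) for col in zip(*binary)]


def segment_columns(binary, dot_max_width=1):
    """Split a binary image into character spans using the column projection.

    Returns (start, end) column ranges with `end` exclusive. A run no wider
    than `dot_max_width` (a hypothetical threshold) is treated as a decimal
    point and merged into the span on its left.
    """
    profile = column_projection(binary)
    runs, start = [], None
    for x, count in enumerate(profile + [0]):  # sentinel closes a trailing run
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            runs.append((start, x))
            start = None
    merged = []
    for s, e in runs:
        if merged and (e - s) <= dot_max_width:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))
    return merged
```

For example, `segment_columns([[1, 1, 0, 0, 0, 1, 1], [1, 1, 0, 0, 0, 1, 1], [1, 1, 0, 1, 0, 1, 1]])` yields two spans, the first of which covers the left character and the single-column dot next to it. A practical pipeline would more likely use NumPy or OpenCV arrays than nested lists.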

CLC Number: 

  • TP391.4
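
As background for the recognition step, the sketch below shows how a single output of a 3x3 deformable convolution can be computed: each kernel tap samples the feature map at its regular grid position plus a fractional offset, and bilinear interpolation supplies the value at that position. This is an illustrative pure-Python sketch of the general operation, not the network described in this paper. The function names and the nested-list layout are assumptions, and the offsets are passed in directly, whereas in a trained network they would be predicted by an additional convolutional layer.

```python
import math


def bilinear_sample(feature, y, x):
    """Bilinearly interpolate a 2D feature map at fractional (y, x).

    Positions outside the map contribute zero.
    """
    h, w = len(feature), len(feature[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                value += weight * feature[yy][xx]
    return value


def deformable_conv_at(feature, kernel, offsets, cy, cx):
    """Compute one output of a 3x3 deformable convolution centred at (cy, cx).

    `kernel[i][j]` is the weight of tap (i, j) and `offsets[i][j]` is the
    (dy, dx) shift added to that tap's regular sampling position.
    """
    out = 0.0
    for i in range(3):
        for j in range(3):
            dy, dx = offsets[i][j]
            out += kernel[i][j] * bilinear_sample(
                feature, cy + i - 1 + dy, cx + j - 1 + dx
            )
    return out
```

With all offsets set to `(0, 0)` the function reduces to an ordinary 3x3 convolution at that location, which shows that the deformable variant only changes where the kernel samples, not how the samples are weighted.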
[1] 郭拯危, 付泽文, 李宁, 白澜.
Study on Acceleration Algorithm for Raw Data Simulation of High Resolution Squint Spotlight SAR
Computer Science, 2022, 49(8): 178-183. https://doi.org/10.11896/jsjkx.210600066
[2] 刘伟业, 鲁慧民, 李玉鹏, 马宁.
Survey on Finger Vein Recognition Research
Computer Science, 2022, 49(6A): 1-11. https://doi.org/10.11896/jsjkx.210400056
[3] 来腾飞, 周海洋, 余飞鸿.
Real-time Extend Depth of Field Algorithm for Video Processing
Computer Science, 2022, 49(6A): 314-318. https://doi.org/10.11896/jsjkx.201100187
[4] 詹瑞, 雷印杰, 陈训敏, 叶书函.
Street Scene Change Detection Based on Multiple Difference Features Network
Computer Science, 2021, 48(2): 142-147. https://doi.org/10.11896/jsjkx.200500158
[5] 张育龙, 王强, 陈明康, 孙静涛.
Survey of Intelligent Rain Removal Algorithms for Cloud-IoT Systems
Computer Science, 2021, 48(12): 231-242. https://doi.org/10.11896/jsjkx.201000055
[6] 寇喜超, 张鸿锐, 冯杰, 郑雅羽.
Distortion Correction Algorithm for Complex Document Image Based on Multi-level Text Detection
Computer Science, 2021, 48(12): 249-255. https://doi.org/10.11896/jsjkx.200700072
[7] 姚楠, 张征.
Scar Area Calculation Based on 3D Image
Computer Science, 2021, 48(11A): 308-313. https://doi.org/10.11896/jsjkx.201100044
[8] 冯一凡, 赵雪青, 师昕, 杨坤.
Light Superposition-based Color Constancy Computational Method
Computer Science, 2021, 48(11A): 386-390. https://doi.org/10.11896/jsjkx.210200053
[9] 宋一言, 唐东林, 吴续龙, 周立, 秦北轩.
Study on Digital Tube Image Reading Combining Improved Threading Method with HOG+SVM Method
Computer Science, 2021, 48(11A): 396-399. https://doi.org/10.11896/jsjkx.210100123
[10] 谢海平, 李高源, 杨海涛, 赵洪利.
Classification Research of Remote Sensing Image Based on Super Resolution Reconstruction
Computer Science, 2021, 48(11A): 424-428. https://doi.org/10.11896/jsjkx.210300132
[11] 宋娅菲, 谌雨章, 沈君凤, 曾张帆.
Underwater Image Reconstruction Based on Improved Residual Network
Computer Science, 2020, 47(6A): 500-504. https://doi.org/10.11896/JsJkx.200100084
[12] 蔡玉鑫, 汤志伟, 赵博, 杨明, 吴禹非.
Accelerated Software System Based on Embedded Multicore DSP
Computer Science, 2020, 47(6A): 622-625. https://doi.org/10.11896/JsJkx.190400079
[13] 马虹.
Fusion Localization Algorithm of Visual Aided BDS Mobile Robot Based on 5G
Computer Science, 2020, 47(6A): 631-633. https://doi.org/10.11896/JsJkx.190400156
[14] 苗益, 赵增顺, 杨雨露, 徐宁, 杨皓然, 孙骞.
Survey of Image Captioning Methods
Computer Science, 2020, 47(12): 149-160. https://doi.org/10.11896/jsjkx.200500039
[15] 凌晨, 张鑫彤, 马雷.
Remote Sensing Image Processing Technology and Its Application Based on Mask R-CNN Algorithms
Computer Science, 2020, 47(10): 151-160. https://doi.org/10.11896/jsjkx.190900119