Computer Science (计算机科学), 2022, Vol. 49, Issue 11A: 211100009-5. doi: 10.11896/jsjkx.211100009
李波燕1, 张勇2, 袁德荣2, 熊堂堂1, 何浪2
LI Bo-yan1, ZHANG Yong2, YUAN De-rong2, XIONG Tang-tang1, HE Lang2
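Abstract: As an important branch of pattern recognition, handwritten digit recognition is attracting unprecedented attention, and convolutional neural networks (CNNs) are widely used in related research. Training for handwritten digit recognition is prone to exploding and vanishing gradients, which lower image recognition accuracy. To address this problem, a model that embeds the CBAM (Convolutional Block Attention Module) attention module is proposed for handwritten digit recognition. Embedding CBAM in the convolutional neural network selects effective features and suppresses irrelevant ones along the channel dimension and the spatial dimension in turn, which strengthens feature representation and improves the model's recognition accuracy. To further improve accuracy, the BN (Batch Normalization) algorithm is applied throughout the network architecture to speed up convergence and strengthen the model's resistance to overfitting. Experiments on the MNIST dataset show that the network with the embedded CBAM module reaches an overall recognition accuracy of 99.87%, a significant improvement over some traditional convolutional neural network models.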
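The channel-then-spatial filtering described in the abstract can be illustrated with a minimal sketch. It follows the general CBAM design as published by Woo et al. (2018), not the specific network of this paper, whose layer layout is not given in the abstract. The function names, the random weights, the tiny feature-map size, and the reduction ratio `r = 2` are all assumptions chosen for illustration; pure Python lists are used only to keep the example dependency-free.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def channel_attention(feat, w1, w2):
    """feat: C x H x W nested lists. w1: (C/r) x C, w2: C x (C/r) shared MLP weights.
    Returns one weight in (0, 1) per channel."""
    def mlp(v):
        hidden = [max(0.0, sum(w * x for w, x in zip(row, v))) for row in w1]  # ReLU
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]
    flat = [[x for row in ch for x in row] for ch in feat]
    avg = [sum(ch) / len(ch) for ch in flat]   # global average pooling per channel
    mx = [max(ch) for ch in flat]              # global max pooling per channel
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]

def spatial_attention(feat, kernel):
    """kernel: 2 x k x k weights applied to the stacked [avg, max] maps (zero padding).
    Returns an H x W map of weights in (0, 1)."""
    c, h, w = len(feat), len(feat[0]), len(feat[0][0])
    k = len(kernel[0])
    pad = k // 2
    # Pool across channels at every spatial position.
    avg = [[sum(feat[ch][i][j] for ch in range(c)) / c for j in range(w)] for i in range(h)]
    mx = [[max(feat[ch][i][j] for ch in range(c)) for j in range(w)] for i in range(h)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            s = 0.0
            for p, plane in enumerate((avg, mx)):
                for di in range(k):
                    for dj in range(k):
                        y, x = i + di - pad, j + dj - pad
                        if 0 <= y < h and 0 <= x < w:
                            s += kernel[p][di][dj] * plane[y][x]
            out[i][j] = sigmoid(s)
    return out

def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention on the refined map."""
    mc = channel_attention(feat, w1, w2)
    refined = [[[mc[ch] * x for x in row] for row in feat[ch]] for ch in range(len(feat))]
    ms = spatial_attention(refined, kernel)
    return [[[ms[i][j] * refined[ch][i][j] for j in range(len(ms[0]))]
             for i in range(len(ms))] for ch in range(len(refined))]

# Illustrative inputs only: random feature map and random (untrained) weights.
rng = random.Random(0)
C, H, W, r, k = 4, 5, 5, 2, 7
feat = [[[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)] for _ in range(C)]
w1 = [[rng.uniform(-0.5, 0.5) for _ in range(C)] for _ in range(C // r)]
w2 = [[rng.uniform(-0.5, 0.5) for _ in range(C // r)] for _ in range(C)]
kernel = [[[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(k)] for _ in range(2)]
out = cbam(feat, w1, w2, kernel)
print(len(out), len(out[0]), len(out[0][0]))  # prints: 4 5 5
```

Because both attention maps pass through a sigmoid, every output value is the input value scaled by a factor between 0 and 1, which is how the module down-weights less useful channels and positions while preserving the feature-map shape. In a trained network the weights would be learned jointly with the convolutional layers rather than drawn at random.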