Computer Science ›› 2024, Vol. 51 ›› Issue (6A): 230400112-7. doi: 10.11896/jsjkx.230400112

• Artificial Intelligence •


DRSTN: Deep Residual Soft Thresholding Network

CAO Yan, ZHU Zhenfeng   

  1. School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
  • Published: 2024-06-06
  • Corresponding author: ZHU Zhenfeng (iezfzhu@zzu.edu.cn)
  • About author: CAO Yan, born in 1993, postgraduate (202022172013154@gs.zzu.edu.cn). His main research interests include machine learning and transfer learning.
    ZHU Zhenfeng, born in 1980, Ph.D, associate professor, is a member of CCF (No.18743M). His main research interests include data mining, machine learning and computer vision.
  • Supported by:
    National Natural Science Foundation of China (62176239).


Abstract: When deep neural network models such as deep residual networks are used for image classification, important features lost during feature extraction can degrade classification performance. The black-box problem brought about by the “end-to-end” learning mode of neural networks also limits their application and development in many fields. In addition, neural network models often require longer training time than traditional methods. To improve the classification performance and training efficiency of deep residual networks, this paper introduces a model transfer method and a soft thresholding method, proposes the deep residual soft thresholding network (DRSTN), and fine-tunes its structure to generate different versions of the DRSTN network. The performance of the DRSTN networks benefits from the organic integration of three aspects: 1) feature extraction is visualized with the gradient-weighted class activation mapping (Grad-CAM) method, and the models to be further optimized are selected according to the visualization results; 2) based on model transfer, researchers do not need to build a model from scratch but can directly optimize existing models, which saves a large amount of training time; 3) soft thresholding is embedded into the deep residual network architecture as a nonlinear transformation layer to eliminate irrelevant features in the samples. Experimental results show that, under the same training conditions, the classification accuracy of the DRSTN_KS(3*3)_RB(2:2:2) network on the CIFAR-10 dataset is 15.5%, 8.8% and 10.9% higher than that of the SKNet-18, ResNet18 and ConvNeXt_tiny networks, respectively. The network also generalizes to a certain degree: it achieves rapid transfer on the MNIST and Fashion MNIST datasets, with classification accuracies of 99.06% and 93.15%, respectively.

Key words: Transfer learning, Residual network, Gradient-weighted class activation mapping, Soft thresholding method, Image classification
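
The abstract does not come with an implementation, so the following is a minimal PyTorch sketch of its central component: soft thresholding embedded as a nonlinear transformation layer inside a 3*3 residual block. The channel-wise learned threshold and the class names SoftThreshold and DRSTNBlock are illustrative assumptions, not the authors' released code.

# Minimal sketch of a residual block with an embedded soft-thresholding layer,
# in the spirit of the DRSTN design described in the abstract. The threshold
# sub-network and class names are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftThreshold(nn.Module):
    """Channel-wise soft thresholding: y = sign(x) * max(|x| - tau, 0).

    The threshold tau is learned per channel from the feature map itself,
    so weakly activated (presumably irrelevant) responses are shrunk to zero.
    """
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        abs_x = x.abs()
        avg = abs_x.mean(dim=(2, 3))                             # (N, C) mean magnitude
        tau = (self.fc(avg) * avg).unsqueeze(-1).unsqueeze(-1)   # (N, C, 1, 1) threshold
        return torch.sign(x) * F.relu(abs_x - tau)


class DRSTNBlock(nn.Module):
    """A 3x3 residual block with soft thresholding on the residual branch."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.soft = SoftThreshold(out_ch)
        self.shortcut = nn.Sequential()
        if stride != 1 or in_ch != out_ch:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch),
            )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.soft(self.bn2(self.conv2(out)))
        return F.relu(out + self.shortcut(x))


if __name__ == "__main__":
    block = DRSTNBlock(64, 64)
    y = block(torch.randn(2, 64, 32, 32))
    print(y.shape)  # torch.Size([2, 64, 32, 32])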
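
The abstract also relies on Grad-CAM visualizations to decide which candidate networks to optimize further. Below is a small, self-contained hook-based Grad-CAM sketch in PyTorch; the function name grad_cam, the choice of ResNet-18, and the use of its last residual stage as the target layer are assumptions for illustration, not the paper's exact setup.

# Hook-based Grad-CAM sketch: weight each feature map of the target layer by the
# global-average-pooled gradient of the class score, then sum and upsample.
import torch
import torch.nn.functional as F
from torchvision import models


def grad_cam(model, target_layer, image, class_idx=None):
    """Return a (H, W) heat map in [0, 1] for one input image of shape (1, 3, H, W)."""
    feats, grads = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: feats.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: grads.append(go[0]))

    model.eval()
    logits = model(image)
    if class_idx is None:
        class_idx = int(logits.argmax(dim=1))
    model.zero_grad()
    logits[0, class_idx].backward()
    h1.remove()
    h2.remove()

    fmap, grad = feats[0][0].detach(), grads[0][0].detach()   # (C, h, w) each
    weights = grad.mean(dim=(1, 2), keepdim=True)             # pooled gradients per channel
    cam = F.relu((weights * fmap).sum(dim=0))                 # weighted sum of feature maps
    cam = F.interpolate(cam[None, None], size=image.shape[-2:],
                        mode="bilinear", align_corners=False)[0, 0]
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)


if __name__ == "__main__":
    net = models.resnet18(weights=None)
    heat = grad_cam(net, net.layer4[-1], torch.randn(1, 3, 224, 224))
    print(heat.shape)  # torch.Size([224, 224])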

CLC Number: 

  • TP391