Computer Science ›› 2018, Vol. 45 ›› Issue (12): 177-181. doi: 10.11896/j.issn.1002-137X.2018.12.028
郑宗生, 刘兆荣, 黄冬梅, 宋巍, 邹国良, 侯倩, 郝剑波
ZHENG Zong-sheng, LIU Zhao-rong, HUANG Dong-mei, SONG Wei, ZOU Guo-liang, HOU Qian, HAO Jian-bo
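Abstract: Choosing an activation function for a deep learning model on a specific task is difficult. After analyzing the strengths and weaknesses of traditional activation functions and of those in wide use today, this paper combines the Tanh activation function with the widely used ReLU activation function to construct T-ReLU, an activation function that compensates for the shortcomings of both Tanh and ReLU. A deep learning model for typhoon grade classification, Typ-CNNs, is built, and typhoon satellite cloud images published by the Japan Meteorological Agency serve as a self-built sample dataset for comparative experiments with several activation functions. The results show that the test accuracy of typhoon grade classification with T-ReLU is 1.124% higher than with ReLU and 2.102% higher than with Tanh. To further verify the reliability of this result, the activation functions are also compared on the general-purpose MNIST dataset, where T-ReLU reaches a training accuracy of 99.855% and a test accuracy of 98.620%, outperforming the other activation functions.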
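The abstract names Tanh and ReLU as the two building blocks of T-ReLU but does not state the T-ReLU formula itself. The sketch below shows the two standard functions and one hypothetical way to combine them, purely for illustration: it assumes the identity on non-negative inputs (as in ReLU) and Tanh on negative inputs. The function name `t_relu_sketch` and this piecewise form are assumptions, not the definition given in the paper.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: passes positive inputs through, zeroes out the rest."""
    return max(0.0, x)


def tanh(x: float) -> float:
    """Standard Tanh: squashes any input into the open interval (-1, 1)."""
    return math.tanh(x)


def t_relu_sketch(x: float) -> float:
    """Hypothetical Tanh/ReLU combination (ASSUMPTION, not the paper's formula).

    Non-negative inputs are passed through unchanged, as in ReLU.
    Negative inputs are mapped through tanh, so they keep a small,
    bounded negative output instead of being clamped to zero.
    """
    return x if x >= 0.0 else math.tanh(x)


if __name__ == "__main__":
    # Compare the three functions on a few sample inputs.
    for value in (-3.0, -0.5, 0.0, 0.5, 3.0):
        print(value, relu(value), tanh(value), t_relu_sketch(value))
```

Under this assumed form, negative inputs still produce a nonzero output, while positive inputs are not squashed the way they are under Tanh. Whether the paper's T-ReLU behaves this way should be checked against the full text.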