Computer Science, 2018, Vol. 45, Issue (12): 177-181. doi: 10.11896/j.issn.1002-137X.2018.12.028

• Artificial Intelligence •

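Deep Learning Model for Typhoon Grade Classification Based on Improved Activation Function

ZHENG Zong-sheng, LIU Zhao-rong, HUANG Dong-mei, SONG Wei, ZOU Guo-liang, HOU Qian, HAO Jian-bo

  1. (College of Information Technology, Shanghai Ocean University, Shanghai 201306, China)
  • Received: 2017-11-23 Online: 2018-12-15 Published: 2019-02-25
  • About the authors: ZHENG Zong-sheng (born 1979), male, Ph.D., associate professor; main research interests: marine informatization and deep learning applications. LIU Zhao-rong (born 1992), female, master's student; main research interest: deep learning applications; E-mail: 2415932685@qq.com (corresponding author). HUANG Dong-mei (born 1964), female, Ph.D., professor; main research interest: computer application research. SONG Wei (born 1977), female, Ph.D., professor; main research interests: mobile multimedia, human-computer interaction, ocean remote sensing image analysis, and cognition and understanding of ocean big data. ZOU Guo-liang (born 1961), male, Ph.D., professor; main research interests: marine information processing and its applications. HOU Qian (born 1992), female, master's student; main research interest: deep learning. HAO Jian-bo (born 1992), male, master's student; main research interest: marine informatization; E-mail: lzr.liya2018@outlook.com.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China project "Collaborative Classification Model for Weak-Feature Multi-Source Ocean Remote Sensing Images Based on Multimodal Deep Learning" (41671431) and by the Shanghai Science and Technology Commission Capacity Building Project for Local Universities "Prediction of Nearshore Disastrous Waves Based on Spatio-Temporal Cross Analysis of Ocean Video and Its Application" (17050501900).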

Abstract: Selecting an activation function for a deep learning model on a specific task is difficult. After analyzing the advantages and disadvantages of traditional activation functions and of those in wide use today, this paper combines the Tanh activation function with the widely used ReLU function to construct T-ReLU, an activation function intended to compensate for the shortcomings of both. A deep learning model for typhoon grade classification, Typ-CNNs, is built, and typhoon satellite cloud images published by the Japan Meteorological Agency are used as a self-built sample dataset for comparison experiments with several activation functions. The results show that the test accuracy of typhoon grade classification with T-ReLU is 1.124% higher than with ReLU and 2.102% higher than with Tanh. To further verify the reliability of the results, the comparison is repeated on the general-purpose MNIST dataset, where T-ReLU reaches 99.855% training accuracy and 98.620% test accuracy, outperforming the other activation functions.

Key words: Activation function, Convolutional neural network, Deep learning, MNIST dataset, Typhoon grade
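
The abstract names Tanh and ReLU as the two functions that T-ReLU combines, but it does not give the T-ReLU formula. The sketch below therefore implements only the two standard functions, plus one hypothetical way of combining them (`combined_sketch`, linear for non-negative inputs and Tanh for negative inputs). That combination is an assumption made for illustration and should not be read as the paper's definition.

```python
import math


def relu(x: float) -> float:
    """Standard ReLU: passes positive inputs through, outputs 0 otherwise."""
    return max(0.0, x)


def tanh(x: float) -> float:
    """Standard Tanh: smooth, bounded in (-1, 1), saturates for large |x|."""
    return math.tanh(x)


def combined_sketch(x: float) -> float:
    """HYPOTHETICAL combination, NOT the paper's T-ReLU definition.

    Assumption: behave like ReLU (identity) for x >= 0 and like Tanh
    for x < 0, so negative inputs keep a small, bounded response.
    """
    return x if x >= 0 else math.tanh(x)


if __name__ == "__main__":
    # Print the three functions side by side on a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(
            f"x={x:+.1f}  relu={relu(x):.4f}  "
            f"tanh={tanh(x):+.4f}  combined={combined_sketch(x):+.4f}"
        )
```

Under this assumed form, negative inputs are not zeroed out as in ReLU, and positive inputs do not saturate as in Tanh; whether the published T-ReLU has these same properties should be checked against the full paper.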

CLC Number: 

  • TP391