Application of Multi-scale Dilated Convolution in Image Classification

doi:10.11896/JsJkx.190600179

Abstract

Abstract: To reduce the loss of spatial information caused by downsampling, image classification based on deep learning often replaces downsampling with dilated convolution. However, no prior work has studied how the performance of dilated convolution differs across network layers. This paper carries out a large number of image classification experiments and identifies the network layer best suited to dilated convolution. Dilated convolution, however, discards information about neighboring points, which produces the gridding phenomenon and causes part of the local information in the image to be lost. To eliminate gridding, the paper further proposes constructing the neural network with multi-scale dilated convolution at the optimal layer identified above. Experimental results show that the proposed construction method achieves good results in image classification.

Key words: Dilated convolution, Image classification, Multi-scale, Neural network
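
The gridding effect mentioned in the abstract can be illustrated with a small sketch. This is not the authors' network: it assumes stacked 1-D convolutions with kernel size 3, and the helper names `contributing_offsets` and `coverage`, as well as the rate choices (2, 2, 2) and (1, 2, 3), are hypothetical and chosen only for illustration. The sketch lists which input positions can influence a single output position, and shows that repeating one dilation rate leaves holes in the receptive field while mixing rates fills them.

```python
from itertools import product


def contributing_offsets(rates, kernel_size=3):
    """Input offsets (relative to one output position) that can influence
    that output after stacking 1-D dilated convolutions with these rates."""
    half = kernel_size // 2
    # Each layer with dilation r samples offsets -half*r, ..., 0, ..., +half*r.
    taps_per_layer = [[k * r for k in range(-half, half + 1)] for r in rates]
    # An input position contributes if it is a sum of one tap from every layer.
    return {sum(combo) for combo in product(*taps_per_layer)}


def coverage(rates, kernel_size=3):
    """Fraction of positions inside the receptive field that are actually used."""
    offsets = contributing_offsets(rates, kernel_size)
    span = max(offsets) - min(offsets) + 1
    return len(offsets) / span


same_rate = (2, 2, 2)    # assumed example: one dilation rate repeated
mixed_rates = (1, 2, 3)  # assumed example: several scales combined

print(sorted(contributing_offsets(same_rate)))   # only even offsets appear
print(coverage(same_rate), coverage(mixed_rates))
```

Both configurations span the same 13 positions, but with the repeated rate only every second position is sampled, which is the loss of neighboring-point information the abstract describes. Under the assumptions above, combining several rates covers every position in the span.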

CLC number: TP301

WU Hao-hao (吴昊昊), WANG Fang-shi (王方石). Application of Multi-scale Dilated Convolution in Image Classification[J]. Computer Science (计算机科学), 2020, 47(6A): 166-171. https://doi.org/10.11896/JsJkx.190600179
