Computer Science ›› 2022, Vol. 49 ›› Issue (6): 224-230. doi: 10.11896/jsjkx.210400087

• Computer Graphics & Multimedia •

Multi-branch RA Capsule Network and Its Application in Image Classification

WU Lin, SUN Jing-yu

  1. College of Software, Taiyuan University of Technology, Taiyuan 030024, China
  • Received: 2021-04-08 Revised: 2021-09-03 Online: 2022-06-15 Published: 2022-06-08
  • Corresponding author: SUN Jing-yu (whitesunpersun@163.com)
  • About author: (2669281495@qq.com) WU Lin, born in 1995, postgraduate, is a member of China Computer Federation. His main research interests include image processing and recommendation systems.
    SUN Jing-yu, born in 1975, Ph.D., associate professor, is a member of China Computer Federation. His main research interests include collaborative web search, recommendation systems and smart cities.

Abstract: The capsule network (CapsNet) is a new type of deep neural network that represents image features as vectors and, by introducing a dynamic routing algorithm, overcomes two major problems of convolutional neural networks: 1) they cannot learn and express the part-whole relationships in an image; 2) pooling operations cause a serious loss of image feature information. However, CapsNet must learn all the features of an image, and when the image background is complex it suffers from insufficient extraction of image feature information, a large number of training parameters and low training efficiency. To address this, first, a lightweight image feature extractor, the RA (Resnet Attention) module, is designed to extract image feature information faster and more completely. Second, two lightweight branches of different depths are designed to improve the training efficiency of the network. Finally, a new compression function, hc-squash, is designed to ensure that the network acquires more useful information, and on this basis a multi-branch RA capsule network is proposed. Applying the network to four image classification datasets (MNIST, Fashion-MNIST, affNIST and CIFAR-10) confirms that the multi-branch RA capsule network outperforms CapsNet and MLCN on several performance metrics. An improvement scheme is also designed for the proposed network to further optimize classification performance.

Key words: Attention mechanism, Capsule network, Deep learning, Resnet attention module, Squash function
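
For context, the following is a minimal NumPy sketch of the standard squash function and routing-by-agreement loop from the original CapsNet (Sabour et al., 2017), which the abstract takes as its baseline. It is not the paper's hc-squash or RA module, whose definitions are not given on this page. The function names, tensor shapes, and the choice of three routing iterations are illustrative assumptions.

```python
import numpy as np


def squash(s, axis=-1, eps=1e-9):
    """Standard CapsNet squash (not hc-squash): keeps the direction of s
    and maps its norm into [0, 1)."""
    sq_norm = np.sum(s * s, axis=axis, keepdims=True)
    scale = sq_norm / (1.0 + sq_norm)
    # eps avoids division by zero for all-zero capsule vectors
    return scale * s / np.sqrt(sq_norm + eps)


def softmax(x, axis):
    x = x - np.max(x, axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / np.sum(e, axis=axis, keepdims=True)


def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement between two capsule layers.

    u_hat: prediction vectors, shape (num_in, num_out, dim_out)
           (assumed layout, chosen for this sketch).
    Returns the output capsules, shape (num_out, dim_out).
    """
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))  # routing logits, start uniform
    for _ in range(num_iters):
        c = softmax(b, axis=1)  # coupling coefficients for each input capsule
        s = np.sum(c[:, :, None] * u_hat, axis=0)  # weighted sum per output capsule
        v = squash(s)
        # raise the logit where a prediction agrees with the output capsule
        b = b + np.sum(u_hat * v[None, :, :], axis=-1)
    return v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    u_hat = rng.normal(size=(8, 10, 16))  # 8 input capsules, 10 classes, 16-D outputs
    v = dynamic_routing(u_hat)
    print(v.shape)
    print(np.linalg.norm(v, axis=-1))  # capsule lengths, each below 1
```

The length of each output vector can be read as the presence probability of the corresponding class, which is why the squash function bounds the norm below 1 while leaving the direction unchanged.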

CLC Number:

  • TP391.41