Computer Science, 2022, Vol. 49, Issue (6): 224-230. doi: 10.11896/jsjkx.210400087
武霖, 孙静宇
WU Lin, SUN Jing-yu
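Abstract: The capsule network is a new type of deep neural network that represents image features as vectors and introduces a dynamic routing algorithm to address two major problems of convolutional neural networks: 1) they cannot learn or represent the part-whole relationships in an image; 2) pooling operations cause severe loss of image feature information. However, CapsNet has to learn all the features of an image, so when the image background is complex it suffers from insufficient feature extraction, a large number of training parameters, and low training efficiency. To address these problems, a lightweight image feature extractor, the RA module, is first designed to extract image feature information faster and more completely. Second, two lightweight branches of different depths are designed to improve the training efficiency of the network. Finally, a new squashing function, hc-squash, is designed to ensure that the network captures more useful information, and the multi-branch RA capsule network is proposed. Experiments on four image classification datasets (MNIST, Fashion-MNIST, affNIST, and CIFAR-10) show that the multi-branch RA capsule network outperforms CapsNet and MLCN on several performance metrics. An improved variant of the proposed network is also designed to further optimize classification performance.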
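For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal pure-Python sketch of the standard squash function and routing-by-agreement loop from the original CapsNet paper (Sabour, Frosst and Hinton, 2017). It is not the hc-squash function or the RA module proposed here, whose definitions are not given in the abstract. The names `squash`, `softmax`, `dynamic_routing`, and `u_hat`, as well as the small `eps` stabilizer, are illustrative assumptions.

```python
import math


def squash(s, eps=1e-9):
    """Standard CapsNet squash (assumed baseline, not hc-squash).

    Keeps the direction of s and maps its length into [0, 1).
    """
    norm_sq = sum(x * x for x in s)
    norm = math.sqrt(norm_sq)
    # eps is an illustrative guard against division by zero.
    scale = norm_sq / (1.0 + norm_sq) / (norm + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement between two capsule layers.

    u_hat[i][j] is the prediction vector that lower capsule i
    makes for upper capsule j. Returns the upper capsule outputs.
    """
    num_in, num_out = len(u_hat), len(u_hat[0])
    dim = len(u_hat[0][0])
    # Routing logits start at zero, so coupling is uniform at first.
    b = [[0.0] * num_out for _ in range(num_in)]
    v = []
    for _ in range(num_iters):
        # Coupling coefficients: softmax over upper capsules per lower capsule.
        c = [softmax(row) for row in b]
        v = []
        for j in range(num_out):
            # Weighted sum of predictions, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(num_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the output.
        for i in range(num_in):
            for j in range(num_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v
```

In this sketch the length of each output vector plays the role of an existence probability, which is why the squashing function matters: it determines how much of the summed input survives into the output capsule.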