Computer Science (计算机科学), 2021, Vol. 48, Issue (1): 197-203. doi: 10.11896/jsjkx.191000135
LIU Yang (刘洋), JIN Zhong (金忠)
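Abstract: Fine-grained image recognition aims to classify object subcategories at a fine-grained level. Because the differences between subcategories are very subtle, the task is highly challenging. The main difficulties for current fine-grained recognition algorithms are how to localize the discriminative parts of a fine-grained object and how to better extract subtle, fine-grained features. To address these difficulties, this paper proposes a fine-grained recognition method that combines non-local and multi-region attention mechanisms. Navigator can localize some discriminative regions fairly well using only image labels, and it achieves good classification results by fusing global features with discriminative-region features. However, Navigator still has shortcomings: 1) Navigator does not consider the relationships between different positions, so the proposed algorithm introduces a non-local module and combines it with Navigator to strengthen the model's perception of global information; 2) because the non-local module does not model the relationships between feature channels, a feature extraction network based on a channel attention mechanism is constructed so that the network focuses on the more important feature channels. Finally, the proposed algorithm achieves recognition accuracies of 88.1%, 94.3%, and 92.0% on three public fine-grained image datasets, CUB-200-2011, Stanford Cars, and FGVC Aircraft, respectively, a clear improvement in accuracy over Navigator.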
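The two mechanisms named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `non_local` and `channel_attention` are hypothetical, and the learned embeddings and projection layers of a real non-local block, as well as the small learned network of a real channel-attention block, are omitted (an assumption made to keep the example short). The abstract does not say where each module sits in the network, so the composition shown in the usage note is also only an assumption.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def non_local(features):
    """Relate every position to every other position.

    features: list of N positions, each a list of C channel values.
    Each output position is its input plus an attention-weighted sum
    over ALL positions (dot-product similarity, identity embeddings).
    """
    out = []
    for xi in features:
        # Similarity of this position to every position in the map.
        scores = [sum(a * b for a, b in zip(xi, xj)) for xj in features]
        weights = softmax(scores)
        # Aggregate global context, channel by channel.
        agg = [sum(w * xj[c] for w, xj in zip(weights, features))
               for c in range(len(xi))]
        # Residual connection keeps the original local response.
        out.append([x + a for x, a in zip(xi, agg)])
    return out


def channel_attention(features):
    """Reweight channels by a gate computed from their global average.

    Squeeze: average each channel over all positions.
    Excite: a sigmoid gate per channel (the learned layers that would
    normally transform the pooled vector are omitted here).
    """
    n = len(features)
    c = len(features[0])
    pooled = [sum(f[ch] for f in features) / n for ch in range(c)]
    gates = [1.0 / (1.0 + math.exp(-p)) for p in pooled]
    return [[f[ch] * gates[ch] for ch in range(c)] for f in features]
```

As a usage example, `channel_attention(non_local(feature_map))` applies position-wise context first and channel reweighting second on a feature map flattened to a list of positions. The non-local step costs O(N²) in the number of positions, which is why such blocks are typically applied to low-resolution feature maps.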