Computer Science, 2021, Vol. 48, Issue (1): 197-203. doi: 10.11896/jsjkx.191000135

• Computer Graphics & Multimedia •

Fine-grained Image Recognition Method Combining with Non-local and Multi-region Attention Mechanism

LIU Yang, JIN Zhong   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
    Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2019-10-21 Revised: 2020-04-05 Online: 2021-01-15 Published: 2021-01-15
  • About author: LIU Yang, born in 1995, postgraduate. His main research interests include fine-grained image recognition and object detection.
    JIN Zhong, born in 1961, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include pattern recognition and face recognition.
  • Supported by:
    National Natural Science Foundation of China (61872188, U1713208).

Abstract: The goal of fine-grained image recognition is to distinguish object subclasses at a fine-grained level. Because the differences between subclasses are very subtle, the task is challenging. The main difficulties are locating the discriminative parts of fine-grained targets and extracting fine-grained features from them. To this end, a fine-grained recognition method combining Non-local and multi-region attention mechanisms is proposed. Navigator uses only image labels to locate discriminative regions, and achieves good classification results by fusing global features with the features of those regions. However, Navigator still has limitations. First, it does not consider the relationships between different locations, so the proposed algorithm combines the Non-local module with Navigator to enhance the model's perception of global information. Second, to address the fact that the Non-local module does not model the relationships between feature channels, a feature extraction network based on a channel attention mechanism is constructed, which makes the network pay more attention to the important feature channels. The proposed algorithm achieves recognition accuracies of 88.1%, 94.3% and 91.8% on three public fine-grained image datasets, CUB-200-2011, Stanford Cars and FGVC Aircraft, respectively, a significant improvement over Navigator.

Key words: Attention mechanism, Feature extraction, Fine-grained image recognition, Non-local, Regional location

CLC Number: 

  • TP301.6
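
Illustrative sketch: the abstract names two attention components, a Non-local module that relates every spatial location to every other, and a channel attention mechanism that reweights feature channels. The NumPy code below is a minimal sketch of both ideas and is not the authors' implementation. It assumes the embedded-Gaussian form of the non-local operation and a squeeze-and-excitation style gate for channel attention; the abstract does not say which variants the paper uses. All function names, weight matrices and sizes (non_local_block, channel_attention, w_theta, C_inner and so on) are hypothetical, and the random weights stand in for learned 1x1 convolutions.

```python
import numpy as np


def softmax(scores, axis=-1):
    """Numerically stable softmax along one axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def non_local_block(x, w_theta, w_phi, w_g, w_out):
    """Assumed embedded-Gaussian non-local operation on a (C, H, W) feature map.

    Each output position is a weighted sum over ALL positions, so the
    response at one location can depend on any other location.
    """
    c, h, w = x.shape
    flat = x.reshape(c, h * w)               # (C, N) with N = H * W
    theta = w_theta @ flat                   # (C_inner, N) query embedding
    phi = w_phi @ flat                       # (C_inner, N) key embedding
    g = w_g @ flat                           # (C_inner, N) value embedding
    attn = softmax(theta.T @ phi, axis=-1)   # (N, N), row i weights all positions j
    y = g @ attn.T                           # (C_inner, N), y[:, i] = sum_j attn[i, j] * g[:, j]
    out = w_out @ y                          # project back to C channels
    return x + out.reshape(c, h, w)          # residual connection


def channel_attention(x, w1, w2):
    """Assumed squeeze-and-excitation style gate on a (C, H, W) feature map."""
    squeezed = x.mean(axis=(1, 2))                   # (C,) global average pool
    hidden = np.maximum(w1 @ squeezed, 0.0)          # (C_reduced,) ReLU bottleneck
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))     # (C,) sigmoid weights in (0, 1)
    return x * gates[:, None, None], gates


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
C_inner, C_reduced = 4, 2
x = rng.standard_normal((C, H, W))
w_theta, w_phi, w_g = (0.1 * rng.standard_normal((C_inner, C)) for _ in range(3))
w_out = 0.1 * rng.standard_normal((C, C_inner))
w1 = 0.1 * rng.standard_normal((C_reduced, C))
w2 = 0.1 * rng.standard_normal((C, C_reduced))

out = non_local_block(x, w_theta, w_phi, w_g, w_out)
# With a zero output projection the block reduces to the identity mapping.
identity_out = non_local_block(x, w_theta, w_phi, w_g, np.zeros_like(w_out))
reweighted, gates = channel_attention(out, w1, w2)
```

Applying channel attention after the non-local block here is only for demonstration; the abstract does not specify where either component attaches to the backbone or to Navigator's region proposals.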