Computer Science, 2021, Vol. 48, Issue (1): 197-203. doi: 10.11896/jsjkx.191000135

• Computer Graphics & Multimedia •

Fine-grained Image Recognition Method Combining with Non-local and Multi-region Attention Mechanism

LIU Yang, JIN Zhong   

  1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China
    Key Laboratory of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, Nanjing University of Science and Technology, Nanjing 210094, China
  • Received: 2019-10-21 Revised: 2020-04-05 Online: 2021-01-15 Published: 2021-01-15
  • About author: LIU Yang, born in 1995, postgraduate. His main research interests include fine-grained image recognition and object detection.
    JIN Zhong, born in 1961, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include pattern recognition and face recognition.
  • Supported by:
    National Natural Science Foundation of China (61872188, U1713208).

Abstract: Fine-grained image recognition aims to classify objects into subclasses at a fine-grained level. Because the differences between subclasses are very subtle, the task is very challenging. The main difficulties for such algorithms are locating the discriminative parts of fine-grained targets and extracting fine-grained features. To this end, a fine-grained recognition method combining Non-local and multi-region attention mechanisms is proposed. Navigator uses only image labels to locate discriminative regions, and achieves good classification results by fusing global features with discriminative regional features. However, Navigator still has flaws. Firstly, Navigator does not consider the relationship between different locations, so the proposed algorithm combines the Non-local module with Navigator to enhance the model's perception of global information. Secondly, because the Non-local module does not establish relationships between feature channels, a feature extraction network based on a channel attention mechanism is constructed, which makes the network pay more attention to important feature channels. Finally, the proposed algorithm achieves recognition accuracies of 88.1%, 94.3% and 91.8% on three public fine-grained image datasets, CUB-200-2011, Stanford Cars and FGVC Aircraft, respectively, a significant improvement over Navigator.

Key words: Fine-grained image recognition, Attention mechanism, Non-local, Regional location, Feature extraction

CLC Number: 

  • TP301.6
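
The two attention components named in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the Non-local operation below uses identity embeddings with softmax-normalized dot-product weights and a residual connection, and the channel attention is assumed to be a squeeze-and-excitation style gate (the abstract does not say which channel attention design is used). The function names and the weight matrices `w1` and `w2` are hypothetical, and a feature map is represented as a plain list of N positions, each holding C channel values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def non_local(features):
    """Simplified Non-local operation (identity embeddings, an assumption).

    Each position is updated with a weighted sum over all positions,
    where the weights are a softmax of pairwise dot products, and the
    result is added back to the input (residual connection).
    """
    output = []
    for x_i in features:
        scores = [sum(a * b for a, b in zip(x_i, x_j)) for x_j in features]
        weights = softmax(scores)
        context = [
            sum(w * x_j[ch] for w, x_j in zip(weights, features))
            for ch in range(len(x_i))
        ]
        output.append([a + b for a, b in zip(x_i, context)])
    return output


def channel_attention(features, w1, w2):
    """SE-style channel gate (assumed design, hypothetical weights).

    w1 has shape hidden x C and w2 has shape C x hidden.
    """
    n = len(features)
    channels = len(features[0])
    # Squeeze: average every channel over all positions.
    squeezed = [sum(x[ch] for x in features) / n for ch in range(channels)]
    # Excitation: linear layer + ReLU, then linear layer + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Rescale each channel by its gate.
    return [[g * v for g, v in zip(gates, x)] for x in features]


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 positions, 2 channels
    mixed = non_local(feats)
    gated = channel_attention(mixed, w1=[[0.5, -0.5]], w2=[[1.0], [-1.0]])
    print(gated)
```

The first function relates every position to every other position, which is the gap the abstract attributes to Navigator. The second reweights channels, which is the gap it attributes to the Non-local module.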