Fine-grained Image Classification Based on Multi-branch Attention-augmentation

doi:10.11896/jsjkx.210100108

Abstract

Fine-grained image classification is difficult because intra-class variance is high and inter-class variance is low. To address this, a multi-branch attention-augmented convolutional neural network is proposed. A pre-trained Inception-V3 network extracts the basic features. To keep the network from extracting features from only one part of an object, and to encourage it to attend to the discriminative features of different parts, self-constrained attention-wise cropping and self-constrained attention-wise erasing are applied to the central parts of the original images; this also improves the accuracy of object localization. In addition, a central regularization loss function is proposed to constrain the attention-augmented training process, yielding better attention regions and widening the gap between different classes of images. Comprehensive experiments on three benchmark datasets show that the approach surpasses state-of-the-art methods.

Key words: Central regularization loss, Convolutional neural network, Fine-grained image classification, Multi-branch attention-augmentation, Weakly supervised learning

CLC Number: TP391

ZHANG Wen-xuan, WU Qin. Fine-grained Image Classification Based on Multi-branch Attention-augmentation [J]. Computer Science, 2022, 49(5): 105-112.
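
The attention-wise cropping and erasing described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed threshold of 0.5, and the assumption that the attention map has the same resolution as the image are all hypothetical, and the "self-constrained" and "central part" rules of the paper are not modeled. The `center_penalty` function is a generic center-loss-style term (squared distance between a feature and a class center) and may differ from the paper's central regularization loss.

```python
from typing import List, Sequence, Tuple

Grid = List[List[float]]


def normalize(att: Grid) -> Grid:
    """Min-max scale an attention map to [0, 1]."""
    flat = [v for row in att for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on a constant map
    return [[(v - lo) / span for v in row] for row in att]


def attention_bbox(att: Grid, threshold: float) -> Tuple[int, int, int, int]:
    """Inclusive (top, left, bottom, right) box around cells above threshold."""
    norm = normalize(att)
    hits = [(r, c) for r, row in enumerate(norm)
            for c, v in enumerate(row) if v > threshold]
    if not hits:  # nothing salient: fall back to the whole map
        return 0, 0, len(att) - 1, len(att[0]) - 1
    rows = [r for r, _ in hits]
    cols = [c for _, c in hits]
    return min(rows), min(cols), max(rows), max(cols)


def attention_crop(image: Grid, att: Grid, threshold: float = 0.5) -> Grid:
    """Keep only the region the attention map highlights (zoom-in branch)."""
    top, left, bottom, right = attention_bbox(att, threshold)
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def attention_erase(image: Grid, att: Grid, threshold: float = 0.5) -> Grid:
    """Zero out highlighted cells so other parts must be used (erase branch)."""
    norm = normalize(att)
    return [[0.0 if norm[r][c] > threshold else v for c, v in enumerate(row)]
            for r, row in enumerate(image)]


def center_penalty(feature: Sequence[float], center: Sequence[float]) -> float:
    """Generic center-loss-style term: squared distance to a class center."""
    return sum((f - c) ** 2 for f, c in zip(feature, center))


# Toy 4x4 "image" and an attention map of the same size (an assumption).
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
att = [[0, 0, 0, 0], [0, 0.9, 1.0, 0], [0, 0.8, 0.7, 0], [0, 0, 0, 0]]

print(attention_crop(image, att))   # [[6, 7], [10, 11]]
print(attention_erase(image, att)[1])  # [5, 0.0, 0.0, 8]
print(center_penalty([1.0, 2.0], [0.0, 0.0]))  # 5.0
```

In a real pipeline the attention map would come from the network's feature maps and be upsampled to the image size before thresholding; the cropped and erased images would then be fed back as additional training branches.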


