Computer Science ›› 2021, Vol. 48 ›› Issue (10): 220-225. doi: 10.11896/jsjkx.200800073

• Computer Graphics & Multimedia •

Material Recognition Method Based on Attention Mechanism and Deep Convolutional Neural Network

XU Hua-jie1,2, YANG Yang1, LI Gui-lan3   

  1 College of Computer and Electronic Information, Guangxi University, Nanning 530004, China
  2 Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China
  3 Guangxi Institute of Product Quality Inspection, Nanning 530007, China
  • Received: 2020-08-11 Revised: 2020-11-18 Online: 2021-10-15 Published: 2021-10-18
  • About author: XU Hua-jie, born in 1974, Ph.D., associate professor, is a senior member of China Computer Federation. His main research interests include artificial intelligence, acoustic signal recognition and computer vision.
  • Supported by:
    Science and Technology Plan Project of Guangxi Zhuang Autonomous Region (2017AB15008) and Science and Technology Plan Project of Chongzuo (FB2018001).

Abstract: Material recognition aims to identify the main objects in natural material images and their material categories. Recognition accuracy is limited by the small size of material image datasets and by the difficulty of manually labeling local texture regions. To address this problem, a material recognition method based on an attention mechanism and a deep convolutional neural network is proposed. The core of the method is a material recognition deep convolutional neural network (MaterialNet). MaterialNet uses a deep residual network to extract image features and introduces an attention mechanism through the proposed cascaded atrous spatial pyramid pooling method, so that through end-to-end training the network adaptively focuses on the key regions containing texture features and thus effectively identifies the local texture features of materials. Experiments on the FMD material dataset show that the overall recognition accuracy of MaterialNet is 82.3%, which is 7.2% and 4.5% higher than that of the current mainstream material recognition methods B-CNN and CNN+FV, respectively. MaterialNet achieves high recognition accuracy across a variety of materials, and it has the advantages of fewer parameters and less computation.
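
To make the term "atrous convolution" concrete, the following is a minimal sketch in plain Python; it is not MaterialNet and not the proposed cascaded atrous spatial pyramid pooling module. The function names, the 1D setting, the kernel, and the dilation rates 1, 2, 4 are all assumptions chosen for illustration, since the abstract does not state the network's actual layers or rates. The sketch only shows that a dilated kernel covers a wider span of the input with the same number of weights, and how the receptive field grows when such layers are applied one after another.

```python
def atrous_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1D atrous (dilated) convolution, cross-correlation form.

    Kernel taps are spaced `dilation` samples apart, so the kernel spans
    more of the input without gaining any extra weights.
    """
    span = (len(kernel) - 1) * dilation  # distance between first and last tap
    out = []
    for start in range(len(signal) - span):
        out.append(sum(w * signal[start + i * dilation]
                       for i, w in enumerate(kernel)))
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stride-1 atrous layers applied in sequence."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


# Hypothetical input and kernel, for illustration only.
signal = [float(x) for x in range(10)]
kernel = [1.0, 0.0, -1.0]

dense = atrous_conv1d(signal, kernel, dilation=1)    # taps 1 sample apart
dilated = atrous_conv1d(signal, kernel, dilation=2)  # taps 2 samples apart

print(len(dense), len(dilated))
print(receptive_field(3, [1]), receptive_field(3, [1, 2, 4]))
```

With the same three weights, the dilated kernel compares samples four positions apart instead of two, and three cascaded 3-tap layers with the assumed rates 1, 2, 4 see 15 input samples instead of 3.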

Key words: Atrous convolution, Attention mechanism, Deep convolutional neural network, Spatial pyramid pooling

CLC Number: TP391