Computer Science (计算机科学), 2021, Vol. 48, Issue (10): 220-225. doi: 10.11896/jsjkx.200800073
XU Hua-jie (许华杰) 1,2, YANG Yang (杨洋) 1, LI Gui-lan (李桂兰) 3
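Abstract: Material recognition aims to identify the main object in a natural material image and the material category it belongs to. Material image datasets are usually small, and local texture regions are hard to annotate manually, both of which lead to low recognition accuracy. To address this problem, a material recognition method based on an attention mechanism and a deep convolutional neural network is proposed; its core is a deep convolutional neural network for material recognition (MaterialNet). MaterialNet extracts image features with a deep residual network and introduces an attention mechanism through the proposed cascaded atrous spatial pyramid pooling, so that the network, trained end to end, adaptively attends to the key regions that contain texture features and thereby recognizes the local texture features of materials effectively. Experiments on the FMD material dataset show that MaterialNet reaches an overall recognition accuracy of 82.3%, which is 7.2% and 4.5% higher than the mainstream material recognition methods B-CNN and CNN+FV, respectively. It achieves high recognition accuracy on multiple material categories and has few parameters and a low computational cost.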
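The abstract does not give MaterialNet's layer configuration, so the following is only a minimal pure-Python sketch of the general idea it names: atrous (dilated) convolutions applied in series to a feature map, with the result squashed into a spatial attention map that reweights the features. The function names, the single-channel input, the dilation rates, the ReLU between stages, and the sigmoid gating are all illustrative assumptions, not the authors' implementation. In the described method the input would be feature maps from a deep residual network rather than a toy grid.

```python
import math


def atrous_conv2d(x, kernel, rate):
    """3x3 dilated (atrous) convolution with zero padding; output matches x in size."""
    h, w = len(x), len(x[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for ki in range(3):
                for kj in range(3):
                    # Kernel taps sit `rate` pixels apart around (i, j).
                    ii = i + (ki - 1) * rate
                    jj = j + (kj - 1) * rate
                    if 0 <= ii < h and 0 <= jj < w:
                        acc += kernel[ki][kj] * x[ii][jj]
            out[i][j] = acc
    return out


def sigmoid(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def cascaded_attention(feature, kernels, rates):
    """Chain atrous convolutions (assumed cascade), then map to (0, 1) weights."""
    y = feature
    last = len(rates) - 1
    for idx, (kernel, rate) in enumerate(zip(kernels, rates)):
        y = atrous_conv2d(y, kernel, rate)
        if idx < last:
            # ReLU between stages only; the final stage feeds the sigmoid directly.
            y = [[max(0.0, v) for v in row] for row in y]
    return [[sigmoid(v) for v in row] for row in y]


def apply_attention(feature, attention):
    """Reweight each spatial position of the feature map by its attention value."""
    return [[f * a for f, a in zip(frow, arow)]
            for frow, arow in zip(feature, attention)]


# Toy usage: a 5x5 single-channel "feature map" with one active position.
feature = [[0.0] * 5 for _ in range(5)]
feature[2][2] = 1.0
kernels = [[[0.1] * 3 for _ in range(3)] for _ in range(3)]  # hypothetical weights
attention = cascaded_attention(feature, kernels, rates=[1, 2, 4])
weighted = apply_attention(feature, attention)
```

Because each stage widens the spacing between kernel taps, stacking them grows the receptive field quickly without adding parameters, which is consistent with the abstract's claim of a small parameter count, though the actual trade-offs depend on details the abstract does not state.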