Computer Science ›› 2022, Vol. 49 ›› Issue (6A): 353-357. doi: 10.11896/jsjkx.210200169

• Image Processing & Multimedia Technology •

Classification Method for Small Crops Combining Dual Attention Mechanisms and Hierarchical Network Structure

YANG Jian-nan1, ZHANG Fan2

  1 School of Computer Science and Technology, Nanjing Tech University, Nanjing 211816, China
  2 IBM Watson Group, Littleton, MA 01460, USA
  • Online: 2022-06-10 Published: 2022-06-08
  • Corresponding author: YANG Jian-nan (yjn@njtech.edu.cn)
  • About author: YANG Jian-nan, born in 1995, postgraduate. His main research interests include image recognition and computer vision.

Abstract: Image recognition of small crops is challenging for two reasons. First, each individual sample is small and samples differ from one another, so a single sample is not representative of the whole collection. Second, different grades of the same crop can look very similar in shape and color. Research on image classification methods for small crops such as dried tea, rice and soybean is scarce, and most research datasets are captured in a laboratory environment with professional equipment, which hinders practical application. To address this, a scheme for acquiring and processing images of small crop samples with a mobile phone is proposed. Taking tea and rice as examples, a hierarchical network structure combined with dual attention mechanisms is designed. Following a coarse-grained to fine-grained classification process, the network first performs coarse-grained classification, that is, it identifies the category of the sample. Then, with the attention mechanisms, it focuses on the differences between samples of different grades within the same category, so that samples are graded more accurately. The proposed method achieves a recognition accuracy of 93.9% on the collected dataset.

Key words: Attention mechanism, Convolutional neural network, Hierarchical network structure, Image classification, Small crops
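The coarse-grained to fine-grained decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: the function names, the category and grade labels, and the logit values below are all hypothetical, and the attention modules and convolutional backbone are omitted. The sketch only shows the order of decisions, where the category is chosen first and the grade is then chosen among the grades of that category.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def coarse_to_fine_predict(coarse_logits, fine_logits_by_category,
                           categories, grades_by_category):
    """Pick the category first, then the grade within that category.

    All inputs are hypothetical stand-ins for the outputs of a coarse
    classification head and of one fine head per category.
    """
    # Stage 1: coarse-grained classification (which crop category).
    coarse_probs = softmax(coarse_logits)
    cat_idx = max(range(len(categories)), key=lambda i: coarse_probs[i])
    category = categories[cat_idx]

    # Stage 2: fine-grained classification (which grade of that category).
    fine_probs = softmax(fine_logits_by_category[category])
    grade_idx = max(range(len(fine_probs)), key=lambda i: fine_probs[i])
    grade = grades_by_category[category][grade_idx]

    # Joint confidence: P(category) * P(grade | category).
    joint = coarse_probs[cat_idx] * fine_probs[grade_idx]
    return category, grade, joint


# Hypothetical labels and logits, chosen only for illustration.
categories = ["tea", "rice"]
grades_by_category = {
    "tea": ["grade_1", "grade_2", "grade_3"],
    "rice": ["grade_1", "grade_2"],
}
coarse_logits = [2.0, 0.5]
fine_logits = {"tea": [0.1, 1.5, 0.3], "rice": [0.7, 0.2]}

category, grade, joint = coarse_to_fine_predict(
    coarse_logits, fine_logits, categories, grades_by_category
)
print(category, grade, round(joint, 3))
```

Restricting the second stage to the grades of the predicted category means the fine head only has to separate visually similar grades of one crop, which is the part of the problem the abstract assigns to the attention mechanisms.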

CLC Number: TP301.6