Computer Science ›› 2020, Vol. 47 ›› Issue (1): 136-143. doi: 10.11896/jsjkx.181202316
吕永强 (1,2), 闵巍庆 (2), 段华 (1), 蒋树强 (2)
LV Yong-qiang (1,2), MIN Wei-qing (2), DUAN Hua (1), JIANG Shu-qiang (2)
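Abstract: Food recognition has attracted wide attention in fields such as food health and smart homes. Most existing food recognition work relies on deep neural networks trained on large-scale labeled samples and cannot effectively recognize categories that have only a few samples, so few-shot food recognition is a problem in urgent need of a solution. Existing metric-learning-based few-shot recognition methods focus on similarity between samples and neglect finer-grained intra-class and inter-class discrimination. The mainstream approach to learning intra-class and inter-class discriminative information is the triplet convolutional neural network with a linear metric function; for food images, however, a linear metric function is not discriminative enough. To address this, a learnable relation network is introduced as the nonlinear metric function of the triplet convolutional neural network, yielding a nonlinear-metric triplet neural network for few-shot food recognition. The method uses a triplet neural network to learn feature embeddings of images and then adopts the more discriminative relation network as a nonlinear metric function, trained end to end, to learn finer-grained intra-class and inter-class discriminative information. In addition, an online triplet sampling scheme that makes model training more stable is proposed. Experiments on the Food-101, VIREO Food-172 and ChineseFoodNet food datasets show that the proposed method improves performance by 3.0% on average over the Siamese-network-based few-shot learning method and by 1.0% on average over the triplet neural network with a linear metric function. The paper also examines how the threshold (margin) of the loss function, the triplet sampling parameters and the initialization scheme affect performance.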
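The idea of replacing a linear metric with a learned relation score inside a triplet loss can be illustrated with a minimal sketch. The abstract gives neither the network architecture nor the exact loss, so everything below is an assumption: the two-layer relation module, the hinge form `max(0, margin - s_pos + s_neg)`, the default margin, and the names `relation_score`, `triplet_relation_loss` and `init_params` are all hypothetical. The embeddings are plain lists of floats standing in for the output of the feature-embedding network.

```python
import math
import random


def init_params(embed_dim, hidden_dim, seed=0):
    """Randomly initialise a hypothetical two-layer relation module."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(2 * embed_dim)]
          for _ in range(hidden_dim)]
    b1 = [0.0] * hidden_dim
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden_dim)]
    b2 = 0.0
    return w1, b1, w2, b2


def relation_score(x, y, w1, b1, w2, b2):
    """Nonlinear similarity in (0, 1) for a pair of embeddings (assumed form)."""
    pair = x + y  # concatenate the two embedding vectors
    # Hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, pair)) + b)
              for row, b in zip(w1, b1)]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid


def triplet_relation_loss(anchor, positive, negative, params, margin=0.5):
    """Hinge loss asking the anchor-positive score to exceed the
    anchor-negative score by at least `margin` (assumed loss form)."""
    s_pos = relation_score(anchor, positive, *params)
    s_neg = relation_score(anchor, negative, *params)
    return max(0.0, margin - s_pos + s_neg)


params = init_params(embed_dim=4, hidden_dim=8)
anchor = [0.2, -0.1, 0.7, 0.4]
positive = [0.3, -0.2, 0.6, 0.5]   # same class as the anchor
negative = [-0.8, 0.9, -0.3, 0.1]  # different class
loss = triplet_relation_loss(anchor, positive, negative, params)
```

Because the score is a similarity rather than a distance, the hinge is written with the positive score subtracted. In a real training loop the relation module and the embedding network would be optimised jointly by gradient descent; this sketch only evaluates the loss for one triplet.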