Multi-label Image Annotation Based on Convolutional Neural Network

Computer Science, 2016, Vol. 43, Issue (7): 41-45. doi: 10.11896/j.issn.1002-137X.2016.07.006

• The 24th National Conference on Multimedia, 2015 •

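LI Jian-cheng, YUAN Chun and SONG You

Department of Computer Science and Technology, Tsinghua University, Beijing 100084, and School of Software, Beihang University, Beijing 100191; Department of Computer Science and Technology, Tsinghua University, Beijing 100084; School of Software, Beihang University, Beijing 100191

Published: 2018-12-01  Online: 2018-12-01
Funding:
This work was supported by the National Natural Science Foundation of China (U1433112,3), the National Core Electronic Devices, High-end Generic Chips and Basic Software Project (2013ZX01039001-002), the National High Technology Research and Development Program of China ("863 Program") (2011AA01A205), and the Tsinghua-Tencent Joint Project (Human Virtual Avatar Modeling).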

Abstract

Abstract: Images are now ubiquitous in daily life, and their sheer volume is overwhelming. How to query, retrieve, and organize this image information quickly and effectively has become a pressing problem. Automatic image annotation is the key to text-based image retrieval. This paper proposes a multi-label automatic image annotation system based on the convolutional neural network (CNN), a well-known deep learning model, and implements a multi-label ranking loss function for training and testing on multi-label data. In the experiments, the CIFAR-10 dataset is first used to verify the effectiveness of the algorithm, and the multi-label image dataset Corel 5k is then used for quantitative comparison. The results show that the overall performance of the proposed algorithm improves substantially over existing algorithms.

Key words: Image annotation, Multi-label, Deep learning, Convolutional neural network
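
The abstract names a multi-label ranking loss but does not give its form. As a minimal sketch only, the code below assumes a common hinge-style pairwise ranking loss, in which every label present on an image should outscore every absent label by a margin. The function names, the `margin` parameter, and the top-k annotation step are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def pairwise_ranking_loss(scores, labels, margin=1.0):
    """Hinge-style pairwise ranking loss (assumed form, not from the paper).

    scores: (n_samples, n_labels) real-valued network outputs.
    labels: (n_samples, n_labels) binary matrix, 1 = label present.
    Returns the mean hinge violation over all (present, absent) label pairs.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    total, n_pairs = 0.0, 0
    for s, y in zip(scores, labels):
        pos, neg = s[y], s[~y]
        if pos.size == 0 or neg.size == 0:
            continue  # no (present, absent) pair to rank for this image
        # violations[j, k] = max(0, margin - pos_j + neg_k)
        violations = np.maximum(0.0, margin - pos[:, None] + neg[None, :])
        total += violations.sum()
        n_pairs += violations.size
    return total / n_pairs if n_pairs else 0.0


def annotate(scores, vocabulary, k=5):
    """Return the k highest-scoring labels for one image (hypothetical helper)."""
    top = np.argsort(scores)[::-1][:k]
    return [vocabulary[i] for i in top]
```

Under this assumed loss, an image whose present labels already outscore all absent labels by at least the margin contributes zero, so training pressure concentrates on mis-ranked label pairs. At test time, `annotate` simply keeps the top-k scored words as the image's tags.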

LI Jian-cheng, YUAN Chun and SONG You. Multi-label Image Annotation Based on Convolutional Neural Network[J]. Computer Science, 2016, 43(7): 41-45. https://doi.org/10.11896/j.issn.1002-137X.2016.07.006
