Combination of Nearest Neighbor with Semantic Distance for Image Annotation

doi:10.11896/j.issn.1002-137X.2015.01.066

Abstract

Abstract: Most nearest neighbor (NN) based image annotation and classification methods perform poorly, mainly because much valuable information is lost when visual features are extracted from images. This paper proposes an improved nearest neighbor classification model. First, distance metric learning (DML) is trained with the semantic class information of the images to produce a new semantic distance. Next, the images of each class are clustered under this distance, yielding multiple cluster centers within each class. Finally, the NN model is built by computing the semantic distance from an image to each cluster center. The learned semantic distance is used throughout the construction of the model, which effectively reduces the semantic gap caused by intra-class variation and inter-class similarity. Experiments on the ImageCLEF2012 image annotation dataset, comparing the method with traditional classifiers and recent state-of-the-art methods, confirm its effectiveness.

Key words: Image annotation, Feature extraction, Nearest neighbor, Distance metric learning, Semantic distance
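The three steps in the abstract (learn a distance, cluster each class under it, classify by the nearest cluster center) can be illustrated with a minimal sketch. It is not the authors' implementation: the metric learner here is a simple stand-in (whitening by the pooled within-class covariance) rather than the DML procedure used in the paper, the clustering is plain k-means with a farthest-first initialization, the data are synthetic, and all function names (`learn_metric`, `kmeans`, `fit`, `predict`) are hypothetical.

```python
import numpy as np

def learn_metric(X, y, reg=1e-3):
    """Stand-in for DML (an assumption, not the paper's method).

    Returns L such that the learned distance between rows a and b
    is ||(a - b) @ L||, i.e. a Mahalanobis distance with M = L @ L.T.
    """
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c] - X[y == c].mean(axis=0)
        Sw += Xc.T @ Xc
    Sw = Sw / len(X) + reg * np.eye(d)  # regularize so Sw is invertible
    return np.linalg.cholesky(np.linalg.inv(Sw))

def kmeans(Z, k, iters=20):
    """Plain k-means with a deterministic farthest-first initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(iters):
        dists = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = Z[assign == j].mean(axis=0)
    return centers

def fit(X, y, k=2):
    """Learn the metric, then cluster each class separately under it."""
    L = learn_metric(X, y)
    Z = X @ L  # Euclidean distance in Z equals the learned distance in X
    centers, labels = [], []
    for c in np.unique(y):
        cc = kmeans(Z[y == c], k)
        centers.append(cc)
        labels.extend([c] * len(cc))
    return L, np.vstack(centers), np.array(labels)

def predict(model, X):
    """Assign each sample the class of its nearest cluster center."""
    L, centers, labels = model
    Z = X @ L
    d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return labels[d.argmin(axis=1)]

# Synthetic data: each class has two well-separated modes (XOR layout),
# so a single center per class would not be enough.
rng = np.random.default_rng(0)
modes = {0: [(0, 0), (4, 4)], 1: [(0, 4), (4, 0)]}
X = np.vstack([rng.normal(m, 0.3, size=(50, 2)) for c in modes for m in modes[c]])
y = np.repeat([0, 1], 100)

model = fit(X, y, k=2)
L, centers, labels = model
acc = (predict(model, X) == y).mean()
print("training accuracy:", acc)
```

Keeping several centers per class is what lets a nearest-center rule follow classes with more than one visual mode. Any learned Mahalanobis-style metric could replace `learn_metric` without changing the rest of the sketch.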

WU Wei, GAO Guang-lai and NIE Jian-yun. Combination of Nearest Neighbor with Semantic Distance for Image Annotation[J]. Computer Science, 2015, 42(1): 297-302. https://doi.org/10.11896/j.issn.1002-137X.2015.01.066

