Computer Science, 2016, Vol. 43, Issue (5): 269-273. doi: 10.11896/j.issn.1002-137X.2016.05.051

• Graphics, Image and Pattern Recognition •

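Product Image Sentence Annotation Based on Gradient Kernel Feature and N-gram Model

ZHANG Hong-bin, JI Dong-hong, YIN Lan and REN Ya-feng

  1. School of Computer Science, Wuhan University, Wuhan 430072 (all four authors)
  • Online: 2018-12-01 Published: 2018-12-01
  • Supported by:
    This work was supported by the Key Program of the National Natural Science Foundation of China (61133012), the Major Bidding Project of the National Social Science Foundation of China (11&ZD189), the Humanities and Social Sciences Youth Project of the Ministry of Education (12YJCZH274), the Science and Technology Research Projects of the Jiangxi Provincial Department of Science and Technology (20142BBG70011, 20121BBG70050), and the Humanities and Social Sciences Fund Projects of Jiangxi Universities (XW1502, TQ1503).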

Abstract: Sentence annotation is proposed for product images, because a sentence describes image content more accurately than isolated words. First, image feature learning is performed, and the gradient kernel feature, which gives the best annotation performance, is chosen for image classification and image retrieval; it captures shape and texture, the two key visual characteristics of product images. Second, key words are extracted from the captions of training images according to semantic correlation scores. Third, an N-gram model assembles these key words into a modifying phrase (the "modified sequence") that carries rich semantic information and satisfies syntactic mode compatibility, and a sentence is generated from a predefined sentence template and this phrase. Finally, a Boosting model is built to select, from several candidate annotations, the sentence with the best BLEU-3 score to annotate each product image. Experiments show that the Boosting model outperforms all baselines.

Key words: Gradient kernel feature, N-gram model, Product image, Sentence annotation, Semantic correlation computing, Modified sequence
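
The final selection step relies on BLEU-3, so a small illustration of that metric may help. The sketch below is not the authors' Boosting model. It only shows, under stated assumptions, how candidate sentences could be scored with sentence-level BLEU (uniform weights over 1- to 3-grams, brevity penalty, no smoothing) and how the highest-scoring one could be picked. The function names `bleu3` and `pick_best`, the whitespace tokenization, and the example captions are all hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bleu3(candidate, references):
    """Sentence-level BLEU with uniform weights over 1-, 2- and 3-grams.

    Assumption: plain whitespace tokenization and no smoothing, so the
    score is 0.0 whenever any n-gram order has no match.
    """
    cand = candidate.lower().split()
    refs = [r.lower().split() for r in references]
    if not cand:
        return 0.0
    log_precision = 0.0
    for n in (1, 2, 3):
        cand_counts = Counter(ngrams(cand, n))
        if not cand_counts:
            return 0.0
        # Clip each candidate n-gram count by its maximum count in any reference.
        max_ref = Counter()
        for ref in refs:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        if clipped == 0:
            return 0.0
        log_precision += math.log(clipped / sum(cand_counts.values())) / 3
    # Brevity penalty against the reference whose length is closest.
    ref_len = min((abs(len(r) - len(cand)), len(r)) for r in refs)[1]
    bp = 1.0 if len(cand) > ref_len else math.exp(1 - ref_len / len(cand))
    return bp * math.exp(log_precision)


def pick_best(candidates, references):
    """Return the candidate sentence with the highest BLEU-3 score."""
    return max(candidates, key=lambda s: bleu3(s, references))


if __name__ == "__main__":
    # Hypothetical captions, for illustration only.
    references = ["a red leather handbag with a gold zipper"]
    candidates = [
        "red leather bag with a zipper",
        "a red leather handbag with zipper",
    ]
    print(pick_best(candidates, references))
```

Scoring against reference captions is only possible where such captions exist, for example on training or validation images; how the Boosting model uses these scores is not specified in the abstract and is not modeled here.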

