Computer Science, 2016, Vol. 43, Issue (5): 269-273. doi: 10.11896/j.issn.1002-137X.2016.05.051


Product Image Sentence Annotation Based on Gradient Kernel Feature and N-gram Model

ZHANG Hong-bin, JI Dong-hong, YIN Lan and REN Ya-feng   

  • Online:2018-12-01 Published:2018-12-01

Abstract: Sentence-level annotation of product images is proposed because sentences describe online products more accurately than single words. First, image features are learned. The gradient kernel feature is chosen because it achieves the best annotation performance: it describes the key visual characteristics of product images, such as shape and texture, better than the other features. It is therefore used for image classification and image retrieval. Second, several key words are summarized from the captions of the training images by semantic correlation computing. Third, an N-gram model builds from these key words a modified sequence that both carries rich semantic information and satisfies syntactic mode compatibility. A sentence is then generated from a predefined sentence template and the modified sequence. Finally, a Boosting model is designed to select the sentences with the best BLEU-3 scores to annotate the product images. Experiments show that the sentences generated by the Boosting model achieve state-of-the-art annotation performance.
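The third step (ordering key words with an N-gram model and filling a sentence template) can be illustrated with a minimal sketch. The abstract does not specify the order of the N-gram model, the smoothing, the search procedure, or the template, so everything below is an assumption for illustration only: a bigram model with add-one smoothing, exhaustive search over permutations of the key words, and a hypothetical template "This is a {}.". The function names and the toy captions are likewise invented here and are not taken from the paper.

```python
import itertools
import math
from collections import Counter


def train_bigrams(captions):
    """Count unigrams and bigrams over whitespace-tokenized captions."""
    unigrams, bigrams = Counter(), Counter()
    for caption in captions:
        tokens = ["<s>"] + caption.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def sequence_log_prob(words, unigrams, bigrams):
    """Add-one smoothed bigram log-probability of a word sequence."""
    vocab_size = len(unigrams)
    tokens = ["<s>"] + list(words)
    score = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = unigrams[prev] + vocab_size
        score += math.log(numerator / denominator)
    return score


def best_ordering(keywords, unigrams, bigrams):
    """Return the permutation of the key words with the highest score.

    Exhaustive search is only practical for a handful of key words.
    """
    return max(
        itertools.permutations(keywords),
        key=lambda seq: sequence_log_prob(seq, unigrams, bigrams),
    )


def fill_template(sequence, template="This is a {}."):
    """Insert the ordered key words into a (hypothetical) sentence template."""
    return template.format(" ".join(sequence))


# Toy training captions, invented for this example.
captions = ["red leather handbag", "black leather handbag", "red leather wallet"]
unigrams, bigrams = train_bigrams(captions)
ordered = best_ordering(["handbag", "red", "leather"], unigrams, bigrams)
print(fill_template(ordered))
```

In this sketch the unordered key words are rearranged into the word order most supported by the training captions before being placed in the template. Candidate sentences produced this way could then be ranked by a metric such as BLEU-3, which is the selection criterion the abstract names for the final step.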

Key words: Gradient kernel feature, N-gram model, Product image, Sentence annotation, Semantic correlation computing, Modified sequence

