Computer Science ›› 2013, Vol. 40 ›› Issue (7): 1-8.

Review on Image Content Representation Models

ZHANG Lin-bo, XIAO Bai-hua, WANG Feng and SHI Lei

  • Online: 2018-11-16 Published: 2018-11-16

Abstract: Content-based image representation has become one of the most popular problems in computer vision. To cope with object deformation, occlusion, scale variation, and background clutter, many strategies have been proposed and a large body of work has been produced. This paper reviews the classic works on content-based image representation, organized in the following order: codebook-based models, part-structure models, contour-fragment-based models, biological-cognition-related models, and context-related models. The advantages and limitations of each model are also discussed.

Key words: Image classification, Content-based, Representation model, Review
