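Abstract: Designing a suitable image representation is one of the most important problems in computer vision. The bag-of-features (BoF) representation is very popular and has been widely applied to image classification, object recognition, image retrieval, robot localization, and texture recognition. BoF represents an image as an orderless collection of features. Although this approach lacks structural and spatial information, it is conceptually simple and computationally cheap, and in some applications its results are even comparable to those of the current best methods. This paper examines the BoF model closely, focusing on the typical techniques involved in its three stages: local feature extraction, feature quantization and coding, and feature pooling. Finally, based on an analysis of the various research approaches, it summarizes the open problems in current research and possible directions for future work.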
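To make the three stages concrete, the following is a minimal sketch of a BoF pipeline, not the specific techniques the paper surveys. It assumes raw grid patches as local features, a plain k-means codebook, hard-assignment coding, and sum pooling; all function names (`extract_patches`, `learn_codebook`, `quantize`, `pool`, `bof_vector`) and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def extract_patches(image, size=4, stride=4):
    """Stage 1 (assumed variant): dense local features, here flattened raw patches on a grid."""
    h, w = image.shape
    feats = [image[y:y + size, x:x + size].ravel()
             for y in range(0, h - size + 1, stride)
             for x in range(0, w - size + 1, stride)]
    return np.asarray(feats, dtype=float)

def quantize(features, centers):
    """Stage 2 (assumed variant): hard-assignment coding, index of the nearest codeword."""
    d2 = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def learn_codebook(features, k, iters=10, seed=0):
    """Build a visual vocabulary with a few rounds of plain k-means."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = quantize(features, centers)
        for j in range(k):
            members = features[labels == j]
            if len(members):  # an empty cluster keeps its previous center
                centers[j] = members.mean(axis=0)
    return centers

def pool(labels, k):
    """Stage 3 (assumed variant): sum pooling into an L1-normalised histogram."""
    hist = np.bincount(labels, minlength=k).astype(float)
    return hist / hist.sum()

def bof_vector(image, centers):
    """Run the three stages and return one fixed-length vector per image."""
    feats = extract_patches(image)
    return pool(quantize(feats, centers), len(centers))

# Toy usage on a random "image"; a real codebook would be learned from many images.
rng = np.random.default_rng(0)
image = rng.random((32, 32))
codebook = learn_codebook(extract_patches(image), k=8)
vec = bof_vector(image, codebook)
```

Because pooling only counts codeword occurrences, permuting the local features leaves `vec` unchanged, which is the "orderless" property the abstract describes and also why spatial information is lost.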