计算机科学 ›› 2021, Vol. 48 ›› Issue (2): 148-152.doi: 10.11896/jsjkx.191200104
王晓飞1, 周超2, 刘利刚1
WANG Xiao-fei1, ZHOU Chao2, LIU Li-gang1
摘要: 在图像搜索的场景中,由于搜索请求的随机性,为了提高搜索速度,搜索算法运行时需要把整个数据集预先载入到运行内存。由于运行内存价格远高于同容量的硬盘价格,降低运行内存自然可以大大降低图像搜索服务的成本,但如果直接对数据进行压缩,往往会极大地损失搜索精度。在这种情况下,文中提出了一种基于图像内容特征的分块式图像搜索框架。先利用神经网络的方法来预先提取图片特征,在不对特征进行量化压缩的前提下,采用一种启发式的聚类方法对数据进行分块,同时保证每个数据块的数据之间有一定的相似性。对于每个数据块,采用基于图结构的HNSW算法来构建索引子图以加速图片查询。在该框架下,通过控制查询时访问的数据块的个数,可以在保证精度的前提下大大减少算法所需要的运行内存容量。
中图分类号:
[1] COX I J,MILLER M L,MINKA T P,et al.The Bayesian image retrieval system,PicHunter:theory,implementation,and psychophysical experiments[J].IEEE Transactions on Image Processing,2002,9(1):20-37. [2] KIRANYAZ S,GABBOUJ M.Hierarchical CellularTree:AnEfficient Indexing Scheme for Content-Based Retrieval on Multimedia Databases[J].IEEE Transactions on Multimedia,2007,9(1):102-119. [3] BABENKO A,LEMPITSKY V.Aggregating Deep Convolutional Features for Image Retrieval[J].arXiv:1510.07493v1,2015. [4] BABENKO A,SLESAREV A,CHIGORIN A,et al.Neuralcodes for image retrieval[C]//European Conference on Computer Vision.Cham:Springer,2014:584-599. [5] JEGOU H,DOUZE M,SCHMID C,et al.Aggregating local descriptors into a compact image representation[C]//Computer Vision and Pattern Recognition (CVPR),2010 IEEE Conference on.IEEE,2010. [6] PERRONNIN F,LIU Y,JORGE S,et al.Large-scale image retrieval with compressed Fisher vectors[C]//CVPR.IEEE,2010. [7] HERVÉ J,ANDREW Z.Triangulation Embedding and Democratic Aggregation for Image Search[C]//In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR'14).IEEE Computer Society,Washington,DC,USA,3310-3317. [8] SYED S H,MIROSLAW B.Improving Large-Scale Image Retrieval Through Robust Aggregation of Local Descriptors[J].IEEE Transactions on Pattern Analysis & Machine Intelligence,2017,39(9):1783-1796. [9] DO T T,TAN D K L,PHAM T T,et al.Simultaneous Feature Aggregating and Hashing for Large-Scale Image Search[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).IEEE,2017. [10] BENTLEY J L.Multidimensional binary search trees used for associative searching[J].Com. of ACM,1975,18(9):509-517. [11] FINKEL R A,BENTLEY J L.Quad trees:a data structure for retrieval on composite keys[J].Acta Inf.1974,4(1):1-9. [12] LEE D T,WONG C K.Worst-case analysis for region and partial region searches in multidimensional binary search trees and balanced quad trees[J].Acta Inf.1977,9(1):23-29. [13] MALKOV Y,PONOMARENKO A,LOGVINOVA,et al.Ap-proximate nearest neighbor algorithm based on navigable small world graphs[J].Information Systems,2014,45:61-68. [14] MALKOV Y A,YASHUNIN D A.Efficient and robust approximate nearest neighbor search using hierarchical navigable small world graphs[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2016,PP(99). [15] MATTHIJS D,ALEXANDRE S,HERVE J.Link and Code:Fast Indexing with Graphs and Compact Regression Codes[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).IEEE,2018. [16] MARTIN A,BERNHARDSSON E,FAITHFULL A.ANN-Benchmarks:A Benchmarking Tool for Approximate Nearest Neighbor Algorithms[C]//International Conference on Simila-rity Search and Applications.2017. [17] MACQUEEN J.Some Methods for Classification and Analysis of Multi Variate Observations[C]//Proc.of Berkeley Sympo-sium on Mathematical Statistics & Probability.1965. [18] LI Y,CHUNG S M.Parallel bisecting k-means with prediction clusteringalgorithm[J].Journal of Supercomputing,2007,39(1):19-37. [19] MCCALLUM A,NIGAM K,UNGAR L H.Efficient clustering of high-dimensional data sets with application to reference ma-tching[C]//Proceedings of the Sixth International Conference on Knowledge Discovery and Data Mining.DBLP,2000. [20] ARTHUR D,VASSILVITSKII S.k-means++:the advantages of careful seeding[C]//Eighteenth Acm-Siam Symposium on Discrete Algorithms.Society for Industrial and Applied Mathematics.2007:1027-1035. [21] SIMONYAN K,ZISSERMAN A.Very deep convolutional networks for large-scale image recognition[J].arXiv:1409.1556,2014. [22] JIA Y,SHELHAMER E,DONAHUE J,et al.Caffe:Convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM International Conference on Multimedia.ACM,2014:675-678. |
[1] | 鲁晨阳, 邓苏, 马武彬, 吴亚辉, 周浩浩. 基于分层抽样优化的面向异构客户端的联邦学习 Federated Learning Based on Stratified Sampling Optimization for Heterogeneous Clients 计算机科学, 2022, 49(9): 183-193. https://doi.org/10.11896/jsjkx.220500263 |
[2] | 柴慧敏, 张勇, 方敏. 基于特征相似度聚类的空中目标分群方法 Aerial Target Grouping Method Based on Feature Similarity Clustering 计算机科学, 2022, 49(9): 70-75. https://doi.org/10.11896/jsjkx.210800203 |
[3] | 鲁晨阳, 邓苏, 马武彬, 吴亚辉, 周浩浩. 基于DBSCAN聚类的集群联邦学习方法 Clustered Federated Learning Methods Based on DBSCAN Clustering 计算机科学, 2022, 49(6A): 232-237. https://doi.org/10.11896/jsjkx.211100059 |
[4] | 郁舒昊, 周辉, 叶春杨, 王太正. SDFA:基于多特征融合的船舶轨迹聚类方法研究 SDFA:Study on Ship Trajectory Clustering Method Based on Multi-feature Fusion 计算机科学, 2022, 49(6A): 256-260. https://doi.org/10.11896/jsjkx.211100253 |
[5] | 毛森林, 夏镇, 耿新宇, 陈剑辉, 蒋宏霞. 基于密度敏感距离和模糊划分的改进FCM算法 FCM Algorithm Based on Density Sensitive Distance and Fuzzy Partition 计算机科学, 2022, 49(6A): 285-290. https://doi.org/10.11896/jsjkx.210700042 |
[6] | 陈景年. 一种适于多分类问题的支持向量机加速方法 Acceleration of SVM for Multi-class Classification 计算机科学, 2022, 49(6A): 297-300. https://doi.org/10.11896/jsjkx.210400149 |
[7] | 刘丽, 李仁发. 医疗CPS协作网络控制策略优化 Control Strategy Optimization of Medical CPS Cooperative Network 计算机科学, 2022, 49(6A): 39-43. https://doi.org/10.11896/jsjkx.210300230 |
[8] | 陈佳舟, 赵熠波, 徐阳辉, 马骥, 金灵枫, 秦绪佳. 三维城市场景中的小物体检测 Small Object Detection in 3D Urban Scenes 计算机科学, 2022, 49(6): 238-244. https://doi.org/10.11896/jsjkx.210400174 |
[9] | 邢云冰, 龙广玉, 胡春雨, 忽丽莎. 基于SVM的类别增量人体活动识别方法 Human Activity Recognition Method Based on Class Increment SVM 计算机科学, 2022, 49(5): 78-83. https://doi.org/10.11896/jsjkx.210400024 |
[10] | 朱哲清, 耿海军, 钱宇华. 面向化学结构的线段聚类算法 Line-Segment Clustering Algorithm for Chemical Structure 计算机科学, 2022, 49(5): 113-119. https://doi.org/10.11896/jsjkx.210700131 |
[11] | 张宇姣, 黄锐, 张福泉, 隋栋, 张虎. 基于菌群优化的近邻传播聚类算法研究 Study on Affinity Propagation Clustering Algorithm Based on Bacterial Flora Optimization 计算机科学, 2022, 49(5): 165-169. https://doi.org/10.11896/jsjkx.210800218 |
[12] | 左园林, 龚月姣, 陈伟能. 成本受限条件下的社交网络影响最大化方法 Budget-aware Influence Maximization in Social Networks 计算机科学, 2022, 49(4): 100-109. https://doi.org/10.11896/jsjkx.210300228 |
[13] | 韩洁, 陈俊芬, 李艳, 湛泽聪. 基于自注意力的自监督深度聚类算法 Self-supervised Deep Clustering Algorithm Based on Self-attention 计算机科学, 2022, 49(3): 134-143. https://doi.org/10.11896/jsjkx.210100001 |
[14] | 杨旭华, 王磊, 叶蕾, 张端, 周艳波, 龙海霞. 基于节点相似性和网络嵌入的复杂网络社区发现算法 Complex Network Community Detection Algorithm Based on Node Similarity and Network Embedding 计算机科学, 2022, 49(3): 121-128. https://doi.org/10.11896/jsjkx.210200009 |
[15] | 蒲实, 赵卫东. 一种面向动态科研网络的社区检测算法 Community Detection Algorithm for Dynamic Academic Network 计算机科学, 2022, 49(1): 89-94. https://doi.org/10.11896/jsjkx.210100023 |
|