Computer Science ›› 2019, Vol. 46 ›› Issue (3): 303-313. doi: 10.11896/j.issn.1002-137X.2019.03.045

• Graphics, Image and Pattern Recognition •

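Multitask Hierarchical Image Retrieval Technology Based on Faster RCNNH

HE Xia, TANG Yi-ping, WANG Li-ran, CHEN Peng, YUAN Gong-ping

  1. (School of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China)
  • Received: 2018-02-02 Revised: 2018-06-22 Online: 2019-03-15 Published: 2019-03-22
  • Corresponding author: TANG Yi-ping (born 1958), male, Ph.D., professor and doctoral supervisor. His main research interests include omnidirectional vision sensors and their applications, and computer vision. E-mail: typ@zjut.edu.cn
  • About the authors: HE Xia (born 1993), female, master's student; her main research interests include computer vision, image retrieval and deep learning. CHEN Peng (born 1993), male, master's student; his main research interests include computer vision, crowd density estimation and face recognition. WANG Li-ran (born 1992), female, master's student; her main research interests include computer vision, tongue segmentation and recognition, and deep learning. YUAN Gong-ping (born 1992), male, master's student; his main research interests include computer vision, and vehicle localization and recognition.
  • Supported by:
    National Natural Science Foundation of China (61070134, 61379078)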

Abstract: Existing search-by-image techniques have low levels of automation and intelligence, make little use of deep learning, struggle to return accurate results, consume large amounts of storage, and retrieve slowly, so they cannot meet the image retrieval demands of the big data era. To address these problems, this paper proposes a multitask hierarchical image retrieval method based on Faster RCNNH (Faster RCNN Hash). First, a selective retrieval network performs logistic regression on the feature map to obtain a probability vector for each region of interest in the image; a compact quantization network then encodes these vectors into a compact quantized hash code for the image. Second, a re-screening network extracts the region-aware semantic features with the strongest response in each region of interest. Then, for each region of interest, a fine retrieval strategy based on the quantized hash matrix compares images rapidly. Finally, the image most similar to the corresponding region of interest in the query image is selected. The proposed multitask learning method not only produces compact quantized hash codes and region-aware semantic features at the same time, but also effectively removes interference from the image background and from other objects. Experimental results show that the method can be trained end to end and automatically selects higher-quality region-of-interest features, which raises the level of automation and intelligence of large-scale image retrieval. Its retrieval accuracy (0.9478) and retrieval speed (0.306ks) are both significantly better than those of existing large-scale image retrieval techniques.

Key words: Large-scale image retrieval, Multitask deep learning, Region of interest, Hash code, Deep hash algorithm
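
The coarse-to-fine idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each image is already described by a binary hash code (stored here as a Python integer) and a real-valued feature vector, that the coarse stage filters by Hamming distance, and that the fine stage re-ranks by cosine similarity. The function names, the `max_hamming` radius and the `top_k` parameter are hypothetical choices made for illustration only.

```python
import math
from typing import List, Sequence, Tuple

# Hypothetical record layout: (image_id, binary hash code as int, feature vector).
Record = Tuple[str, int, Sequence[float]]


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def hierarchical_search(
    query_code: int,
    query_feat: Sequence[float],
    database: List[Record],
    max_hamming: int = 2,
    top_k: int = 3,
) -> List[str]:
    """Two-stage retrieval: cheap hash filtering, then feature re-ranking."""
    # Coarse stage (assumption): keep only images whose hash code lies
    # within a small Hamming radius of the query code.
    candidates = [
        (image_id, feat)
        for image_id, code, feat in database
        if hamming_distance(query_code, code) <= max_hamming
    ]
    # Fine stage (assumption): order the surviving candidates by cosine
    # similarity between their feature vector and the query feature vector.
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query_feat, item[1]),
        reverse=True,
    )
    return [image_id for image_id, _ in ranked[:top_k]]


if __name__ == "__main__":
    # Toy data invented for this example; it is not taken from the paper.
    toy_database: List[Record] = [
        ("img_a", 0b10110010, [0.9, 0.1, 0.0]),
        ("img_b", 0b10110011, [0.2, 0.8, 0.1]),
        ("img_c", 0b01001101, [0.9, 0.1, 0.0]),
        ("img_d", 0b10110110, [0.7, 0.3, 0.1]),
    ]
    print(hierarchical_search(0b10110010, [1.0, 0.0, 0.0], toy_database, top_k=2))
```

In this sketch the hash comparison discards most of the database cheaply, so the more expensive feature comparison runs only on a short candidate list. That is the general motivation for hierarchical retrieval, whatever distance measures a particular system actually uses.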

CLC Number: 

  • TP391.4