Computer Science, 2020, Vol. 47, Issue (11): 199-204. doi: 10.11896/jsjkx.190800145

• Computer Graphics & Multimedia •

Sketch-based Image Retrieval Based on Attention Model

LI Zong-min¹, LI Si-yuan¹, LIU Yu-jie¹, LI Hua²

  ¹ College of Computer & Communication Engineering, China University of Petroleum, Qingdao, Shandong 266580, China
  ² Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • Received: 2019-08-28  Revised: 2019-12-16  Online: 2020-11-15  Published: 2020-11-05
  • About author: LI Zong-min, born in 1965, Ph.D., professor, Ph.D. supervisor, is a member of China Computer Federation. His main research interests include computer graphics, image processing, and scientific computing visualization.
    LI Si-yuan, born in 1996, postgraduate. His main research interests include computer vision, image processing, image retrieval, and sketch image recognition.
  • Supported by:
    This work was supported by the National Natural Science Foundation of China (61379106, 61379082, 61227802) and the Natural Science Foundation of Shandong Province (ZR2013FM036, ZR2015FM011).

Abstract: To address the sparse features and geometric distortion of hand-drawn images in sketch-based image retrieval (SBIR), this paper proposes a new feature extraction method based on an attention model. By accurately extracting the semantic features of hand-drawn images, retrieval results can be obtained efficiently and accurately. First, a convolutional neural network is used as the basic framework for extracting semantic features and is trained in a supervised manner. Next, an attention mechanism is introduced to locate effective semantic features: an attention block, composed of a spatial attention structure and a channel attention structure, is added after the last convolutional layer of the network. Finally, the semantic features of different layers are fused to form the final feature descriptor, with the aim of high retrieval accuracy. Experimental results on the benchmark Flickr15k dataset demonstrate the feasibility and effectiveness of the proposed method. In addition, the proposed attention model can greatly improve classification accuracy in the sketch classification task.
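The pipeline described in the abstract (channel and spatial attention applied to the last convolutional feature map, then fusion of features from several layers into one descriptor) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the internal design of the attention block, so the learned layers are omitted and each attention weight is simply a sigmoid of an average-pooled value. The channel-then-spatial order, the global average pooling, the L2 normalization, and all function names below are assumptions made for illustration. Feature maps are plain nested lists indexed as [channel][row][column].

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Rescale each channel by sigmoid(global average of that channel).

    Assumption: a real block would pass the pooled vector through
    learned layers; here the pooled value is used directly.
    """
    out = []
    for ch in fmap:
        avg = sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        weight = sigmoid(avg)
        out.append([[v * weight for v in row] for row in ch])
    return out


def spatial_attention(fmap):
    """Rescale each location by sigmoid(mean over channels there).

    Assumption: a real block would apply a learned convolution to the
    pooled map before the sigmoid.
    """
    c, h, w = len(fmap), len(fmap[0]), len(fmap[0][0])
    mask = [[sigmoid(sum(fmap[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    return [[[fmap[k][i][j] * mask[i][j] for j in range(w)]
             for i in range(h)] for k in range(c)]


def attention_block(fmap):
    """Channel attention followed by spatial attention (assumed order)."""
    return spatial_attention(channel_attention(fmap))


def global_avg_pool(fmap):
    """Collapse each channel to its mean, giving one value per channel."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in fmap]


def l2_normalize(vec):
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def fused_descriptor(feature_maps):
    """Concatenate L2-normalized pooled features from several layers."""
    desc = []
    for fmap in feature_maps:
        desc.extend(l2_normalize(global_avg_pool(fmap)))
    return desc


if __name__ == "__main__":
    # Hypothetical 2-channel, 2x2 feature maps standing in for CNN outputs.
    mid_layer = [[[1.0, 2.0], [3.0, 4.0]], [[0.5, 0.5], [0.5, 0.5]]]
    last_layer = [[[0.2, 0.8], [0.1, 0.9]], [[1.0, 0.0], [0.0, 1.0]]]
    descriptor = fused_descriptor([mid_layer, attention_block(last_layer)])
    print(len(descriptor))
```

In a retrieval setting, descriptors built this way for the query sketch and for each database image would be compared with a distance measure and the images ranked by that distance; the abstract does not state which measure the paper uses.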

Key words: Attention model, Convolutional neural network, Sketch classification, Sketch-based image retrieval

CLC Number: TP391.41