Computer Science, 2019, Vol. 46, Issue (10): 49-54. doi: 10.11896/jsjkx.190100139

• Big Data & Data Science •

Study on Heterogeneous Multimodal Data Retrieval Based on Hash Algorithm

CHEN Feng, MENG Zu-qiang   

  1. (College of Computer and Electronics Information, Guangxi University, Nanning 530000, China)
  • Received: 2019-01-17  Revised: 2019-03-29  Online: 2019-10-15  Published: 2019-10-21

Abstract: The big data era has brought exponential growth in heterogeneous multimodal data on the Internet, including text, images, video and audio. As a result, heterogeneous multimodal data retrieval has become an active direction in big data research. It faces two major challenges. The first is how to express the similarity between heterogeneous data across the “semantic gap”. The second is how to achieve accurate and efficient retrieval over massive data. To address the problem that existing hash retrieval algorithms ignore the semantic similarity of heterogeneous multimodal data, this paper proposes a hash retrieval algorithm based on canonical correlation analysis and semantic consistency, named CCA-SCH. To preserve semantic consistency within each modality, CCA-SCH generates separate semantic models for text and image data. To preserve semantic consistency between modalities, it uses the CCA algorithm to fuse the semantics of text and image data and generate the maximum correlation matrix. In addition, the ℓ2,ρ norm is introduced to suppress noise and redundant information in the original datasets, which makes the hash function more robust. Experimental results show that the mean average precision (MAP) of CCA-SCH is more than 10% higher than that of the benchmark algorithms on the experimental datasets, indicating better retrieval performance.
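
The general idea of CCA-based cross-modal hashing and MAP evaluation can be illustrated with a minimal sketch. This is not the CCA-SCH algorithm itself: the abstract gives no formulas, so the sketch uses plain linear CCA with sign thresholding and omits both the per-modality semantic models and the ℓ2,ρ norm term. All function names, the regularization value `reg`, and the use of NumPy are assumptions made for illustration.

```python
import numpy as np


def fit_cca(X, Y, n_bits, reg=1e-3):
    """Fit linear CCA on paired samples X (n x dx) and Y (n x dy).

    Returns one (mean, projection) pair per modality. n_bits must not
    exceed min(dx, dy). reg is an assumed ridge term for stability.
    """
    mx, my = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - mx, Y - my
    n = X.shape[0]
    Cxx = Xc.T @ Xc / n + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / n + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / n

    def inv_sqrt(C):
        # Inverse matrix square root of a symmetric positive definite matrix.
        w, V = np.linalg.eigh(C)
        return V @ np.diag(w ** -0.5) @ V.T

    Kx, Ky = inv_sqrt(Cxx), inv_sqrt(Cyy)
    # Singular vectors of the whitened cross-covariance give the
    # canonical directions, ordered by decreasing correlation.
    U, _, Vt = np.linalg.svd(Kx @ Cxy @ Ky)
    Wx = Kx @ U[:, :n_bits]
    Wy = Ky @ Vt.T[:, :n_bits]
    return (mx, Wx), (my, Wy)


def hash_codes(X, model):
    """Project into the shared space and threshold at zero to get bits."""
    mean, W = model
    return ((X - mean) @ W > 0).astype(np.uint8)


def hamming_rank(query_code, db_codes):
    """Return database indices sorted by Hamming distance to the query."""
    dists = np.count_nonzero(db_codes != query_code, axis=1)
    return np.argsort(dists, kind="stable")


def average_precision(ranked_relevance):
    """Average precision for one query from a ranked 0/1 relevance list."""
    rel = np.asarray(ranked_relevance, dtype=float)
    if rel.sum() == 0:
        return 0.0
    hits = np.cumsum(rel)
    precision_at_k = hits / (np.arange(len(rel)) + 1)
    return float((precision_at_k * rel).sum() / rel.sum())
```

In a text-to-image query under these assumptions, the text feature would be encoded with `hash_codes` using the text model, the image database would be ranked with `hamming_rank`, and MAP would be the mean of `average_precision` over all queries.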

Key words: Hash function, Semantic consistency, Canonical correlation analysis algorithm, Heterogeneous multimodal

CLC Number: TP391