Computer Science ›› 2019, Vol. 46 ›› Issue (10): 49-54. doi: 10.11896/jsjkx.190100139

Special Issue: Database Technology

• Big Data & Data Science •

Study on Heterogeneous Multimodal Data Retrieval Based on Hash Algorithm

CHEN Feng, MENG Zu-qiang   

  1. (College of Computer and Electronics Information, Guangxi University, Nanning 530000, China)
  • Received: 2019-01-17 Revised: 2019-03-29 Online: 2019-10-15 Published: 2019-10-21

Abstract: The big data era has brought exponential growth in heterogeneous multimodal data on the Internet, including text, images, video and audio, and heterogeneous multimodal data retrieval has therefore become an active direction in big data research. Such retrieval faces two major challenges. The first is how to express the similarity between heterogeneous data across the "semantic gap". The second is how to achieve accurate and efficient retrieval over massive data. To address the problem that existing hash retrieval algorithms ignore the semantic similarity of heterogeneous multimodal data, this paper proposes a hash retrieval algorithm based on canonical correlation analysis and semantic consistency, named CCA-SCH. To keep semantic consistency within each modality, CCA-SCH generates separate semantic models for text and image data. To keep semantic consistency between modalities, it uses the CCA algorithm to fuse the semantics of text and image data and generate the maximum correlation matrix. In addition, the ℓ2,ρ norm is introduced to suppress the noise and redundant information in the original datasets, which makes the hash function more robust. Experimental results show that the mean average precision (MAP) of CCA-SCH is more than 10% higher than that of the benchmark algorithms on the experimental datasets, which demonstrates the better retrieval ability of the proposed algorithm.
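
The abstract reports its results as mean average precision (MAP) over hash-based retrieval. As a rough illustration only, and not the CCA-SCH algorithm itself, the sketch below shows how cross-modal retrieval by Hamming distance and the MAP metric are commonly computed once each item has a binary hash code. The function names, the toy 4-bit codes, and the relevance sets are hypothetical and do not come from the paper.

```python
from typing import List, Sequence, Set

# Hypothetical toy data: 4-bit hash codes stored as Python ints.
# "database" stands in for image codes, "queries" for text codes.
database = [0b0000, 0b0001, 0b1111, 0b1110]
queries = [0b0000, 0b1111]
# relevance[i] = indices of database items assumed relevant to queries[i].
relevance = [{0, 1}, {0, 2}]


def hamming(a: int, b: int) -> int:
    """Count the bits in which two hash codes differ."""
    return bin(a ^ b).count("1")


def rank_by_hamming(query: int, codes: Sequence[int]) -> List[int]:
    """Return database indices sorted by Hamming distance to the query.

    sorted() is stable, so ties keep their original database order.
    """
    return sorted(range(len(codes)), key=lambda i: hamming(query, codes[i]))


def average_precision(ranked: Sequence[int], relevant: Set[int]) -> float:
    """Average of precision@k over every rank k that holds a relevant item.

    The whole database is ranked, so every relevant item is eventually found
    and dividing by the number of hits equals dividing by len(relevant).
    """
    hits = 0
    total = 0.0
    for k, idx in enumerate(ranked, start=1):
        if idx in relevant:
            hits += 1
            total += hits / k
    return total / hits if hits else 0.0


def mean_average_precision(
    query_codes: Sequence[int],
    codes: Sequence[int],
    relevant_sets: Sequence[Set[int]],
) -> float:
    """Mean of the per-query average precision values."""
    aps = [
        average_precision(rank_by_hamming(q, codes), rel)
        for q, rel in zip(query_codes, relevant_sets)
    ]
    return sum(aps) / len(aps)


if __name__ == "__main__":
    print(rank_by_hamming(queries[1], database))
    print(mean_average_precision(queries, database, relevance))
```

Under this definition, a method scores a higher MAP when relevant items of the other modality sit closer in Hamming space to the query code, which is the property that semantic-consistency constraints aim to improve.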

Key words: Canonical correlation analysis algorithm, Hash function, Heterogeneous multimodal, Semantic consistency

CLC Number: 

  • TP391