Computer Science, 2020, Vol. 47, Issue (9): 318-323. doi: 10.11896/jsjkx.190800139

• Information Security •

Multi-keyword Semantic Search Scheme for Encrypted Cloud Data

LI Yan, SHEN De-rong, NIE Tie-zheng, KOU Yue

  1. College of Computer Science and Engineering, Northeastern University, Shenyang 110169, China
  • Received: 2019-08-28 Published: 2020-09-10
  • Corresponding author: SHEN De-rong (shendr@mail.neu.edu.cn)
  • About author: nanou1995@163.com
    LI Yan, born in 1995, postgraduate. His main research interests include semantic search and query processing.
    SHEN De-rong, born in 1964, professor, Ph.D. supervisor, is a senior member of China Computer Federation. Her research interests include Web data processing and distributed databases.
  • Supported by:
    National Natural Science Foundation of China (61672142, U1811261), National Key R&D Program of China (2018YFB1003404), and Fundamental Research Funds for the Central Universities (N171606005).

Abstract: Because cloud services are flexible, versatile, and low-cost, outsourcing data management to cloud servers has become increasingly common. Cloud servers are not fully trusted, however, so outsourcing encrypted data to them while still supporting search over the ciphertext has become an active research topic. Encryption protects data privacy well, but it also hides the semantic information of the data and makes search harder. This paper proposes a secure semantic search scheme that supports multiple keywords over encrypted cloud data. The core idea is to use a topic model to obtain the topic vector of each document and the word distribution vector of each topic, and then to generate a query vector by computing the semantic similarity between the query keywords and each topic, so that the similarity between the query vector and the document topic vectors can be evaluated in the same vector space. A method based on Earth Mover's Distance (EMD) combined with word embeddings is proposed for computing the similarity between the query and each topic, which improves the accuracy of the semantic similarity between query keywords and topics. To support efficient semantic search, a topic vector index tree is constructed and a "greedy search" algorithm is used to optimize keyword search. Theoretical analysis and experimental results show that the proposed scheme achieves secure multi-keyword ranked semantic search and greatly improves search efficiency.

Key words: Cloud computing, Searchable encryption, Privacy protection, Query processing, Semantic search
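
The query-vector idea in the abstract can be illustrated with a minimal plaintext sketch. It is not the authors' implementation: encryption, the topic vector index tree, and the greedy search are all omitted. In place of exact EMD it uses a relaxed lower bound, in which each word sends all of its mass to its nearest counterpart. The toy embeddings, topic distributions, and document topic vectors are hypothetical, as are the choices of 1 / (1 + distance) as the similarity and cosine similarity for ranking.

```python
import math

# Hypothetical 2-D word embeddings; a real system would load trained vectors.
EMBEDDINGS = {
    "cloud": (1.0, 0.0),
    "server": (0.9, 0.1),
    "privacy": (0.0, 1.0),
    "encryption": (0.1, 0.9),
}

# Hypothetical topic-word distributions, standing in for topic model output.
TOPICS = [
    {"cloud": 0.6, "server": 0.4},
    {"privacy": 0.5, "encryption": 0.5},
]

# Hypothetical document topic vectors (one weight per topic).
DOCS = {"doc_a": [0.9, 0.1], "doc_b": [0.2, 0.8]}


def relaxed_emd(query_words, topic):
    """Lower bound on the EMD between a uniform query and a topic."""
    q_weight = 1.0 / len(query_words)
    # Relaxation 1: each query keyword ships its mass to the nearest topic word.
    cost_q = sum(
        q_weight * min(math.dist(EMBEDDINGS[q], EMBEDDINGS[w]) for w in topic)
        for q in query_words
    )
    # Relaxation 2: each topic word ships its mass to the nearest query keyword.
    cost_t = sum(
        p * min(math.dist(EMBEDDINGS[w], EMBEDDINGS[q]) for q in query_words)
        for w, p in topic.items()
    )
    # Both relaxations bound the exact EMD from below; keep the tighter one.
    return max(cost_q, cost_t)


def query_vector(query_words, topics):
    """One similarity score per topic; 1 / (1 + distance) is an assumed mapping."""
    return [1.0 / (1.0 + relaxed_emd(query_words, t)) for t in topics]


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_documents(qvec, docs):
    """Sort documents by cosine similarity to the query vector, best first."""
    scored = ((name, cosine(qvec, vec)) for name, vec in docs.items())
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    qvec = query_vector(["cloud"], TOPICS)
    for name, score in rank_documents(qvec, DOCS):
        print(name, round(score, 3))
```

Because the query vector and the document topic vectors share the same topic dimensions, a document can rank highly even when it does not contain the query keyword itself, which is the property the scheme relies on.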

CLC Number: TP391