Computer Science ›› 2018, Vol. 45 ›› Issue (6): 51-56.doi: 10.11896/j.issn.1002-137X.2018.06.009

• WISA2021 • Previous Articles     Next Articles

Subsequence Outsourcing Query Method over Encrypted Genomic Data

WANG Zhan-bing, SONG Wei, PENG Zhi-yong, YANG Xian-di, CUI Yi-hui, SHEN Yuan   

  1. School of Computer,Wuhan University,Wuhan 430070,China
  • Online:2018-06-15 Published:2018-07-24

Abstract: Precision medicine is a medical model that relies heavily on patient genome analysis.The subsequence search plays an important role in performing genome analysis.Recently,the amount of genomic data are increasing dramatically,and the storage cost and processing complexity of them have been far beyond the capacity of hospitals.So,utilizing the powerful cloud computing capability to analyze and process such massive genomic sequence data is becoming popular.Considering that cloud service provider is not completely trusted,encrypting genomic data before uploading is a straightforward and effective solution to guarantee the privacy and security of DNA sequence data.However,how to perform queries over the encrypted genomic sequence data becomes another difficult problem.To address this problem,this paper made a detailed survey on genomic data processing and full-text retrieval fields.It constructed indexes on fix-length windows of the genomic sequence using q-gram mapping,and performed queries in every window.If the query sequence is the prefix of any window in genomic sequence,the query hits.Throughout all the processes,cloud service provider stores indexes and performs subsequence query,without obtaining any privacy details.Moreover,this paper set up the system model and several security assumptions,and proved their security.Experiments were carried out to evaluate the performance of scheme on a public dataset.The results show that the proposed solution achieves better performance in time cost and storage cost,i.e.when w is 100 and q is 6,the building index algorithm costs 15.60s for sequence of 100000 length.Compared with GPSE,the proposed solution has higher execution efficiency in performing queries.

Key words: Ciphertext query, Full-text query, Precision medicine, Subsequence query

CLC Number: 

  • TP309.2
[1]WHEELER D A,SRINIVASAN M,EGHOLM M,et al.The complete genome of an individual by massively parallel DNA sequencing[J].Nature,2008,452(7189):872-876.
[2]WANG J Y,WANG B,YANG X C.Efficient compressed genomic data oriented query approach[J].Journal of Software,2016,27(7):1715-1728.(in Chinese)
王佳英,王斌,杨晓春.面向压缩生物基因数据的高效的查询方法[J].软件学报,2016,27(7):1715-1728.
[3]SCHNEEBERGER K,HAGMANN J,OSSOWSKI S,et al. Simultaneous alignment of short reads against multiple genomes[J].Genome biology,2009,10(9):98.
[4]FERRADA H,GAGIE T,HIRVOLA T,et al.Hybrid indexes for repetitive datasets[J].Philosophical Transactions of the Royal Society a Mathematical Physical and Engineering Scien-ces,2013,372(372):20130137.
[5]KOBORI Y,MIZUTA S.Similarity Estimation Between DNA Sequences Based on Local Pattern Histograms of Binary Images[J].Genomics,Proteomics & Bioinformatics,2016,14(2):103-112.
[6]CLAUDE F,FARINA A,MARTÍNEZ-PRIETO M A,et al. Compressed q-gram indexing for highly repetitive biological sequences[C]//IEEE International Conference on Bioinformatics and Bioengineering.2010:86-91.
[7]YANG X,WANG B,LI C,et al.Efficient direct search on compressed genomic data[C]//IEEE International Confernece on Data Engineering.2013:961-972.
[8]AYDAY E,HUBAUX J P.Privacy and Security in the Genomic Era[C]//Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security.ACM,2016:1863-1865.
[9]AZIZ A,MOMIN M,HASAN M Z,et al.Secure and efficient multiparty computation on genomic data[C]//Proceedings of the 20th International Database Engineering & Applications Symposium.ACM,2016:278-283.
[10]BANERJEE S S,ATHREYA A P,MAINZER L S,et al.Efficient and Scalable Workflows for Genomic Analyses[C]//Proceedings of the ACM International Workshop on Data-Intensive Distributed Computing.ACM,2016:27-36.
[11]CERI S,KAITOUA A,MASSEROLI M,et al.Data Management for Next Generation Genomic Computing[C]//EDBT.2016:485-490.
[12]FRIZZO-BARKER J,CHOW-WHITE P A,CHARTERS A,et al.Genomic Big Data and Privacy:Challenges and Opportunities for Precision Medicine[J].Computer Supported Cooperative Work (CSCW),2016,25(2-3):115-136.
[13]QIN Y,YALAMANCHILI H K,QIN J,et al.The current status and challenges in computational analysis of genomic big data[J].Big Data Research,2015,2(1):12-18.
[14]KANG S,AUNG K M M,VEERAVALLI B.Towards Secure and Fast Mapping of Genomic Sequences on Public Clouds[C]//Proceedings of the 4th ACM International Workshop on Security in Cloud Computing.ACM,2016:59-66.
[15]KOBORI Y,MIZUTA S.Similarity Estimation Between DNA Sequences Based on Local Pattern Histograms of Binary Images[J].Genomics,Proteomics & Bioinformatics,2016,14(2):103-112.
[16]CLAUDE F,FARINA A,MARTÍNEZ-PRIETO M A,et al. Compressed q-gram indexing for highly repetitive biological sequences[C]//IEEE International Conference on Bioinformatics and Bioengineering.2010:86-91.
[17]SONG W,WANG B,WANG Q,et al.A privacy-preserved full-text retrieval algorithm over encrypted data for cloud storage applications[J].Journal of Parallel and Distributed Computing,2017,99:14-27.
[18]WANG B,SONG W,LOU W,et al.Inverted index based multi-keyword public-key searchable encryption with strong privacy guarantee[C]//IEEE Conference on Computer Communiations.2015:2092-2100.
[19]WANG D,JIA X,WANG C,et al.Generalized pattern matching string search on encrypted data in cloud systems[C]//IEEE Conference on Computer Communiations.2015:2101-2109.
[1] LIU Feng, ZHANG Jia-hao, ZHOU Jun-jie, LI Mu, KONG De-li, YANG Jie, QI Jia-yin, ZHOU Ai-min. Novel Hash-time-lock-contract Based Cross-chain Token Swap Mechanism of Blockchain [J]. Computer Science, 2022, 49(1): 336-344.
[2] LIU Feng, WANG Yi-fan, YANG Jie, ZHOU Ai-min, QI Jia-yin. Blockchain-based High-threshold Signature Protocol Integrating DKG and BLS [J]. Computer Science, 2021, 48(11): 46-53.
[3] HU Teng, WANG Yan-ping, ZHANG Xiao-song, NIU Wei-na. Data and Behavior Analysis of Blockchain-based DApp [J]. Computer Science, 2021, 48(11): 116-123.
[4] DONG Xiao-mei, WANG Rui, ZOU Xin-kai. Survey on Privacy Protection Solutions for Recommended Applications [J]. Computer Science, 2021, 48(9): 21-35.
[5] LI Na-na, WANG Yong, ZHOU Lin, ZOU Chun-ming, TIAN Ying-jie, GUO Nai-wang. DDoS Attack Random Forest Detection Method Based on Secondary Screening of Feature Importance [J]. Computer Science, 2021, 48(6A): 464-467.
[6] ZHANG Yan-mei, LOU Yin-cheng. Deep Neural Network Based Ponzi Scheme Contract Detection Method [J]. Computer Science, 2021, 48(1): 273-279.
[7] CHENG Qing-feng, LI Yu-ting, LI Xing-hua, JIANG Qi. Research on Application of Cryptography Technology for Edge Computing Environment [J]. Computer Science, 2020, 47(11): 10-18.
[8] LI Yan-bin, LIU Yu, LI Mu-zhou, WU Ren-tao, WANG Peng-da. Participant-adaptive Variant of MASCOT [J]. Computer Science, 2020, 47(11A): 380-387.
[9] BAI Li-fang, ZHU Yue-fei, LU Bin. Research and Development of Data Storage Security Audit in Cloud [J]. Computer Science, 2020, 47(10): 290-300.
[10] REN Shuai, WANG Meng, FAN Ao-xiong, GAO Ze, XU Jie, Shahzad KHURRAM, ZHANG Tao. Zero-high-resolution Information Hiding Algorithm for 3D Mesh Model [J]. Computer Science, 2020, 47(7): 328-334.
[11] FENG Tao, JIAO Ying, FANG Jun-li, TIAN Ye. Medical Health Data Security Model Based on Alliance Blockchain [J]. Computer Science, 2020, 47(4): 305-311.
[12] CHEN Jia,OUYANG Jin-yuan,FENG An-qi,WU Yuan,QIAN Li-ping. DoS Anomaly Detection Based on Isolation Forest Algorithm Under Edge Computing Framework [J]. Computer Science, 2020, 47(2): 287-293.
[13] LI Zhao-can, WANG Li-ming, GE Si-jiang, MA Duo-he, QIN Bo. Big Data Plain Text Watermarking Based on Orthogonal Coding [J]. Computer Science, 2019, 46(12): 148-154.
[14] LI Sen-you, JI Xin-sheng, YOU Wei, ZHAO Xing. Hierarchical Control Strategy for Data Querying Based on Differential Privacy [J]. Computer Science, 2019, 46(11): 130-136.
[15] WANG Tong, MA Wen-ping, LUO Wei. Information Sharing and Secure Multi-party Computing Model Based on Blockchain [J]. Computer Science, 2019, 46(9): 162-168.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!