Computer Science ›› 2024, Vol. 51 ›› Issue (8): 387-395.doi: 10.11896/jsjkx.230800177

• Information Security • Previous Articles     Next Articles

Privacy-preserving Principal Component Analysis Based on Homomorphic Encryption

ZHANG Jindou, CHEN Jingwei, WU Wenyuan, FENG Yong   

  1. Chongqing School,University of Chinese Academy of Sciences,Chongqing 400714,China
  • Received:2023-08-28 Revised:2024-05-31 Online:2024-08-15 Published:2024-08-13
  • About author:ZHANG Jindou,born in 1997,postgra-duate.His main research interests include homomorphic encryption and information security.
    CHEN Jingwei,born in 1984,Ph.D,professor.His main research interests include error-free numerical computation and lattice-based cryptography.
  • Supported by:
    Ministry of Science and Technology Key Research and Development Programs(2020YFA0712300),Natural Science Foundation of Chongqing(cstc2021jcyj-msxmX0821,CSTB2023NSCQ-MSX0441,cstc2021yszx-jcyjX0004,2022YSZX-JCX0011CSTB,CSTB2023YSZX-JCX0008) and Western Light of the West Young Scholars program of the Chinese Academy of Sciences.

Abstract: In real life,data is not interconnected between different industries,or even between different departments within the same industry.With the improvement of computer computing power,it is not computing power but data volume that restricts the effectiveness of model training.Therefore,in order to obtain a better algorithm model,relying solely on one party’s data is not enough.It needs the participation of two or more parties,which requires privacy protection for all parties.In addition,as data collection becomes more detailed,the data dimension also increases.For high dimension data,dimension reduction is an indispensable step.And in terms of dimension reduction,principal component analysis(PCA) is a commonly used method.Homomorphic encryption is a solution when two parties want to collaborate on privacy protection data dimension reduction.Homomorphic encryption can compute encrypted data while protecting data privacy,and can be used to compute the PCA on encrypted data.In this paper,a two party encrypted data PCA scheme is designed using the CKKS homomorphic encryption scheme and the power method for dominant eigenvectors,achieving the goal of dimension reduction while protecting the privacy of both parties’ data.By improving the traditional power method iteration steps,the expensive homomorphic ciphertext division is avoided,allowing for more iterations with small encryption parameters,thereby reducing the computing time and improving the accuracy of the computed results.Through testing on public datasets and comparing it with some existing schemes,the scheme reduces the computational time by about 80%,and reduces the mean squared error to within 1% compared to the plaintext computation results.

Key words: Homomorphic encryption, Privacy preserving, Principal component analysis, Singular value decomposition, Power method

CLC Number: 

  • TP309.7
[1]TRAUTMAN L J,ORMEROD P C.Corporate directors’ and officers’ cybersecurity standard of care:The Yahoo data breach [J].AM UL Rev,2016,66:1231.
[2]PAULINA O.Russian Web hosting provider exposes data ofmore than 54M users [EB/OL].[2024-02-21].https://cybernews.com/security/web-hosting-ucoz-uid-data-leak.
[3]KRISHI C.Microsoft Azure Hit With The Largest Data Breach In Its History;Hundreds Of Executive Accounts Compromised [EB/OL].[2024-02-21].https://techreport.com/news/microsoft-azure-hit-with-the-largest-data-breach-in-its-history-hun-dreds-of-executive-accounts-compromised.
[4]NATALIE C.Bank of america customers left in the dark about data breach for 90 days [EB/OL].[2024-02-15].https://www.forbes.com/advisor/personal-finance/data-breach-affects-bank-of-america-customers.
[5]VOIGT P,VON DEM BUSSCHE A.The eu general data protection regulation(gdpr)[M].Springer,2017.
[6]BARRETT C.Are the EU GDPR and the California CCPA becoming the de facto global standards for data privacy and protection? [J].Scitech Lawyer,2019,15(3):24-29.
[7]PARASOL M.The impact of China’s 2016 Cyber Security Law on foreign technology firms,and on China’s big data and Smart City dreams [J].Computer law & security review,2018,34(1):67-98.
[8]LI F H,LI H,JIA Y,et al.Research Scope and Development Trends in Privacy Computing [J].Journal on Communications,2016,37(4):4.
[9]LI T,SAHU A K,TALWALKAR A,et al.Federated learning:Challenges,methods,and future directions [J].IEEE Signal Processing Magazine,2020,37(3):50-60.
[10]SABT M,ACHEMLAL M,BOUABDALLAH A.Trusted execution environment:What it is,and what it is not[C]//2015 IEEE Trustcom/BigDataSE/Ispa.IEEE,2015,1:57-64.
[11]YAO A C.Protocols for secure computations[C]//23rd Annual Symposium on Foundations of Computer Science(SFCS 1982).IEEE,1982:160-164.
[12]DURGA PRASAD K,ADI NARAYANA REDDY K,VA-SUMATHI D.Privacy-Preserving Naive Bayesian Classifier for Continuous Data and Discrete Data[C]//First International Conference on Artificial Intelligence and Cognitive Computing(AICC 2018).Springer Singapore,2019:289-299.
[13]GILAD-BACHRACH R,DOWLIN N,LAINE K,et al.Cryptonets:Applying neural networks to encrypted data with high throughput and accuracy[C]//International Conference on Machine Learning.PMLR,2016:201-210.
[14]HESAMIFARD E,TAKABI H,GHASEMI M.Cryptodl:Deep neural networks over encrypted data [J].arXiv:1711.05189,2017.
[15]SANYAL A,KUSNER M,GASCON A,et al.TAPAS:Tricks to accelerate(encrypted) prediction as a service[C]//International Conference on Machine Learning.PMLR,2018:4490-4499.
[16]BOURSE F,MINELLI M,MINIHOLD M,et al.Fast homomorphic evaluation of deep discretized neural networks[C]//Advances in Cryptology(CRYPTO 2018):38th Annual International Cryptology Conference.Santa Barbara,CA,USA,Part III 38.Springer International Publishing,2018:483-512.
[17]LU W,KAWASAKI S,SAKUMA J.Using fully homomorphic encryption for statistical analysis of categorical,ordinal and numerical data[J].Cryptology ePrint Archive,2016.
[18]RATHEE D,MISHRA P K,YASUDA M.Faster PCA andlinear regression through hypercubes in HElib[C]//Proceedings of the 2018 Workshop on Privacy in the Electronic Society.2018:42-53.
[19]BRAKERSKI Z,GENTRY C,VAIKUNTANATHAN V.(Le-veled) fully homomorphic encryption without bootstrapping [J].ACM Transactions on Computation Theory(TOCT),2014,6(3):1-36.
[20]CHEON J H,KIM A,KIM M,et al.Homomorphic encryption for arithmetic of approximate numbers[C]//Advances in Cryptology(ASIACRYPT 2017):23rd International Conference on the Theory and Applications of Cryptology and Information Security,Hong Kong,China,Part I 23.Springer International Publishing,2017:409-437.
[21]PANDA S.Principal component analysis using CKKs homomorphic scheme[C]//Cyber Security Cryptography and Machine Learning:5th International Symposium(CSCML 2021).Be’er Sheva,Israel,Springer International Publishing,2021:52-70.
[22]RIVEST R L,ADLEMAN L,DERTOUZOS M L.On databanks and privacy homomorphisms [J].Foundations of Secure Computation,1978,4(11):169-180.
[23]RIVEST R L,SHAMIR A,ADLEMAN L.A method for obtaining digital signatures and public-key cryptosystems [J].Communications of the ACM,1978,21(2):120-126.
[24]ELGAMAL T.A public key cryptosystem and a signaturescheme based on discrete logarithms [J].IEEE Transactions on Information Theory,1985,31(4):469-472.
[25]PAILLIER P.Public-key cryptosystems based on composite degree residuosity classes[C]//International Conference on the Theory and Applications of Cryptographic Techniques.Berlin,Heidelberg:Springer Berlin Heidelberg,1999:223-238.
[26]BONEH D,GOH E J,NISSIM K.Evaluating 2-DNF formulason ciphertexts[C]//Theory of Cryptography:Second Theory of Cryptography Conference(TCC 2005).Cambridge,MA,USA,Springer Berlin Heidelberg,2005:325-341.
[27]GENTRY C.Fully homomorphic encryption using ideal lattices[C]//Proceedings of the forty-first annual ACM Symposium on Theory of Computing.2009:169-178.
[28]FAN J,VERCAUTEREN F.Somewhat practical fully homo-morphic encryption[J/OL].https://eprint.iacr.org/2012/144.pdf.
[29]GENTRY C,SAHAI A,WATERS B.Homomorphic encryption from learning with errors:Conceptually-simpler,asymptotically-faster,attribute-based[C]//Advances in Cryptology-CRYPTO 2013:33rd Annual Cryptology Conference,Santa Barbara,CA,USA,Part I.Springer Berlin Heidelberg,2013:75-92.
[30]LYUBASHEVSKY V,PEIKERT C,REGEV O.On ideal lattices and learning with errors over rings[C]//Advances in Cryptology(EUROCRYPT 2010):29th Annual International Conference on the Theory and Applications of Cryptographic Techniques.French Riviera,Springer,2010:1-23.
[31]HALEVI S,SHOUP V.Algorithms in helib[C]//Advances in Cryptology-CRYPTO 2014:34th Annual Cryptology Confe-rence,Santa Barbara,CA,USA,Part I 34.2014:554-571.
[32]LINDELL Y.How to simulate it-A tutorial on the simulationproof technique[M]//Tutorials on the Foundations of Cryptography Information Security and Cryptography.2017:277-346.
[33]ALBRECHT M R,PLAYER R,SCOTT S.On the concretehardness of learning with errors [J].Journal of Mathematical Cryptology,2015,9(3):169-203.
[34]CORTEZ P,CERDEIRA A,ALMEIDA F,et al.Modeling wine preferences by data mining from physicochemical properties [J].Decision Support Systems,2009,47(4):547-53.
[35]BELHUMEUR P N,HESPANHA J P,KRIEGMAN D J.Eigenfaces vs.fisherfaces:Recognition using class specific linear projection [J].IEEE Transactions on Pattern Analysis and Machine Intelligence,1997,19(7):711-720.
[36]LECUN Y,CORTES C,BURGES C.MNIST handwritten digit database.AT&T Labs(2010) [EB/OL].http://yann lecun com/exdb/mnist.
[37]XIAO H,RASUL K,VOLLGRAF R.Fashion-mnist:a novelimage dataset for benchmarking machine learning algorithms [J].arXiv:1708.07747,2017.
[38]DE VITO S,MASSERA E,PIGA M,et al.On field calibration of an electronic nose for benzene estimation in an urban pollution monitoring scenario [J].Sensors and Actuators B:Chemical,2008,129(2):750-757.
[39]TSANAS A,LITTLE M,MCSHARRY P,et al.Accurate telemonitoring of Parkinson’s disease progression by non-invasive speech tests[J/OL].Nature Proceedings,2009.https://doi.org/10.1038/npre.2009.3920.1.
[1] SUN Jianming, ZHAO Mengxin. Survey of Application of Differential Privacy in Edge Computing [J]. Computer Science, 2024, 51(6A): 230700089-9.
[2] ZANG Hongrui, YANG Tingting, LIU Hongbo, MA Kai. Study on Cryptographic Verification of Distributed Federated Learning for Internet of Things [J]. Computer Science, 2024, 51(6A): 230700217-5.
[3] DING Hongfa, YU Yingying, JIANG Heling. Correctness Verifiable Outsourcing Computation Scheme of Shortest Path Querying over Encrypted Graph Data [J]. Computer Science, 2024, 51(5): 400-413.
[4] LU Xingyuan, CHEN Jingwei, FENG Yong, WU Wenyuan. Privacy-preserving Data Classification Protocol Based on Homomorphic Encryption [J]. Computer Science, 2023, 50(8): 321-332.
[5] GAO Guangyong, HAN Tingting, XIA Zhihua. Recoverable Data Aggregation Protocol for Wireless Sensor Networks Based on Reversible DigitalWatermarking [J]. Computer Science, 2023, 50(8): 333-341.
[6] LI Kejia, HU Xuexian, CHEN Yue, YANG Hongjian, XU Yang, LIU Yang. Differential Privacy Linear Regression Algorithm Based on Principal Component Analysis andFunctional Mechanism [J]. Computer Science, 2023, 50(8): 342-351.
[7] WANG Shaohui, ZHAO Zhengyu, WANG Huaqun, XIAO Fu. Analysis and Improvement on Identity-based Remote Data Integrity Verification Scheme [J]. Computer Science, 2023, 50(7): 302-307.
[8] ZHAO Min, TIAN Youliang, XIONG Jinbo, BI Renwan, XIE Hongtao. Neural Network Model Training Method Based on Homomorphic Encryption [J]. Computer Science, 2023, 50(5): 372-381.
[9] ZHAO Yuqi, YANG Min. Review of Differential Privacy Research [J]. Computer Science, 2023, 50(4): 265-276.
[10] WANG Shan, LIU Lu. Soil Moisture Data Reconstruction Based on Low Rank Matrix Completion Method [J]. Computer Science, 2023, 50(11A): 230300073-6.
[11] ZHAO Zitian, ZHAN Wenhan, DUAN Hancong, WU Yue. Study on Adversarial Robustness of Deep Learning Models Based on SVD [J]. Computer Science, 2023, 50(10): 362-368.
[12] TANG Ling-tao, WANG Di, ZHANG Lu-fei, LIU Sheng-yun. Federated Learning Scheme Based on Secure Multi-party Computation and Differential Privacy [J]. Computer Science, 2022, 49(9): 297-305.
[13] LYU You, WU Wen-yuan. Privacy-preserving Linear Regression Scheme and Its Application [J]. Computer Science, 2022, 49(9): 318-325.
[14] LI Qi-ye, XING Hong-jie. KPCA Based Novelty Detection Method Using Maximum Correntropy Criterion [J]. Computer Science, 2022, 49(8): 267-272.
[15] QUE Hua-kun, FENG Xiao-feng, LIU Pan-long, GUO Wen-chong, LI Jian, ZENG Wei-liang, FAN Jing-min. Application of Grassberger Entropy Random Forest to Power-stealing Behavior Detection [J]. Computer Science, 2022, 49(6A): 790-794.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!