Computer Science ›› 2022, Vol. 49 ›› Issue (9): 318-325.doi: 10.11896/jsjkx.220300190

• Information Security • Previous Articles     Next Articles

Privacy-preserving Linear Regression Scheme and Its Application

LYU You1,2, WU Wen-yuan1   

  1. 1 Chongqing Institute of Green and Intelligent Technology,Chinese Academy of Sciences,Chongqing 400714,China
    2 University of Chinese Academy Sciences,Beijing 100049,China
  • Received:2022-03-19 Revised:2022-06-03 Online:2022-09-15 Published:2022-09-09
  • About author:LYU You,born in 1996,postgraduate.His main research interests include homomorphic encryption and information security.
    WU Wen-yuan,born in 1976,Ph.D,professor.His main research interests include lattice based cryptography,automated reasoning and symbolic computation.
  • Supported by:
    National Key Research and Development Project(2020YFA0712303),Guizhou Science and Technology Program([2020]4Y056) and Chongqing Science and Technology Program(cstc2020yszx-jcyjX0005).

Abstract: Linear regression is an important and widely used machine learning algorithm.The training of linear regression model usually depends on a large amount of data.In reality,the data set is generally held by different users and contains their privacy information.When multiple users want to gather more data to train a better model,it inevitably involves users' privacy.As a privacy protection technology,homomorphic encryption can effectively solve the problem of privacy leakage in computing.A new privacy preserving linear regression scheme based on hybrid iterative method is designed for the scenario where data sets are distri-buted horizontally on two users.The scheme is divided into two stages.The first stage implements the statistic gradient descent algorithm in the ciphertext domain.In the second stage,a secure two-party fast descent protocol is designed.The core idea of the protocol is based on Jacobi iterative method,which can effectively make up for the poor convergence effect of gradient descent method in practical application,accelerate the convergence of the model,and protect the data privacy of two users while effectively training the linear regression model.The efficiency,communication loss and security of the scheme are analyzed.The scheme is implemented by using C++and applied to real data sets.A large number of experimental results show that the scheme can effectively solve the linear regression problem with large scale features.The relative error of decision coefficient is less than 0.001,which show that the application effect of the privacy preserving linear regression model in real data set is close to that obtained directly from unencrypted data,and the scheme can meet the practical application requirements in specific scenarios.

Key words: Privacy-preserving, Linear regression, Hybrid iterative method, Homomorphic encryption

CLC Number: 

  • TP309.7
[1]YANG Q,LIU Y,CHEN T J,et al.Federated Machine Lear-ning:Concept and Applications[J].ACM Transactions on Intelligent Systems Technology,2019,10(2):1-19.
[2]BOULEMTAFES A,DERHAB A,CHALLAL Y.A review of privacy-preserving techniques for deep learning[J].Neurocomputing,2020,384:21-45.
[3]TREVOR H,ROBERT T,JEROME F.The elements of statistical learning:data mining,inference and prediction[M].Sprin-ger,2009.
[4]YAO A C C.How to generate and exchange secrets[C]//Proceedings of the 27th Annual Symposium on Foundations of Computer Science.IEEE Computer Society,1986:162-167.
[5]PAILLIER P.Public-key cryptosystems based on composite degree residuosity classes[C]//Proceedings of the 17th International Conference on Theory and Application of Cryptographic Techniques.Springer-Verlag,1999:223-238.
[6]LYNBASHEVSKY V,PEIKERT C,REGEV O.On Ideal Lattices and Learning with Errors over Rings[M]//Advances in Cryptology-Eurocrypt 2010.Berlin:Springer,2010:1-23.
[7]GENTRY C.Fully Homomorphic Encryption Using Ideal Lattices[M]//Proceedings of the 2009 Acm Symposium on Theory of Computing(Stoc'09).2009:169-178.
[8]BRAKERSKI Z,GENTRY C,VAIKUNTANATHAN V.(Le-veled) Fully Homomorphic Encryption without Bootstrapping[J].ACM Trans. Comput. Theory,2014,6(3):1-36.
[9]FAN J,VERCAUTEREN F.Somewhat practical fully homo-morphic encryption[J].IACR Cryptology ePrint Archive,2012,144:1-19.
[10]CHEON J H,KIM A,KIM M,et al.Homomorphic Encryption for Arithmetic of Approximate Numbers[M]//Advances in Cryptology-Asiacrypt 2017,Pt I.Cham:Springer,2017:409-437.
[11]PAYMAN M,YUPENG Z.SecureML:A System for Scalable Privacy-Preserving Machine Learning[C]//2017 IEEE Symposium on Security and Privacy.2017:19-38.
[12]PAYMAN M,PETER R.ABY3:A Mixed Protocol Framework for Machine Learning[C]//Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security.Association for Computing Machinery,2018:35-52.
[13]NIKOLAENKO V,WEINSBERG U,IOANNIDIS S,et al.Privacy-Preserving Ridge Regression on Hundreds of Millions of Records[C]//2013 IEEE Symposium on Security and Privacy.IEEE Access,2013:334-348.
[14]GIACOMELLI I,JHA S,JOYE M,et al.Privacy-PreservingRidge Regression with only Linearly-Homomorphic Encryption[M]//Applied Cryptography and Network Security.Cham:Springer,2018:243-261.
[15]AKAVIA A,SHAUL H,WEISS M,et al.Linear-Regression on Packed Encrypted Data in the Two-Server Model[C]//Procee-dings of the 7th ACM Workshop on Encrypted Computing &Applied Homomorphic Cryptography.Association for Computing Machinery,2019:21-32.
[16]HU S,WANG Q,WANG J,et al.Securing Fast Learning!Ridge Regression over Encrypted Big Data[M]//2016 IEEE Trustcom/BigDataSE/ISPA.IEEE Access,2016:19-26.
[17]LU L,DING N.Horizontal Privacy-Preserving Linear Regression Which is Highly Efficient for Dataset of Low Dimension[C]//Proceedings of the 2021 ACM Asia Conference on Computer and Communications Security.Association for Computing Machinery,2021:604-615.
[18]LYU Y,WU W Y.Linear System Solving Scheme Based on Homomorphic Encryption[J].Computer Science,2022,49(3):338-345.
[19]SMART N P,VERCAUTEREN F.Fully homomorphic SIMDoperations[J].Designs,Codes and Cryptography,2012,71(1):57-81.
[20]KIM A,SONG Y,KIM M,et al.Logistic regression model trai-ning based on the approximate homomorphic encryption[J].BMC Medical Genomics,2018,11(4):23-31.
[21]CHEON J H,HAN K,KIM A,et al.Bootstrapping for Approximate Homomorphic Encryption[M]//Advances in Cryptology-Eurocrypt 2018,Pt I.Cham:Springer,2018:360-384.
[22]CHEON J H,HAN K,KIM A,et al.A Full RNS Variant of Approximate Homomorphic Encryption[C]//Selected Areas in Cryptography-SAC 2018.Cham:Springer,2019:347-368.
[23]GENTRY C,HALEVI S,SMART N P.Homomorphic Evalua-tion of the AES Circuit[M]//Advances in Cryptology-Crypto 2012.Berlin:Springer,2012:850-867.
[24]LINDNER R,PEIKERT C.Better Key Sizes(and Attacks) for LWE-Based Encryption[M]//Topics in Cryptology-Ct-Rsa 2011.Berlin:Springer,2011:319-339.
[25]CHEON J H,KIM A,KIM M,et al.Implementation of HEAAN[OL].https://github.com/kimandrik/HEAAN.
[26]UCI.Machine Learning Repository [OL].http://archive.ics.uci.edu/ml/datasets.php.
[27]TIANCHI.Tianchi Data Sets [OL].https://tianchi.aliyun.com/dataset/.
[1] LI Li, HE Xin, HAN Zhi-jie. Review of Privacy-preserving Mechanisms in Crowdsensing [J]. Computer Science, 2022, 49(5): 303-310.
[2] QIN Xiao-yue, HUANG Ru-wei, YANG Bo. NTRU Type Fully Homomorphic Encryption Scheme over Prime Power Cyclotomic Rings [J]. Computer Science, 2022, 49(5): 341-346.
[3] LYU You, WU Wen-yuan. Linear System Solving Scheme Based on Homomorphic Encryption [J]. Computer Science, 2022, 49(3): 338-345.
[4] REN Hua, NIU Shao-zhang, WANG Mao-sen, YUE Zhen, REN Ru-yong. Homomorphic and Commutative Fragile Zero-watermarking Based on SVD [J]. Computer Science, 2022, 49(3): 70-76.
[5] CHEN Chang-wei, ZHOU Xiao-feng. Fast Local Collaborative Representation Based Classifier and Its Applications in Face Recognition [J]. Computer Science, 2021, 48(9): 208-215.
[6] ZHANG Xiao-yan, LI Qin-wei, FU Fu-jie. Secret Verification Method of Blockchain Transaction Amount Based on Digital Commitment [J]. Computer Science, 2021, 48(9): 324-329.
[7] JI Yan, DAI Hua, JIANG Ying-ying, YANG Geng, Yi Xun. Parallel Multi-keyword Top-k Search Scheme over Encrypted Data in Hybrid Clouds [J]. Computer Science, 2021, 48(5): 320-327.
[8] WANG Rui-jin, TANG Yu-cheng, PEI Xi-kai, GUO Shang-tong, ZHANG Feng-li. Block-chain Privacy Protection Scheme Based on Lightweight Homomorphic Encryption and Zero-knowledge Proof [J]. Computer Science, 2021, 48(11A): 547-551.
[9] YANG Zhang-jing, WANG Wen-bo, HUANG Pu, ZHANG Fan-long, WANG Xin. Local Weighted Representation Based Linear Regression Classifier and Face Recognition [J]. Computer Science, 2021, 48(11A): 351-359.
[10] SHAO Zheng-yi, CHEN Xiu-hong. Sample Feature Kernel Matrix-based Sparse Bilinear Regression [J]. Computer Science, 2021, 48(10): 185-190.
[11] LI Yan-bin, LIU Yu, LI Mu-zhou, WU Ren-tao, WANG Peng-da. Participant-adaptive Variant of MASCOT [J]. Computer Science, 2020, 47(11A): 380-387.
[12] CAI Wei, BAI Guang-wei, SHEN Hang, CHENG Zhao-wei, ZHANG Hui-li. Reinforcement Learning Based Win-Win Game for Mobile Crowdsensing [J]. Computer Science, 2020, 47(10): 41-47.
[13] WANG Tong, MA Wen-ping, LUO Wei. Information Sharing and Secure Multi-party Computing Model Based on Blockchain [J]. Computer Science, 2019, 46(9): 162-168.
[14] LI Meng-tian, HU Bin. RLWE-based Fully Homomorphic Encryption Scheme with Batch Technique [J]. Computer Science, 2019, 46(3): 209-216.
[15] ZHOU Ming,JIA Yan-ming,ZHOU Cai-lan,XU Ning. English Automated Essay Scoring Methods Based on Discourse Structure [J]. Computer Science, 2019, 46(3): 234-241.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!