Computer Science ›› 2023, Vol. 50 ›› Issue (8): 342-351.doi: 10.11896/jsjkx.220800255

• Information Security • Previous Articles     Next Articles

Differential Privacy Linear Regression Algorithm Based on Principal Component Analysis andFunctional Mechanism

LI Kejia1, HU Xuexian1, CHEN Yue1, YANG Hongjian1, XU Yang1, LIU Yang1,2   

  1. 1 The PLA Strategic Support Force Information Engineering University,Zhengzhou 450001,China
    2 School of Cyber Science and Engineering,Zhengzhou University,Zhengzhou 450001,China
  • Received:2022-08-29 Revised:2023-03-23 Online:2023-08-15 Published:2023-08-02
  • About author:LI Kejia,born in 1997,postgraduate.Her main research interests include fe-derated learning,differential privacy,and blockchain.
    HU Xuexian,born in 1982,Ph.D,asso-ciate professor,Ph.D supervisor.His main research interests include big data security,applied cryptography,and network security.
  • Supported by:
    National Natural Science Foundation of China(62172433,62172434,61862011,61872449).

Abstract: With the continuous development of artificial intelligence applications and the subsequent promulgation of privacy protection laws and regulations,the privacy protection issue in machine learning has become a research hotspot in the field of information security.To overcome the issues of high global sensitivity and poor model usability of the existing differential privacy linear regression algorithms,we present a differential privacy linear regression algorithm based on principal component analysis and functional mechanism(PCAFM-DPLR).In the PCAFM-DPLR algorithm,the traditional Laplace mechanism is replaced by the Gaussian mechanism,and the noise is added in the two major stages of the algorithm respectively.First,in order to take into account the privacy of the data while reducing the dimensionality,Gaussian noise is injected into the covariance matrix of the original data set,and a low-dimensional data set with differential privacy protection effect is obtained based on principal component analysis.Second,to prevent the possible privacy leakage during the model training,Gaussian noise is then added to the expansion polynomial coefficients of the objective function,and the minimization of the perturbed objective function is used as the objective to find the optimal model parameters.Theoretical analysis and experimental results show that the linear regression model trained by the PCAFM-DPLR algorithm can effectively guarantee privacy while having good utility.

Key words: Differential privacy, Principal component analysis, Functional mechanism, Linear regression, Gaussian noise

CLC Number: 

  • TP391
[1]FREDRIKSON M,JHA S,RISTENPART T.Model inversion attacks that exploit confidence information and basic countermeasures[C]//Proceedings of the 22nd ACM SIGSAC Confe-rence on Computer and Communications Security.New York:ACM,2015:1322-1333.
[2]DWORK C.Differential privacy[C]//Proceedings of the 33rd International Colloquium on Automata,Languages and Programming.Berlin:Springer,2006:1-12.
[3]DWORK C.Differential privacy:A survey of results[C]//Proceedings of the 5th International Conference on Theory and Applications of Models of Computation.Berlin:Springer,2008:1-19.
[4]ZHANG Y X,WEI J H,LI J,et al.Graph degree histogrampublication method with node-differential privacy[J].Journal of Computer Research and Development,2019,56(3):508-520.
[5]CORMODE G,PROCOPIUC M,SHEN E,et al.Differentiallyprivate spatial decompositions[C]//Proceedings of the IEEE 28th International Conference on Data Engineering.Piscataway,NJ:IEEE,2012:20-31.
[6]BHASKAR R,LAXMAN S,SIMTH A,et al.Discovering frequent patterns in sensitive data[C]//Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.New York:ACM,2010:503-512.
[7]MOHAMMED N,CHEN R,FUNG B,et al.Differentially private data release for data mining[C]//Proceedings of the ACM SIGKDDInternational Conference on Knowledge Discovery and Data Mining.New York:ACM,2011:493-501.
[8]NISSIM K,RASKHODNIKOVA S,SMITH A.Smooth sensitivity and sampling in private data analysis[C]//Proceedings of the 39th Annual ACM Symposium on Theory of Computing.New York:ACM,2007:11-13.
[9]DWORK C,MCSHERRY F,NISSIM K,et al.Calibrating noise to sensitivity in private data analysis[C]//Proceedings of the 3rd Theory of Cryptography Conference.Berlin:Springer,2006:265-284.
[10]FREDRIKSON M,LANTZ E,JHA S,et al.Privacy in Pharmacogenetics:An End-to-End Case Study of Personalized Warfarin Dosing[C]//Proceedings of the USENIX Security Symposium.Berkeley,CA:USENIX,2014:17-32.
[11]WU X,FREDRIKSON M,WU W T,et al.Revisiting differen-tially private regression:Lessons from learning theory and their consequences[J].arXiv:1512.06388,2015.
[12]SMITH A.Privacy-preserving statistical estimation with opti-mal convergence rate[C]//Proceedings of the 43th Annual ACM Symposium on Theory of Computing.New York:ACM,2011:813-822.
[13]LEI J.Differentially private m-estimators[C]//Proceedings of the 23rd Annual Conference on Neural Information Processing Systems.New York:ACM,2011:361-369.
[14]GONG M G,PAN K,XIE Y.Differential privacy preservation in regression analysis based on relevance[J].Knowledge-Based Systems,2019,173(6):140-149.
[15]CHAUDHURI K,MONTELEONI C,SARWATE D.Differentially private empirical risk minimization[J].The Journal of Machine Learning Research,2011,12(2):1069-1109.
[16]ZHANG J,ZHANG Z J,XIAO X K,et al.Functional mechanism:regression analysis under differential privacy[J].Procee-dings of the VLDB Endowment,2012,5(11):1364-1375.
[17]DWORK C,KENTHAPADI K,MCSHERRY F,et al.Ourdata,ourselves:privacy via distributed noise generation[C]//Proceedings of the 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques.Berlin:Springer,2006:486-503.
[18]JAYARAMAN B,EVANS D.Evaluating differentially private machine learning in practice[C]//Proceedings of the 28th USENIX Security Symposium.Berkeley,CA:USENIX,2019:1895-1912.
[19]LIU J X,MENG X F.Survey on Privacy-Preserving MachineLearning[J].Journal of Computer Research and Development,2020,57(2):346-362.
[20]DWORK C,ROTH A.The algorithmic foundations of differential privacy[J].Foundations and Trends in Theoretical Compu-ter Science,2014,9(3/4):211-407.
[21]MCSHERRY F,TALWAR K.Mechanism design via differential privacy[C]//Proceedings of the 48th Annual IEEE Symposium on Foundations of Computer Science.Piscataway,NJ:IEEE,2007:94-103.
[22]XIONG P,ZHU T Q,WANG X F.A survey on differential privacy and applications[J].Chinese Journal of Computers,2014,37(1):101-122.
[23]MCSHERRY F.Privacy integrated queries:an extensible plat-form for privacy-preserving data analysis[J].Communications of the ACM,2010,53(9):89-97.
[24]PEARSON K.On lines and planes of closest fit to systems of points in space[J].Philosophical Magazine,1901,2(11):559-572.
[25]DWORK C,TALWAR K,THAKURTA A,et al.Analyzegauss:optimal bounds for privacy-preserving principal component analysis[C]//Proceedings of the 46th Annual ACM Symposium on Theory of Computing.New York:ACM,2014:11-20.
[26]PENG C G,ZHAO Y Y,FAN M M.A differential private data publishing algorithm via principal component analysis based on maximum information coefficient[J].Netinfo Security,2020,20(2):37-48.
[1] LI Rongchang, ZHENG Haibin, ZHAO Wenhong, CHEN Jinyin. Data Reconstruction Attack for Vertical Graph Federated Learning [J]. Computer Science, 2023, 50(7): 332-338.
[2] ZHANG Lianfu, TAN Zuowen. Robust Federated Learning Algorithm Based on Adaptive Weighting [J]. Computer Science, 2023, 50(6A): 230200188-9.
[3] ZHAO Yuqi, YANG Min. Review of Differential Privacy Research [J]. Computer Science, 2023, 50(4): 265-276.
[4] LIU Likang, ZHOU Chunlai. RCP:Mean Value Protection Technology Under Local Differential Privacy [J]. Computer Science, 2023, 50(2): 333-345.
[5] TANG Ling-tao, WANG Di, ZHANG Lu-fei, LIU Sheng-yun. Federated Learning Scheme Based on Secure Multi-party Computation and Differential Privacy [J]. Computer Science, 2022, 49(9): 297-305.
[6] LYU You, WU Wen-yuan. Privacy-preserving Linear Regression Scheme and Its Application [J]. Computer Science, 2022, 49(9): 318-325.
[7] LI Qi-ye, XING Hong-jie. KPCA Based Novelty Detection Method Using Maximum Correntropy Criterion [J]. Computer Science, 2022, 49(8): 267-272.
[8] HUANG Jue, ZHOU Chun-lai. Frequency Feature Extraction Based on Localized Differential Privacy [J]. Computer Science, 2022, 49(7): 350-356.
[9] QUE Hua-kun, FENG Xiao-feng, LIU Pan-long, GUO Wen-chong, LI Jian, ZENG Wei-liang, FAN Jing-min. Application of Grassberger Entropy Random Forest to Power-stealing Behavior Detection [J]. Computer Science, 2022, 49(6A): 790-794.
[10] WANG Mei-shan, YAO Lan, GAO Fu-xiang, XU Jun-can. Study on Differential Privacy Protection for Medical Set-Valued Data [J]. Computer Science, 2022, 49(4): 362-368.
[11] KONG Yu-ting, TAN Fu-xiang, ZHAO Xin, ZHANG Zheng-hang, BAI Lu, QIAN Yu-rong. Review of K-means Algorithm Optimization Based on Differential Privacy [J]. Computer Science, 2022, 49(2): 162-173.
[12] QU Xiang-mou, WU Ying-bo, JIANG Xiao-ling. Federated Data Augmentation Algorithm for Non-independent and Identical Distributed Data [J]. Computer Science, 2022, 49(12): 33-39.
[13] GAO Ji-hang, ZHANG Yan. Fault Diagnosis of Shipboard Zonal Distribution Power System Based on FWA-PSO-MSVM [J]. Computer Science, 2022, 49(11A): 210800209-5.
[14] LIU Fang-zheng, MA Bo-wen, LYU Bo-feng, HUANG Ji-wei. UAV Base Station Deployment Method for Mobile Edge Computing [J]. Computer Science, 2022, 49(11A): 220200089-7.
[15] SHI Kun, ZHOU Yong, ZHANG Qi-liang, JIANG Shun-rong. Privacy-preserving Scheme of Energy Trading Data Based on Consortium Blockchain [J]. Computer Science, 2022, 49(11): 335-344.
Viewed
Full text


Abstract

Cited

  Shared   
  Discussed   
No Suggested Reading articles found!