计算机科学 ›› 2023, Vol. 50 ›› Issue (10): 315-326.doi: 10.11896/jsjkx.221000053
尹诗玉, 朱友文, 张跃
YIN Shiyu, ZHU Youwen, ZHANG Yue
摘要: 相对于传统的中心化差分隐私,本地差分隐私(Local Differential Privacy,LDP) 具有不依赖可信第三方等优势,但也存在数据效用较低的问题。效用优化本地差分隐私模型ULDP(Utility-Optimized Local Differential Privacy) 利用不同输入的敏感度差异,可以提升估计结果的准确度。二维数据联合分布计算可广泛应用于众多数据分析场景,然而,如何在ULDP模型下实现二维数据联合分布估计,仍然是尚未解决的重要问题。针对这一问题,首先给出了二维ULDP模型的定义,兼顾了两个属性分别敏感与否的4种情况;其次,在该模型下,针对联合分布估计问题,提出了JuRR(Joint Utility-Optimized Randomized Response) 与CPRR(Cartesian Product Randomized Response) 2种机制,并理论证明了估计结果的无偏性;最后,在真实数据集上进行对比实验,讨论了不同参数对估计误差的影响。实验结果表明,所提2种机制具有更高的数据效用。
中图分类号:
[1]SWEENEY L.k-anonymity:A Model for Protecting Privacy[J].International Journal of Uncertainty,Fuzziness and Know-ledge-Based Systems,2002,10(5):557-570. [2]MACHANAVAJJHALA A,KIFER D,GEHRKE J,et al.l-diversity:Privacy beyond k-anonymity[C]//22nd International Conference on Data Engineering.2006. [3]LI N,LI T,VENKATASUBRAMANIAN S.t-closeness:Privacy beyond k-anonymity and l-diversity[C]//IEEE 23rd International Conference on Data Engineering.Istanbul:IEEE Press,2007:106-115. [4]YAO A C.Protocols for Secure Computations[C]//23rd An-nual Symposium on Foundations of Computer Science.Chicago:IEEE Press,1982:160-164. [5]DWORK C.Differential privacy:A survey of results[C]//International Conference on Theory and Applications of Models of Computation.Berlin,Heidelberg:Springer Press,2008:1-19. [6]KASIVISWANATHAN S P,LEE H K,NISSIM K,et al.What Can We Learn Privately?[J].SIAM Journal on Computing,2011,40(3):793-826. [7]DUCHI J C,JORDAN M I,WAINWRIGHT M J.Local Privacy and Statistical Minimax Rates[C]//IEEE 54th Annual Symposium on Foundations of Computer Science.Berkeley:IEEE Press,2013:429-438. [8]WANG T,BLOCKI J,LI N,et al.Locally Differentially Private Protocols for Frequency Estimation[C]//26th USENIX Security Symposium.Vancouver:USENIX Press,2017:729-745. [9]DUCHI J C,JORDAN M I,WAINWRIGHT M J.Minimax Optimal Procedures for Locally Private Estimation[J].Journal of the American Statistical Association,2018,113(521):182-201. [10]WANG N,XIAO X,YANG Y,et al.Collecting and Analyzing Multidimensional Data with Local Differential Privacy[C]//IEEE 35th Inter-national Conference on Data Engineering.Macao:IEEE Press,2019:638-649. [11]ERLINGSSON Ú,PIHUR V,KOROLOVA A.Rappor:Ran-domized Aggregatable Privacy-Preserving Ordinal Response[C]//Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security.Scottsdale:ACM Press,2014:1054-1067. [12]TEAM A D P.Learning with Privacy at Scale[J].Apple Mach.Learn.J,2017,1(8):1-25. [13]SAMARATI P,SWEENEY L.Generalizing Data to Provide Anonymity when Disclosing Information[C]//PODS.Seattle:ACM Press,1998,98(188):1045-1145. [14]YILMAZ E,AL-RUBAIE M,CHANG J M.Naive Bayes Classification under Local Differential Privacy[C]//IEEE 7th International Conference on Data Science and Advanced Analytics.Adelaide:IEEE Press,2020:709-718. [15]MURAKAMI T,KAWAMOTO Y.Utility-Optimized Local Differential Privacy Mecha-nisms for Distribution Estimation[C]//28th USENIX Security Symposium.Santa Clara:USENIX Press,2019:1877-1894. [16]CORMODE G,MADDOCK S,MAPLE C.Frequency Estimation under Local Differential Privacy[J].Proceedings of the VLDB Endowment,2021,14(11):2046-2058. [17]WANG T,LI N,JHA S.Locally Differentially Private Frequent Itemset Mining[C]//2018 IEEE Symposium on Security and Privacy.San Francisco:IEEE Press,2018:127-143. [18]ZHU S X,WANG L,SUN G L.A Perturbation Mechanism for Classified Transformation Satisfying Local Differential Privacy [J].Journal of Computer Research and Development,2022,59(2):430-439. [19]ZHANG Z,WANG T,LI N,et al.Calm:Consistent Adaptive Local Marginal for Marginal Release under Local Differential Privacy[C]//Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security.New York:ACM Press,2018:212-229. [20]YE Q,HU H,MENG X,et al.PrivKV:Key-Value Data Collection with Local Differential Privacy[C]//2019 IEEE Symposium on Security and Privacy.San Francisco:IEEE Press,2019:317-331. [21]GU X,LI M,CHENG Y,et al.PCKV:Locally Differentially Private Correlated Key-Value Data Collection with Optimized Utility[C]//29th USENIX Security Symposium.Boston:USENIX Press,2020:967-984. [22]YE Q,HU H,MENG X,et al.PrivKVM*:Revisiting Key-Va-lue Statistics Estimation with Local Differential Privacy[J].IEEE Transactions on Dependable and Secure Computing,2021,14(8):1-18. [23]TIAN F,WU Z Q,LU L F,et al.A Sample Based Personalized Differential Privacy Mechanism for Trajectory Data Publication[J].Chinese Journal of Computers,2021,44(4):709-723. [24]BALLE B,BELL J,GASCÓN A,et al.The Privacy Blanket of The Shuffle Model[C]//Annual International Cryptology Conference.Cham:Springer Press,2019:638-667. [25]LIU Y F,WANG N,WANG Z G.Collecting and AnalyzingMultidimensional Categorical Data Under Shuffled Differential Privacy[J].Journal of Software,2022,33(3):1093-1110. [26]GU X,LI M,XIONG L,et al.Providing Input-Discriminative Protection for Local Differential Privacy[C]//IEEE 36th International Conference on Data Engineering.Dallas:IEEE Press,2020:505-516. [27]UCI MACHINE LEARNING REPOSITORY.Nursery Data Set [DB/OL].[2022-05-16].https://archive.ics.uci.edu/ml/datasets/Nursery. [28]KAGGLE.Performance in Exams [DB/OL].[2022-06-25].https://www.kaggle.com/datasets/spscientist/students-perfor-mance-in-exams. |
|