Computer Science, 2024, Vol. 51, Issue (3): 30-38. doi: 10.11896/jsjkx.230700177

• Information Security Protection in New Computing Mode •

Differential Privacy Data Synthesis Method Based on Latent Diffusion Model

GE Yinchi, ZHANG Hui, SUN Haohang   

  1. State Key Laboratory of Complex & Critical Software Environment, Beihang University, Beijing 100191, China
  • Received: 2023-07-24 Revised: 2023-12-08 Online: 2024-03-15 Published: 2024-03-13
  • About author: GE Yinchi, born in 1996, Ph.D., is a student member of CCF (No. P1106G). His main research interests include data security, privacy-preserving computing, and artificial intelligence. ZHANG Hui, born in 1968, professor, Ph.D. supervisor. His main research interests include big data management and mining, data security, and blockchain.

Abstract: The widespread use of data sharing and publication in the socio-economic domain drives scientific progress and societal development. However, copyright and privacy issues, especially those concerning personal data, remain critical challenges. Differential privacy data synthesis has emerged as an effective means of protecting data privacy: data holders release synthetic data instead of real data, which improves data utility and availability while preserving privacy. To address the limited usability of existing differentially private generative models, this paper proposes a two-stage differentially private generative model based on latent-space diffusion. First, differential-privacy-aware information compression is applied to the original image, projecting it from pixel space into latent space to obtain a desensitized latent vector representation of the original sensitive data. The latent vector is then fed into a diffusion model, which gradually transforms it into a prior distribution, and new samples are drawn through a denoising process. Experimental results on the MNIST and Fashion MNIST datasets show that the proposed model significantly improves Fréchet inception distance (FID) and downstream task accuracy compared with state-of-the-art models such as DP-Sinkhorn.
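The two stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which privacy mechanism is used in the compression stage, so the per-example gradient clipping with Gaussian noise shown here (in the style of DP-SGD) is an assumption, as are the linear noise schedule and the hypothetical names `clip_and_noise`, `alpha_bar_schedule`, and `forward_diffuse`.

```python
import math
import random


def clip_and_noise(per_example_grads, clip_norm, noise_multiplier, rng):
    """Assumed stage-1 building block (DP-SGD style, not confirmed by the paper).

    Clips each example's gradient to L2 norm <= clip_norm, sums the clipped
    gradients, adds Gaussian noise with std noise_multiplier * clip_norm to
    each coordinate, and returns the average.
    """
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        scale = min(1.0, clip_norm / max(norm, 1e-12))
        for i, x in enumerate(grad):
            total[i] += x * scale
    sigma = noise_multiplier * clip_norm
    n = len(per_example_grads)
    return [(s + rng.gauss(0.0, sigma)) / n for s in total]


def alpha_bar_schedule(betas):
    """Cumulative products of (1 - beta_t), as used in DDPM-style diffusion."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def forward_diffuse(z0, t, alpha_bars, rng):
    """Stage-2 forward process on a latent vector z0.

    Samples z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps,
    with eps drawn from a standard Gaussian. Returns (z_t, eps).
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


rng = random.Random(0)
T = 1000
# Linear beta schedule; the endpoints are illustrative assumptions.
betas = [1e-4 + (0.02 - 1e-4) * i / (T - 1) for i in range(T)]
alpha_bars = alpha_bar_schedule(betas)

# Toy per-example gradients standing in for the encoder's training gradients.
grads = [[rng.gauss(0.0, 10.0) for _ in range(4)] for _ in range(8)]
noiseless = clip_and_noise(grads, clip_norm=1.0, noise_multiplier=0.0, rng=rng)

# A toy latent vector diffused all the way to the final step.
z0 = [0.5, -1.0, 2.0]
z_T, _ = forward_diffuse(z0, T - 1, alpha_bars, rng)
```

With this schedule the final `alpha_bars` value is close to zero, so `z_T` is almost pure Gaussian noise. That is the sense in which the latent vector is "gradually transformed into a prior distribution"; generation then runs a learned denoiser in the reverse direction, which this sketch omits.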

Key words: Differential privacy, Data synthesis, Generative models, Autoencoder, Diffusion models

CLC Number: 

  • TP183