Differential Privacy Data Synthesis Method Based on Latent Diffusion Model

doi:10.11896/jsjkx.230700177

Abstract

Data sharing and publication are widely used across socio-economic domains and drive scientific progress and societal development. However, copyright and privacy concerns, especially those involving personal data, remain critical challenges. Differentially private data synthesis has emerged as an effective means of protecting data privacy: data holders release synthetic data instead of real data, which preserves privacy while keeping the data useful and available. To address the limited utility of existing differentially private generative models, this paper proposes a two-stage differentially private generative model based on latent-space diffusion. First, the original image is compressed under differential privacy and projected from pixel space into a latent space, yielding a desensitized latent vector representation of the original sensitive data. The latent vector is then fed into a diffusion model, which gradually transforms it toward a prior distribution, and new samples are drawn through a denoising process. Experiments on the MNIST and Fashion MNIST datasets show that the proposed model significantly improves Fréchet inception distance (FID) and downstream task accuracy compared with state-of-the-art models such as DP-Sinkhorn.

Key words: Differential privacy, Data synthesis, Generative models, Autoencoder, Diffusion models

CLC Number: TP183

GE Yinchi, ZHANG Hui, SUN Haohang. Differential Privacy Data Synthesis Method Based on Latent Diffusion Model [J]. Computer Science, 2024, 51(3): 30-38.
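
The two stages described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the privacy-aware compression is implemented, so the first function assumes DP-SGD-style per-sample gradient clipping with Gaussian noise as one plausible mechanism for training the compressing autoencoder. The second part assumes a standard DDPM forward process with a linear noise schedule applied to a latent vector. All function names and parameter values here are hypothetical and are not taken from the paper.

```python
import math
import random


def clip_and_noise(per_sample_grads, clip_norm, noise_multiplier, rng):
    """Assumed stage-1 building block (DP-SGD style, not confirmed by the paper).

    Clips each per-sample gradient to L2 norm `clip_norm`, sums the clipped
    gradients, adds Gaussian noise with std `noise_multiplier * clip_norm`,
    and returns the noisy average.
    """
    dim = len(per_sample_grads[0])
    total = [0.0] * dim
    for grad in per_sample_grads:
        norm = math.sqrt(sum(x * x for x in grad))
        scale = min(1.0, clip_norm / (norm + 1e-12))
        for i, x in enumerate(grad):
            total[i] += x * scale
    sigma = noise_multiplier * clip_norm
    n = len(per_sample_grads)
    return [(t + rng.gauss(0.0, sigma)) / n for t in total]


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule; the endpoint values are illustrative defaults."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + step * t for t in range(num_steps)]


def alpha_bars_from_betas(betas):
    """Cumulative products of (1 - beta_t), usually written alpha-bar_t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out


def diffuse_latent(z0, t, alpha_bars, rng):
    """Assumed stage-2 forward step on a latent vector.

    Samples z_t from N(sqrt(abar_t) * z0, (1 - abar_t) * I) and also returns
    the noise eps that a denoising network would be trained to predict.
    """
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(a) * z + math.sqrt(1.0 - a) * e for z, e in zip(z0, eps)]
    return zt, eps


# Small demonstration with made-up numbers.
rng = random.Random(0)
noisy_grad = clip_and_noise([[3.0, 4.0], [0.3, 0.4]], clip_norm=1.0,
                            noise_multiplier=1.1, rng=rng)
alpha_bars = alpha_bars_from_betas(linear_beta_schedule(1000))
z_t, eps = diffuse_latent([1.0, -2.0, 0.5], t=999, alpha_bars=alpha_bars, rng=rng)
```

Under these assumptions, only the first stage touches the sensitive images; the diffusion model then operates on the latent vectors, which is the motivation the abstract gives for the two-stage design. At the last step of the schedule the latent is almost pure Gaussian noise, matching the "prior distribution" mentioned in the abstract.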

