Computer Science ›› 2021, Vol. 48 ›› Issue (7): 33-39. doi: 10.11896/jsjkx.201200224

Special topic: Artificial Intelligence Security

Differential Privacy Protection Machine Learning Method Based on Features Mapping

CHEN Tian-rong, LING Jie

  1. School of Computer, Guangdong University of Technology, Guangdong 510006, China
  • Received: 2020-12-25 Revised: 2021-02-20 Online: 2021-07-15 Published: 2021-07-02
  • Corresponding author: LING Jie (1150181103@qq.com)
  • About author: CHEN Tian-rong, born in 1996, postgraduate. His main research interests include digital image processing and privacy protection. (1181113557@qq.com)
    LING Jie, born in 1964, Ph.D, professor. His main research interests include information security technology and intelligent video processing technology.
  • Supported by:
    Key Field R&D Projects of Guangdong Province, China (2019B010139002) and Key Field R&D Projects of Guangzhou (202007010004).

Abstract: Differential privacy algorithms for image classification improve the privacy protection of a machine learning model by adding noise, but the added noise tends to reduce classification accuracy. To address this problem, a differential privacy protection machine learning method based on feature mapping is proposed. The method combines a pre-trained neural network with shadow model training to map the feature vectors of the original data samples into a high-dimensional vector space in the form of differential vectors. This shortens the distance between samples in the high-dimensional vector space, which reduces the risk of private information leaking through model updates and improves both the privacy protection and the classification capability of the model. Experimental results on the MNIST and CIFAR-10 datasets show that ε-differentially private models with ε equal to 0.01 and 0.11 reach classification accuracies of 99% and 96%, respectively. This indicates that, compared with DP-SGD and several other commonly used differential privacy algorithms, models trained with the proposed method maintain stronger classification capability at a lower privacy budget. On both datasets, the success rate of inference attacks against the model is reduced to 10%, a substantial improvement in defense against inference attacks over a conventional CNN image classification model.

Key words: Differential privacy, Image classification, Inference attack, Machine learning, Shadow model

CLC number:

  • TP391
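
The trade-off the abstract describes, where a smaller privacy budget ε means more noise and therefore lower accuracy, can be illustrated with the generic Laplace mechanism. This is only a sketch for intuition and not the feature-mapping method of the paper: the abstract does not say which noise mechanism the authors use, and the function names, the sensitivity of 1.0, and the trial count are assumptions made here. The ε values 0.01 and 0.11 are the budgets quoted in the abstract.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential draws with mean `scale`
    follows a Laplace distribution centred at 0 with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(value, sensitivity, epsilon, rng):
    """Release `value` with Laplace noise of scale sensitivity / epsilon."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


def mean_abs_error(epsilon, sensitivity=1.0, trials=20000, seed=0):
    """Average absolute perturbation added to a query whose true answer is 0."""
    rng = random.Random(seed)
    total = sum(
        abs(laplace_mechanism(0.0, sensitivity, epsilon, rng))
        for _ in range(trials)
    )
    return total / trials


if __name__ == "__main__":
    # Assumed sensitivity of 1.0; budgets 0.01 and 0.11 come from the abstract.
    for eps in (0.01, 0.11, 1.0):
        print(f"epsilon={eps}: mean |noise| is about {mean_abs_error(eps):.2f}")
```

The expected absolute noise equals sensitivity / ε, so a budget of 0.01 perturbs a unit-sensitivity value by about 100 on average. This is why keeping accuracy high at such small budgets is the hard part of the problem the paper addresses.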