Computer Science, 2023, Vol. 50, Issue (3): 351-359. doi: 10.11896/jsjkx.220100016

• Information Security •

Survey on Membership Inference Attacks Against Machine Learning

PENG Yuefeng1, ZHAO Bo1, LIU Hui1, AN Yang2   

  1. School of Cyber Science and Engineering, Wuhan University, Wuhan 430000, China
  2. School of Computer Science, Wuhan University, Wuhan 430000, China
  • Received: 2022-01-04; Revised: 2022-03-27; Online: 2023-03-15; Published: 2023-03-15
  • About author: PENG Yuefeng, born in 1998, postgraduate. His main research interests include artificial intelligence security.
    ZHAO Bo, born in 1972, Ph.D, professor, Ph.D supervisor, is a senior member of China Computer Federation. His main research interests include trusted computing and trustworthy artificial intelligence.
  • Supported by:
    National Natural Science Foundation of China(U1936122).

Abstract: In recent years, machine learning has not only achieved remarkable results in conventional fields such as computer vision and natural language processing, but has also been widely applied to sensitive data such as face images, financial data, and medical information. Researchers have recently found that machine learning models memorize data from their training sets, which makes them vulnerable to membership inference attacks: an attacker can infer whether a given data record is in the training set of a specific machine learning model. A successful membership inference attack can cause serious leakage of individual privacy. For example, the presence of a patient's medical record in the training set of a hospital's analytical model reveals that the person was once a patient there. This paper first introduces the basic principle of membership inference attacks, and then systematically summarizes and classifies representative research on membership inference attacks and defenses from recent years, describing in detail how to attack and defend under different conditions. Finally, by reviewing the development of membership inference attacks, the paper explores the main challenges and potential future directions of privacy protection in machine learning.
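
The idea of inferring membership from memorization can be illustrated with a minimal sketch. It is a toy confidence-threshold attack written for illustration only, not a method taken from the paper: the model `toy_confidence`, the data points, and the threshold of 0.9 are all hypothetical assumptions.

```python
import math


def toy_confidence(train_points, x):
    """Hypothetical overfitted model: confidence peaks on memorized points.

    Confidence is exp(-d^2), where d is the distance from x to the
    nearest training point, so training points score exactly 1.0.
    """
    nearest = min(abs(x - t) for t in train_points)
    return math.exp(-nearest ** 2)


def infer_membership(confidence, threshold=0.9):
    """Threshold attack: guess 'member' when the model is very confident."""
    return confidence >= threshold


# Assumed toy data: members were used for training, outsiders were not.
train_points = [0.0, 1.0, 2.0, 3.0]
outside_points = [0.5, 1.5, 2.5, 3.5]

member_guesses = [
    infer_membership(toy_confidence(train_points, x)) for x in train_points
]
outside_guesses = [
    infer_membership(toy_confidence(train_points, x)) for x in outside_points
]

print("guesses for members:    ", member_guesses)
print("guesses for non-members:", outside_guesses)
```

In this toy setting the attacker needs only the model's output confidence. Real models separate members from non-members far less cleanly, so a threshold of this kind would have to be calibrated rather than fixed in advance.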

Key words: Machine learning, Membership inference, Privacy leakage, Privacy protection

CLC Number: 

  • TP181