Computer Science ›› 2023, Vol. 50 ›› Issue (11A): 230800021-8. doi: 10.11896/jsjkx.230800021

• Information Security •

Federated Learning Privacy-preserving Approach for Multimodal Medical Data

ZHANG Lianfu1, TAN Zuowen2

  1 College of Mathematics and Computational Science, Yichun University, Yichun, Jiangxi 336000, China
  2 Department of Computer Science and Technology, School of Information Technology, Jiangxi University of Finance and Economics, Nanchang 330032, China
  • Published: 2023-11-09
  • Corresponding author: TAN Zuowen (tanzyw@163.com)
  • About author: ZHANG Lianfu (zlf_jx@163.com), born in 1978, Ph.D, lecturer, is a member of China Computer Federation. His main research interests include information security and privacy-preserving machine learning.
    TAN Zuowen, born in 1967, Ph.D, professor, Ph.D supervisor, is a member of China Computer Federation. His main research interests include cryptography, blockchain and privacy-preserving machine learning.
  • Supported by:
    National Natural Science Foundation of China (62362036) and Key Project of Jiangxi Provincial Natural Science Foundation (20232ACB202012).

Abstract: Electronic health records (EHRs) data has become a valuable resource for biomedical research. By learning multi-dimensional features hidden in EHRs data that are difficult for humans to distinguish, machine learning methods can achieve better results. However, some existing studies consider only some of the privacy leaks that may arise during or after model training, so they rely on a single privacy-preservation measure that cannot cover the whole life cycle of machine learning. In addition, most existing schemes are federated learning privacy-preservation methods for single-modal data. Therefore, a federated learning privacy-preservation approach for multimodal data is proposed. To prevent an adversary from stealing original data through reverse attacks, differential privacy perturbation is applied to the model parameters uploaded by each participant. To prevent leakage of each participant's local model information during training, the Paillier cryptosystem is used to homomorphically encrypt the local model parameters. The security of the method is analyzed theoretically: a security model is defined and the security of the subprotocol is proved. Experimental results show that the method preserves the privacy of the training data and the model with almost no loss of performance.

Key words: Federated learning, Multimodal data, EHRs, Secure aggregation, Privacy-preserving
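
The two mechanisms named in the abstract, differential privacy perturbation of uploaded parameters and Paillier additive homomorphic aggregation, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the fixed-point scale, the clipping bound, the noise level `sigma`, and the assumption that a single key holder decrypts the aggregate are all hypothetical choices made for illustration. The key is built from two small Mersenne primes and is far too short for real use, and `sigma` is not calibrated to any particular privacy budget.

```python
import math
import random
import secrets

SCALE = 10**6  # hypothetical fixed-point scale for encoding floats as integers


def keygen(p=2**31 - 1, q=2**61 - 1):
    """Toy Paillier key pair from two known Mersenne primes (insecure size)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse, valid because g = n + 1
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt integer m (negative values are mapped into Z_n) with g = n + 1."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + (m % n) * n) % n2 * pow(r, n, n2) % n2


def decrypt(n, sk, c):
    """Decrypt and map the result back to a signed integer."""
    lam, mu = sk
    m = (pow(c, lam, n * n) - 1) // n * mu % n
    return m - n if m > n // 2 else m


def add(n, c1, c2):
    """Homomorphic addition: multiplying ciphertexts adds the plaintexts."""
    return c1 * c2 % (n * n)


def perturb(weights, clip=1.0, sigma=0.1):
    """Clip the L2 norm to `clip`, then add Gaussian noise to each coordinate."""
    norm = math.sqrt(sum(w * w for w in weights))
    factor = min(1.0, clip / norm) if norm > 0 else 1.0
    return [w * factor + random.gauss(0.0, sigma) for w in weights]


def secure_average(n, sk, updates):
    """Encrypt each participant's vector, sum ciphertexts, decrypt the mean."""
    enc = [[encrypt(n, round(w * SCALE)) for w in u] for u in updates]
    total = enc[0]
    for e in enc[1:]:
        total = [add(n, a, b) for a, b in zip(total, e)]
    return [decrypt(n, sk, c) / SCALE / len(updates) for c in total]


if __name__ == "__main__":
    n, sk = keygen()
    local_models = [[0.5, -0.25], [0.1, 0.05], [0.3, 0.2]]
    noisy = [perturb(w) for w in local_models]
    print(secure_average(n, sk, noisy))
```

In this sketch the aggregator only ever multiplies ciphertexts, so it sees neither individual updates nor the sum; only the holder of `sk` can decrypt the aggregate, and each update already carries noise before encryption.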

CLC Number: 

  • TP391